  • 0

Low framerate when mixing two PCM streams


Dacobi

Question

Hi

I have a core loop where I wait for VSYNC and then update sound and screen.

The sound is 8-bit mono PCM:

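    SEI();
    VSYNC();

    ii = 0;

    /* Feed up to 256 bytes per frame; stop early when the FIFO is full. */
    while (ii < 256) {
        if (VERA.audio.control & 0x80) { break; }   /* bit 7 set: FIFO full */
        ii++;

        if (audio_2nd) {
            adder = p[l] + p2[l];                   /* mix the two samples */
            VERA.audio.data = adder;
        } else {
            VERA.audio.data = p1[l];
        }
        l++;
        if (l > 4095) { l = 0; }                    /* wrap at the end of the buffer */
    }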

 

When just playing the one sample everything runs fine. But when adding two samples the game/framerate slows down a lot.

How can this be when just adding two char values?

The screen part of my core loop is scrolling two layers of tiles and one sprite.

 


12 answers to this question


  • 0

Your code waits for VSYNC, then does its work.  It finishes just in time to catch the next VSYNC.  If you add a little bit of code, then it can't finish in time -- it just misses the next VSYNC.  Your code must then wait for the VSYNC after that!  It does nothing for almost an entire frame!  Naturally, its frame rate suffers a huge hit.

(Try not waiting for VSYNC.  See what that does to the rate.)
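
To see why a small amount of extra work can halve the frame rate, here is a rough model of that loop. The clock speed (8 MHz) and refresh rate (60 Hz) are assumptions for illustration, and `frames_per_iteration` and `loop_rate_hz` are hypothetical helpers, not part of any library.

```c
#include <assert.h>

/* Assumed for illustration: 8 MHz CPU clock, 60 Hz display refresh. */
#define CPU_HZ            8000000UL
#define REFRESH_HZ        60UL
#define CYCLES_PER_FRAME  (CPU_HZ / REFRESH_HZ)   /* 133333 */

/* A loop that blocks on VSYNC once per iteration always takes a whole
   number of frames: the work time rounded up to the next frame boundary. */
unsigned long frames_per_iteration(unsigned long work_cycles)
{
    assert(work_cycles > 0);
    return (work_cycles + CYCLES_PER_FRAME - 1) / CYCLES_PER_FRAME;
}

/* Resulting loop rate in iterations per second. */
unsigned long loop_rate_hz(unsigned long work_cycles)
{
    return REFRESH_HZ / frames_per_iteration(work_cycles);
}
```

Under these assumptions, 1000 cycles less than a full frame of work gives 60 iterations per second, while 1000 cycles more than a full frame gives 30.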


  • 0
  • Super Administrators
On 1/8/2022 at 1:10 AM, Dacobi said:

When just playing the one sample everything runs fine. But when adding two samples the game/framerate slows down a lot.

How can this be when just adding two char values?

The screen part of my core loop is scrolling two layers of tiles and one sprite.

 

You might look at the generated assembly code to see if the compiler is doing something stupid. 

Which compiler are you using? 

Also, have you tried removing the "adder" variable and just doing this?
    VERA.audio.data = p[l] + p2[l];

I suspect there may be something funny happening in the background. The compiler might be creating and destroying adder unnecessarily, or it might be creating it as a 16-bit number or something else. Not having the declaration for "adder", it's hard to know. 
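
For reference, the C language promotes both `char` operands to `int` before adding, so the sum is an `int` (16 bits on cc65) that is only truncated when it is stored; declaring `adder` as a `char` does not change that. The sketch below shows the truncation and one common way to avoid wrap-around when mixing. It assumes the samples are signed 8-bit values, and `mix_wrap` and `mix_clamp` are hypothetical names.

```c
#include <assert.h>

/* Plain add: the int sum is truncated to 8 bits when stored, so loud
   passages wrap around instead of clipping. Returns the byte that
   would be written to the data register. */
unsigned char mix_wrap(signed char a, signed char b)
{
    return (unsigned char)(a + b);
}

/* Clamp the int sum to the signed 8-bit range before storing it. */
signed char mix_clamp(signed char a, signed char b)
{
    int sum = a + b;            /* both operands are promoted to int */

    assert(sizeof((char)1 + (char)1) == sizeof(int));
    if (sum > 127)  { sum = 127; }
    if (sum < -128) { sum = -128; }
    return (signed char)sum;
}
```

The clamp costs extra cycles per sample, which matters in a loop this tight, so it is a trade-off rather than a free fix.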


  • 0
On 1/8/2022 at 10:18 AM, TomXP411 said:

    VERA.audio.data = p[l] + p2[l];

This has the same result.

Adder was defined as a char.

I think this is most of the audio loop, but I don't understand it.

    jsr     _VSYNC
   ldx     #$00
   lda     #$00
   ldy     #$61
   jsr     staxysp
   jmp     L0798
L077F: ldx     #$00
   lda     $9F3B
   ldx     #$00
   and     #$80
   stx     tmp1
   ora     tmp1
   jeq     L0784
   jmp     L0780
L0784: ldy     #$62
   jsr     ldaxysp
   sta     regsave
   stx     regsave+1
   ina
   bne     L0789
   inx
L0789: ldy     #$61
   jsr     staxysp
   lda     regsave
   ldx     regsave+1
   ldy     #$0A
   lda     (sp),y
   dey
   ora     (sp),y
   jeq     L078A
   ldy     #$05
   jsr     ldaxysp
   jsr     pushax
   ldy     #$6C
   jsr     ldaxysp
   jsr     tosaddax
   ldy     #$00
   jsr     ldauidx
   jsr     pushax
   ldy     #$05
   jsr     ldaxysp
   jsr     pushax
   ldy     #$6E
   jsr     ldaxysp
   jsr     tosaddax
   ldy     #$00
   jsr     ldauidx
   jsr     tosaddax
   ldx     #$00
   sta     $9F3D  ;write audio
   jmp     L0791
L078A: ldy     #$03
   jsr     ldaxysp
   jsr     pushax
   ldy     #$6C
   jsr     ldaxysp
   jsr     tosaddax
   ldy     #$00
   jsr     ldauidx
   sta     $9F3D  ;write audio
L0791: ldy     #$6A
   jsr     ldaxysp
   sta     regsave
   stx     regsave+1
   ina
   bne     L0797
   inx
L0797: ldy     #$69
   jsr     staxysp
   lda     regsave
   ldx     regsave+1
   ldy     #$6A
   jsr     ldaxysp
   cmp     #$00
   txa
   sbc     #$10
   bvs     L079A
   eor     #$80
L079A: asl     a
   lda     #$00
   ldx     #$00
   rol     a
   jeq     L0798
   ldx     #$00
   lda     #$00
   ldy     #$69
   jsr     staxysp
L0798: ldy     #$62
   jsr     ldaxysp
   cmp     #$00
   txa
   sbc     #$01
   bvc     L0783
   eor     #$80
L0783: asl     a
   lda     #$00
   ldx     #$00
   rol     a
   jne     L077F
L0780: ldx     #$00
   lda     #$00
   ldy     #$5B
   jsr     staxysp
   ldx     #$00
   lda     #$00
   ldy     #$59
   jsr     staxysp
   ldy     #$2F
   jsr     ldaxysp
   ldy     #$20
   jsr     staxysp
   ldy     #$1E
   jsr     ldaxysp
   sta     regsave
   stx     regsave+1
   ina
   bne     L07A4
   inx
L07A4: ldy     #$1D
   jsr     staxysp
   lda     regsave
   ldx     regsave+1
   lda     #$00
   jsr     _joy_read

  • 0
Posted (edited)

I just added -O to my build script and that seems to have fixed the problem.

(edit) But then just adding

audio_2nd_i++;

where audio_2nd_i is an int, brings the problem back.

Edited by Dacobi

  • 0

Adding an increment of an int variable (i++) to the audio loop is enough to kill the framerate.

Looking at the assembler code the only difference is this snippet: 

    ldy     #$08
    jsr     ldaxysp
    ina
    bne     L0796
    inx
L0796:    ldy     #$07
    jsr     staxysp
    ldy     #$6A
    jsr     ldaxysp 

As far as I can tell it just gets the int variable from the stack, increments it, and stores it again?

 


  • 0
  • Super Administrators

It depends on what's going on at ldaxysp and staxysp. If those routines are reading and writing a software stack, they're expensive: every access costs a subroutine call and return plus indirect-indexed loads or stores, and that adds up quickly inside a tight loop.

Declaring your counters as Zero Page variables can help, as well.

Which compiler are you using? 


  • 0
On 1/9/2022 at 10:14 AM, TomXP411 said:

Declaring your counters as Zero Page variables can help, as well.

Which compiler are you using? 

I'm using cc65.

Can you declare a variable as zero page in C?
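
One approach that may help, sketched here under assumptions: cc65 normally keeps locals on its software stack, which is what the `ldaxysp`/`staxysp` calls in the listings earlier in the thread access. Making hot variables `static` (or compiling with cc65's `--static-locals` option) gives them fixed addresses, and `register` variables are placed in zero page when cc65 is run with `--register-vars`. The `fifo_full` and `fifo_write` helpers below are hypothetical stand-ins for the VERA register accesses so that the sketch compiles on its own.

```c
#include <assert.h>

#define BUF_LEN 4096u

/* Hypothetical stand-ins for the hardware FIFO, so this compiles anywhere. */
static unsigned char fifo[BUF_LEN];
static unsigned int  fifo_count;

static int  fifo_full(void)             { return fifo_count >= BUF_LEN; }
static void fifo_write(unsigned char b) { fifo[fifo_count++] = b; }

/* Position in the sample buffer; static, so it has a fixed address
   instead of a slot on the software stack. */
static unsigned int pos;

/* Copy up to max_bytes from a BUF_LEN-byte sample into the FIFO and
   return how many bytes were written. */
unsigned int feed(const unsigned char *sample, unsigned int max_bytes)
{
    register unsigned int n = 0;   /* zero-page candidate under cc65 --register-vars */

    assert(sample != 0);
    while (n < max_bytes && !fifo_full()) {
        fifo_write(sample[pos]);
        pos = (pos + 1) & (BUF_LEN - 1);   /* BUF_LEN is a power of two */
        ++n;
    }
    return n;
}
```

Whether this is enough depends on the rest of the loop; it only removes the stack traffic for the counters.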


  • 0
On 1/8/2022 at 12:19 PM, Dacobi said:

As far as I can tell it just gets the int variable from the stack, increments it, and stores it again?

Yes, this is the cost of a general-purpose stack frame on the 6502 family. If it were a 256-deep integer stack implemented as a split-byte pushdown stack indexed by X, and i were item #4 (zero-based) on the stack, i++ would be just:

    LDX TOS
    INC STLO+4,X
    BNE +
    INC STHI+4,X
+   ...
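
The same idea expressed in C, as a rough sketch: the low and high bytes of each 16-bit stack cell live in two separate 256-byte arrays, so one 8-bit index reaches both halves. `stlo`, `sthi`, and `tos` mirror the labels in the assembly above; `inc_cell` and `read_cell` are hypothetical helpers.

```c
#include <assert.h>

static unsigned char stlo[256];   /* low bytes of every stack cell  */
static unsigned char sthi[256];   /* high bytes of every stack cell */
static unsigned char tos;         /* index of the top-of-stack cell */

/* Increment the 16-bit cell that sits `item` cells from the top. */
void inc_cell(unsigned char item)
{
    unsigned int x = (unsigned int)tos + item;

    assert(x < 256);
    if (++stlo[x] == 0) {   /* low byte wrapped: carry into the high byte */
        ++sthi[x];
    }
}

/* Reassemble the 16-bit value of the cell `item` cells from the top. */
unsigned int read_cell(unsigned char item)
{
    unsigned int x = (unsigned int)tos + item;

    assert(x < 256);
    return ((unsigned int)sthi[x] << 8) | stlo[x];
}
```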
 


  • 0
On 1/10/2022 at 12:00 PM, Greg King said:

Your code must then wait for the VSYNC after that!  It does nothing for almost an entire frame!  Naturally, its frame rate suffers a huge hit.

Yes, sorry, I already changed that a few days ago. But at the time it didn't change anything.

I would like to set up a custom IRQ handler but haven't looked into doing that in C.

If I disable VSYNC altogether the framerate increases a lot but then the screen is torn.

 

Now my loop is:

while(notdone){
   //process audio
   //process input
   SEI();
   VSYNC();
   //update screen
   CLI();
}

 

My Audio loop is:

VERA.audio.rate = 0;
if(audio_2nd){
    while(!(VERA.audio.control & 0x80)){ //while(ii<256){
        VERA.audio.data = p[l] + p2[l];
        //ii++;
        l++;
        if(l > 4095) {l=0;}
    }
} else {
    while(!(VERA.audio.control & 0x80)){ //while(ii<256){
        VERA.audio.data = p[l];
        //ii++;
        l++;
        if(l > 4095) {l=0;}
    }
}
VERA.audio.rate = 64;

If I uncomment ii++ there's still some slowdown.


  • 0

If you're dealing in PCM, you're probably going to need to write that part of the code in assembly. I recently implemented a non-mixing PCM streaming routine in my zsound library (that I'm planning to share soon with the community). Just a single stream that does nothing but write into the FIFO takes quite a bit of CPU if you're not careful.

For instance, my initial implementation took over 40k clock cycles (about 33% CPU) just to move the data into the FIFO for a 22 kHz, 16-bit, stereo sound. After a few reworks, I got it shaved down to ~16k CPU cycles for the same stream, but this used several tricks like self-modifying code and loop unrolling. If you're mixing 2 streams in C, that could be taking up most of your CPU.
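
As a back-of-the-envelope check on those figures, the sketch below works out how many bytes such a stream needs per video frame and what a given cycle count means per byte. It assumes a 22,050 Hz sample rate, a 60 Hz frame rate, and that the quoted cycle counts are per frame; none of these is stated exactly above, and the function names are made up.

```c
#include <assert.h>

#define SAMPLE_RATE_HZ   22050UL  /* assumed; the post only says "22K" */
#define BYTES_PER_SAMPLE 2UL      /* 16-bit */
#define CHANNELS         2UL      /* stereo */
#define FRAMES_PER_SEC   60UL     /* assumed refresh rate */

/* Bytes the FIFO consumes during one video frame. */
unsigned long bytes_per_frame(void)
{
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS / FRAMES_PER_SEC;
}

/* Average CPU cycles spent per byte (rounded down) for a given
   per-frame cycle cost. */
unsigned long cycles_per_byte(unsigned long cycles_per_frame)
{
    assert(bytes_per_frame() > 0);
    return cycles_per_frame / bytes_per_frame();
}
```

Under these assumptions the stream needs 1470 bytes per frame, so 40k cycles works out to about 27 cycles per byte and 16k cycles to between 10 and 11.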

Another unfortunate truth right now is that the current keyboard polling routine in R38 / R39 is VERY slow (due to the nature of PS/2): just pressing a couple of keys causes the VSYNC IRQ handler to run well into the visible frame (~20% or more of the way).

 


  • 0
On 1/10/2022 at 12:00 PM, Greg King said:

(Try not waiting for VSYNC.  See what that does to the rate.)

I just moved the screen update to a custom interrupt handler and that increased the framerate and audio processing quite a lot.


  • 0

Some cc65 tricks that will shorten pieces of your code:
Change

               l++;
               if(l > 4095) {l=0;}

to

               l = (l + 1) & 4095;

Change

         while(!(VERA.audio.control & 0x80)){ //while(ii<256){

to

         while ((signed char)VERA.audio.control >= 0) { //while(ii<256){
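
A quick way to convince yourself that both rewrites behave the same as the originals is to compare them on a host compiler. This is only a sketch: the function names are made up for the comparison, and the mask version relies on the buffer length being a power of two (4096 here). The sign test relies on a byte of 0x80 or above becoming negative when converted to `signed char`, which is what cc65 and common host compilers do.

```c
#include <assert.h>

/* Original form: increment, then compare and reset. */
unsigned int wrap_compare(unsigned int l)
{
    assert(l < 4096);
    l++;
    if (l > 4095) { l = 0; }
    return l;
}

/* Rewritten form: increment and mask in one expression. */
unsigned int wrap_mask(unsigned int l)
{
    assert(l < 4096);
    return (l + 1) & 4095;
}

/* Two ways to test that bit 7 (FIFO full) is clear. */
int not_full_mask(unsigned char control) { return !(control & 0x80); }
int not_full_sign(unsigned char control) { return (signed char)control >= 0; }

/* Returns 1 if both pairs agree on every relevant input, else 0. */
int rewrites_agree(void)
{
    unsigned int i;

    for (i = 0; i < 4096; ++i) {
        if (wrap_compare(i) != wrap_mask(i)) { return 0; }
    }
    for (i = 0; i < 256; ++i) {
        if (not_full_mask((unsigned char)i) != not_full_sign((unsigned char)i)) { return 0; }
    }
    return 1;
}
```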
