
ADSR Envelope API


rje

I was looking at Concerto:

 

I like that approach to envelope management: envelopes are configurations that any voice can use.

So I've been thinking about a sound API. 

I've thought about a full-fledged API, but I think just an envelope manager with an interrupt is the right way.

 

VOICE ENVELOPES: $0200 thru $020F.  Which envelope each voice is using, if any.
VOICE STATES: $0210 thru $021F.  (Internal) state variable.  Current state ('A','D','S','R', or 0 for 'done') of voice.  Used by interrupt.
VOICE TARGETS: $0220 thru $022F.  (Internal) state variable.  Target value for voice.  Used by interrupt.
ENVELOPE DATA: $0230 and up.  Envelope configurations.  Four bytes per envelope: one each for Attack, Decay, Sustain, and Release, in jiffies or something.

 

Theoretically there can be a whole bunch of envelope configs.  Concerto (above) has three.  Performance testing can determine what a reasonable number is (somewhere between 1 and 16+).

...and then an interrupt which modifies voice volume based on the envelope it's set at.

So each cycle, the interrupt checks each voice state and the voice's current volume:

  • If the current state is 0, then skip.
  • If the current state is 255, then set the current state to 'A' and set up the temporary variables for this voice.
  • If the current state is 'A':
    • increase volume.  steepness is based on attack value.
    • if current volume == target, update state to 'D'.
  • If the current state is 'D':
    • decrease volume.  steepness is based on decay value.
    • if current volume == target, update state to 'S'.
  • If the current state is 'S':
    • decrease target.  steepness is based on sustain value.
    • if target == 0, update state to 'R'.
  • If the current state is 'R':
    • decrease volume. steepness is based on release value.
    • if current volume == 0, update state to 0.
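
The per-tick logic above could be sketched in C roughly as follows. This is only an illustration under several assumptions that are not in the post: each envelope byte is treated as a per-tick step size, the peak is the 6-bit PSG maximum of 63, the decay target is half the peak, and the sustain countdown starts at 255. The names `Voice`, `approach` and `envelope_tick` are hypothetical.

```c
#include <stdint.h>

#define MAX_VOLUME    63    /* VERA PSG volume is 6 bits */
#define DECAY_TARGET  31    /* assumed: decay ends at half of the peak */
#define SUSTAIN_COUNT 255   /* assumed: starting value of the sustain countdown */
#define STATE_DONE    0
#define STATE_START   255

typedef struct { uint8_t attack, decay, sustain, release; } Envelope;

typedef struct {
    uint8_t envelope;   /* which envelope configuration this voice uses */
    uint8_t state;      /* 'A', 'D', 'S', 'R', STATE_START or STATE_DONE */
    uint8_t target;     /* internal: value the current state is heading for */
    uint8_t volume;     /* current volume, 0..MAX_VOLUME */
} Voice;

/* Move *value toward goal by step without overshooting. */
static void approach(uint8_t *value, uint8_t goal, uint8_t step)
{
    if (step == 0) step = 1;   /* a zero step would stall the voice forever */
    if (*value < goal)
        *value = (uint8_t)((goal - *value <= step) ? goal : *value + step);
    else
        *value = (uint8_t)((*value - goal <= step) ? goal : *value - step);
}

/* One tick of the envelope for a single voice. */
void envelope_tick(Voice *v, const Envelope *envelopes)
{
    const Envelope *e = &envelopes[v->envelope];
    switch (v->state) {
    case STATE_START:                      /* set up the temporaries */
        v->volume = 0;
        v->target = MAX_VOLUME;
        v->state = 'A';
        break;
    case 'A':
        approach(&v->volume, v->target, e->attack);
        if (v->volume == v->target) { v->target = DECAY_TARGET; v->state = 'D'; }
        break;
    case 'D':
        approach(&v->volume, v->target, e->decay);
        if (v->volume == v->target) { v->target = SUSTAIN_COUNT; v->state = 'S'; }
        break;
    case 'S':                              /* target doubles as a countdown */
        approach(&v->target, 0, e->sustain);
        if (v->target == 0) v->state = 'R';
        break;
    case 'R':
        approach(&v->volume, 0, e->release);
        if (v->volume == 0) v->state = STATE_DONE;
        break;
    default:                               /* STATE_DONE: skip */
        break;
    }
}
```

Clamping inside `approach` is a design choice of this sketch: it guarantees the "current volume == target" tests are eventually true even when the step does not divide the distance evenly.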

 

Edited by rje

I think sound is going to fall into two broad categories: active generation and raw playback.

My sound “engine” in Flappy Bird is the latter. I decided to take that route after analyzing the sound engine for Wolf3d.

This is the lightest possible implementation and leaves the most CPU resources for the rest of your program. My sound play routine actually uses the same stream format and update function for both FM and PSG.

It depends on data streams which already contain all of the ADSR / pitch bending / PWM type of stuff baked in. This means a tool must produce this stream (or in my case you make it by hand).

Look for me on Discord. I’d like to discuss this with some of the other sound aficionados.


  • 7 months later...

My really cheap proof of concept ADSR code is here: https://github.com/bobbyjim/x16-c-tools/blob/main/PSG.c

Note that the code that "runs the envelope" is not an interrupt.  So it's a POC, not a real usable thing really.

 

Envelopes use:

typedef struct {
  long phase; // along the ADSR envelope
  int attack :16;
  int decay :16;
  int sustain :16;
  int release :16;
  } Envelope;

 

They're linear timeouts.  So, not as versatile as the C64's envelopes.  Still, 2^16 jiffies is... uh, I think it's 18 minutes.

 

65536 jiffies x 1 second / 60 jiffies x 1 minute / 60 seconds = 65536/3600 = 18 min.
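
As a rough illustration of what "linear timeouts" means here, the sketch below derives a level from the phase by walking through the four segment lengths. It is not the code from PSG.c: the struct is re-declared with fixed-width types, and `PEAK_LEVEL`, `SUSTAIN_LEVEL`, `lerp_level` and `envelope_level` are hypothetical names and values chosen for the example.

```c
#include <stdint.h>

#define PEAK_LEVEL    63   /* assumed: full 6-bit PSG volume */
#define SUSTAIN_LEVEL 40   /* assumed: the struct above stores no sustain level */

typedef struct {
    uint32_t phase;                            /* jiffies since the note started */
    uint16_t attack, decay, sustain, release;  /* segment lengths in jiffies */
} LinearEnvelope;

/* Linear interpolation from 'from' to 'to', t jiffies into a segment of len. */
static uint8_t lerp_level(uint8_t from, uint8_t to, uint32_t t, uint32_t len)
{
    int32_t delta = (int32_t)to - (int32_t)from;
    return (uint8_t)(from + delta * (int32_t)t / (int32_t)len);
}

/* Level (0..PEAK_LEVEL) for the envelope's current phase. */
uint8_t envelope_level(const LinearEnvelope *e)
{
    uint32_t t = e->phase;
    if (t < e->attack)  return lerp_level(0, PEAK_LEVEL, t, e->attack);
    t -= e->attack;
    if (t < e->decay)   return lerp_level(PEAK_LEVEL, SUSTAIN_LEVEL, t, e->decay);
    t -= e->decay;
    if (t < e->sustain) return SUSTAIN_LEVEL;
    t -= e->sustain;
    if (t < e->release) return lerp_level(SUSTAIN_LEVEL, 0, t, e->release);
    return 0;           /* past the end of the release: silent */
}
```

With 16-bit segment lengths, each segment can last up to 65535 jiffies, which is the roughly 18 minutes worked out above at 60 jiffies per second.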

Edited by rje

  • 1 month later...

@rje What kind of functionality would you think would be good for such a C library? The questions I am asking myself are

  • would you rather set a MIDI note (library handles conversion to frequency) or a 16-bit frequency directly?
  • What envelope parameters are desired? Do we need the full ADSR functionality, or would it be sufficient to define attack, release and hold duration (the time in between attack phase and release phase, potentially zero)  (in other words, dropping the decay phase)
  • Should you pass the PSG voice index (0-15) during the function call, or should the library automatically choose a voice?

I am thinking that the easiest would be to have a single function to which you pass all required parameters, like frequency, ADSR parameters, waveform. So you can call the function and then simply forget about it.


I think the frequency/MIDI question boils down to the applications you have in mind. The MIDI to frequency conversion is a relatively expensive operation. It's only worth it if the intended use of such a library would be to play music. In all other cases (e.g. sound effects), it is both more efficient and less limiting to directly set the frequency.
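
One common way to take the cost out of the conversion is a lookup table built once. The sketch below assumes the VERA PSG relation frequency_word = Hz x 2^17 / 48828.125 and uses hypothetical names (`note_table`, `build_note_table`, `midi_to_psg_word`). It uses floating point, so on the X16 itself the table would more likely be generated offline and included as constant data.

```c
#include <stdint.h>

#define PSG_SAMPLE_RATE 48828.125            /* assumed: 25 MHz / 512 */
#define SEMITONE        1.0594630943592953   /* 2^(1/12) */

static uint16_t note_table[128];

/* Convert a frequency in Hz to the 16-bit PSG frequency word. */
uint16_t hz_to_psg_word(double hz)
{
    double word = hz * 131072.0 / PSG_SAMPLE_RATE + 0.5;   /* round to nearest */
    return (word >= 65535.0) ? (uint16_t)65535u : (uint16_t)word;
}

/* Fill the table for MIDI notes 0..127, working outward from A4 = 440 Hz. */
void build_note_table(void)
{
    double hz = 440.0;                       /* MIDI note 69 */
    int n;
    for (n = 69; n < 128; n++) {
        note_table[n] = hz_to_psg_word(hz);
        hz *= SEMITONE;
    }
    hz = 440.0 / SEMITONE;                   /* MIDI note 68 */
    for (n = 68; n >= 0; n--) {
        note_table[n] = hz_to_psg_word(hz);
        hz /= SEMITONE;
    }
}

/* Per-note cost at play time is a single table read. */
uint16_t midi_to_psg_word(uint8_t note)
{
    return note_table[note & 0x7F];
}
```

A "play frequency" entry point would then skip the table and pass the 16-bit word straight through, which keeps sound effects cheap.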

Dropping the decay phase makes sense IMO, since in most cases, you either want a short sound (decay = release), or a sustained sound or beep (no decay phase, since you directly enter sustain phase).


On 1/26/2022 at 2:31 AM, kliepatsch said:

@rje What kind of functionality would you think would be good for such a C library?


...

I am thinking that the easiest would be to have a single function to which you pass all required parameters, like frequency, ADSR parameters, waveform. So you can call the function and then simply forget about it.

Here are the functions I coded up in my proof-of-concept.

void runVoice( unsigned voiceNumber, Voice* voice );
void runVoiceWithEnvelope( unsigned voiceNumber, Voice* voice );
int getTunedNote( unsigned index );
void bang(unsigned frequency);

Each voice has a dedicated envelope, so runVoiceWithEnvelope() can figure out the envelope's address.

 


On 1/26/2022 at 3:01 AM, kliepatsch said:

Dropping the decay phase makes sense IMO, since in most cases, you either want a short sound (decay = release), or a sustained sound or beep (no decay phase, since you directly enter sustain phase).

I can see that.  And that's a good simplification for an interrupt-driven envelope manager.  Thanks.


On 1/26/2022 at 3:31 AM, kliepatsch said:

@rje What kind of functionality would you think would be good for such a C library? The questions I am asking myself are

  • would you rather set a MIDI note (library handles conversion to frequency) or a 16-bit frequency directly?
  • What envelope parameters are desired? Do we need the full ADSR functionality, or would it be sufficient to define attack, release and hold duration (the time in between attack phase and release phase, potentially zero)  (in other words, dropping the decay phase)
  • Should you pass the PSG voice index (0-15) during the function call, or should the library automatically choose a voice?

I am thinking that the easiest would be to have a single function to which you pass all required parameters, like frequency, ADSR parameters, waveform. So you can call the function and then simply forget about it.

Decay is if you want pseudo horns, woodwinds, string instruments (including piano) or percussion, though if you wanted to simplify decay, you would go a long way with a decay of 1/4, 1/2 or 3/4 the attack, at the same rate as the attack, so attack level, attack duration, decay type (of four, two bits, also defines sustain level of 100%, 75%, 50% or 25% of attack level), sustain duration, decay duration.

You could, eg, define 7 patches, with each voice having a two-byte ADSR index, the top three bits indicate the selected patch (or ADSR off) and the bottom 13 bits indicating progress through the envelope.
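
A minimal sketch of that two-byte index, assuming patch value 0 is the one that means "ADSR off" (the post does not say which value is reserved) and using hypothetical helper names:

```c
#include <stdint.h>

#define PATCH_OFF     0u        /* assumed: patch 0 means "ADSR off" */
#define PROGRESS_MASK 0x1FFFu   /* bottom 13 bits: progress through the envelope */

/* Pack a 3-bit patch number and 13-bit progress counter into one word. */
static uint16_t adsr_pack(uint8_t patch, uint16_t progress)
{
    return (uint16_t)(((patch & 7u) << 13) | (progress & PROGRESS_MASK));
}

static uint8_t adsr_patch(uint16_t w)     { return (uint8_t)(w >> 13); }
static uint16_t adsr_progress(uint16_t w) { return (uint16_t)(w & PROGRESS_MASK); }

/* Advance one tick; stays put when the voice is off or the counter is full. */
static uint16_t adsr_advance(uint16_t w)
{
    if (adsr_patch(w) == PATCH_OFF || adsr_progress(w) == PROGRESS_MASK)
        return w;
    return (uint16_t)(w + 1);
}
```

Thirteen bits of progress give 8192 ticks, a little over two minutes at 60 ticks per second.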

Edited by BruceMcF

  • Super Administrators
On 1/26/2022 at 12:31 AM, kliepatsch said:

@rje What kind of functionality would you think would be good for such a C library? The questions I am asking myself are

  • would you rather set a MIDI note (library handles conversion to frequency) or a 16-bit frequency directly?
  • What envelope parameters are desired? Do we need the full ADSR functionality, or would it be sufficient to define attack, release and hold duration (the time in between attack phase and release phase, potentially zero)  (in other words, dropping the decay phase)
  • Should you pass the PSG voice index (0-15) during the function call, or should the library automatically choose a voice?

I am thinking that the easiest would be to have a single function to which you pass all required parameters, like frequency, ADSR parameters, waveform. So you can call the function and then simply forget about it.

I think both note and frequency should be options. Call the "Play Note" function to pick a frequency based on the note. Call the "Play Frequency" function to play a specific frequency.  If you implement a lookup, then allowing the composer to play a frequency directly is free - you already wrote the routine, so the composer just hits a different entry point. 

There are going to be times when the programmer will want an arbitrary frequency: sound effects, pitch bends, etc. But when playing music, they will just want a note on the scale and don't care what the actual frequency is. You could even enhance this with a global tuning and pitch bend options, making the CX16 even more useful in music environments. 

Quote

What envelope parameters are desired?

You need not just ADSR but also expression (the "dynamics" of a note or passage) to handle dynamics changes. 

Thinking about ADSR, it's the least common denominator that describes all simple instruments. All instruments have an attack and all have a release. Whether they have a decay varies by instrument, but all instruments have either a decay, sustain, or both. And yes, the decay rate is different than the release rate - especially for stringed instruments. 


And we also need to consider crescendos and other dynamics changes. The "Expression" controller affects not only the final volume of the note, but may affect the filter and ADSR curve. This is necessary to allow for dynamics changes, since volume is really supposed to be used for mixing, not dynamics changes. 

Quote

Should you pass the PSG voice index (0-15) during the function call, or should the library automatically choose a voice?

One channel, one voice. But add the ability to quickly change voice parameters, so composers can switch voices when needed. It might be worth trying to dynamically allocate voices, but remember that every byte of overhead for your player is a byte that the application doesn't get. I think this is a case where space is more important than functionality. 

 


Well, rje has been talking about an ADSR manager for a while now, and I was interested in his wishes specifically. My impression was that he doesn't want a second Concerto, but rather address a simple problem: currently you cannot make a sound with a single line of code (or two). I may be wrong ...


  • Super Administrators
On 1/26/2022 at 11:29 PM, kliepatsch said:

currently you cannot make a sound with a single line of code (or two). I may be wrong

Yeah, the Commander definitely needs commands for managing sound. BASIC 7 had a PLAY command, which allowed for not only playing music, but changing audio parameters on the fly. And it has a separate command for playing beeps and sound effects. Both were useful, and the PLAY command was especially nice since it had a sequencer (you could issue a PLAY command and the program would continue immediately, playing the notes in the background.)

The simplest API I can think of looks like this...

adsr_load(string filename, byte program)
would load a sound into a program slot in memory. Then later, you could do: 

adsr_set_program(byte channel, byte program)
to set up the channel, then 

adsr_play_sound(byte channel, int frequency, int duration)
to play an arbitrary tone or 

adsr_play_note(byte channel, byte note)
to play a note using a MIDI note number. The note would continue to play until the program sends

adsr_play_stop(byte channel)
which would stop playback of either sound or note. 

Obviously, there would be a lot more to the API, to allow for editing the parameters of the synthesizer, but those would be the core commands.

Those could be implemented as a BASIC extension or included in a C library. Either way, the idea is that all of the setup for an instrument is encapsulated in the set_program function, so programmers can get to playing sounds very quickly. 

 


On 1/27/2022 at 1:29 AM, kliepatsch said:

Well, rje has been talking about an ADSR manager for a while now, and I was interested in his wishes specifically. My impression was that he doesn't want a second Concerto, but rather address a simple problem: currently you cannot make a sound with a single line of code (or two). I may be wrong ...

I have code that can set up a voice -- that's the easy part.  What I lack is the thing that "curates" the sound through an envelope in a "fire and forget" manner.

The ADSR manager is the piece that requires working off of an interrupt to "curate" a played note, so to speak.  In my mind, it would be handled by assembly code, since as an interrupt process it should be as efficient as possible.

Envelope_Manager:     ; voice is in X?

    load status, indexed by X
   ...dispatch based on state...

Attack: 

   increase volume by a_ratio,x
   increment status if it's at max_vol and fall through
   else rts

Decay:

   decrement volume by b_ratio,x
   increment status if it's at decay_vol and fall through
   else rts

Sustain:

   increment status if sustain is done and fall through
   else rts

Release:

   decrement volume by r_ratio,x
   turn off voice and mark done if it's at zero_volume

   rts
   

Edited by rje

On 1/28/2022 at 10:06 AM, rje said:

I have code that can set up a voice -- that's the easy part.  What I lack is the thing that "curates" the sound through an envelope in a "fire and forget" manner.

The ADSR manager is the piece that requires working off of an interrupt to "curate" a played note, so to speak.  In my mind, it would be handled by assembly code, since as an interrupt process it should be as efficient as possible.

Envelope_Manager:     ; voice is in X?

    load status, indexed by X
   ...dispatch based on state...

Attack: 

   increase volume by a_ratio,x
   increment status if it's at max_vol and fall through
   else rts

Decay:

   decrement volume by b_ratio,x
   increment status if it's at decay_vol and fall through
   else rts

Sustain:

   increment status if sustain is done and fall through
   else rts

Release:

   decrement volume by r_ratio,x
   turn off voice and mark done if it's at zero_volume

   rts
   

I see it as being two parts: the part in the custom IRQ like you describe, plus setting a flag bit. Then your main program looks for that flag, and if it's set, sends all the voice volume data to VERA's PSG registers and clears the flag.
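
A sketch of that two-part split in C, with a plain array standing in for VERA's PSG volume registers so it can run anywhere. All names (`volumes_dirty`, `envelope_irq`, `flush_volumes`) are hypothetical, and the decrement inside the IRQ is only a placeholder for the real envelope step.

```c
#include <stdint.h>

#define NUM_VOICES 16

static volatile uint8_t volumes_dirty;             /* set by the IRQ, cleared by the main loop */
static volatile uint8_t voice_volume[NUM_VOICES];  /* levels computed in the IRQ */
static uint8_t psg_volume_reg[NUM_VOICES];         /* stand-in for VERA's PSG volume registers */

/* Part 1: runs in the custom IRQ. Updates levels only, then raises the flag. */
void envelope_irq(void)
{
    uint8_t i;
    for (i = 0; i < NUM_VOICES; i++) {
        if (voice_volume[i] > 0)
            voice_volume[i]--;        /* placeholder for the real envelope step */
    }
    volumes_dirty = 1;
}

/* Part 2: runs in the main program. Returns 1 if volumes were pushed out. */
uint8_t flush_volumes(void)
{
    uint8_t i;
    if (!volumes_dirty)
        return 0;
    volumes_dirty = 0;                /* clear first so a later IRQ re-raises it */
    for (i = 0; i < NUM_VOICES; i++)
        psg_volume_reg[i] = voice_volume[i];
    return 1;
}
```

Clearing the flag before the copy means an interrupt that lands mid-copy is picked up on the next pass instead of being lost.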


On 12/14/2021 at 7:02 PM, rje said:

... They're linear timeouts.  So, not as versatile as the C64's envelopes.  Still, 2^16 jiffies is... uh, I think it's 18 minutes.

65536 jiffies x 1 second / 60 jiffies x 1 minute / 60 seconds = 65536/3600 = 18 min.

The maximum SID length is roughly (depending on the PAL or NTSC clock) 24 seconds (for both decay and release), which is 1,440 jiffies. For attack it's 8 seconds, which is 480 jiffies.

So for linear jiffies spanning SID's maximum durations, you need 9, 11 and 11 bits (being linear, they don't have precise matches for each of the 0 to 15 values in a SID ADSR setting).

Then you need two levels: maximum volume (which with the attack length defines the attack increment), and sustain volume as a fraction of maximum volume (which combined with maximum volume and decay length defines the decay increment, and which with maximum volume and release length defines the release increment).

Sustain volume can easily be a four bit value which is a fraction relative to maximum volume, as in the SID.

So if maximum volume is a note setting (along with waveform and frequency), the linear ADR in jiffies covering the maximum lengths of the SID ADSR values combined with Sustain in "n+1" sixteenths requires 35 bits, which packs to 5 bytes, leaving up to five bits for additional information.
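
The 35-bit budget (9 + 11 + 11 + 4) can be checked with a small packing sketch. The bit order chosen here (attack lowest, sustain highest) is an assumption for illustration, as are the names `AdsrParams`, `adsr_pack5` and `adsr_unpack5`.

```c
#include <stdint.h>

typedef struct {
    uint16_t attack;    /* 9 bits:  0..511 jiffies  */
    uint16_t decay;     /* 11 bits: 0..2047 jiffies */
    uint16_t release;   /* 11 bits: 0..2047 jiffies */
    uint8_t  sustain;   /* 4 bits:  level is (sustain + 1) sixteenths of maximum */
} AdsrParams;

/* Pack the four fields into 5 bytes; the top 5 bits of out[4] stay free. */
void adsr_pack5(const AdsrParams *p, uint8_t out[5])
{
    uint64_t bits = ((uint64_t)(p->attack  & 0x1FFu))
                  | ((uint64_t)(p->decay   & 0x7FFu) << 9)
                  | ((uint64_t)(p->release & 0x7FFu) << 20)
                  | ((uint64_t)(p->sustain & 0x0Fu)  << 31);
    int i;
    for (i = 0; i < 5; i++)
        out[i] = (uint8_t)(bits >> (8 * i));
}

/* Inverse of adsr_pack5. */
void adsr_unpack5(const uint8_t in[5], AdsrParams *p)
{
    uint64_t bits = 0;
    int i;
    for (i = 0; i < 5; i++)
        bits |= (uint64_t)in[i] << (8 * i);
    p->attack  = (uint16_t)(bits & 0x1FFu);
    p->decay   = (uint16_t)((bits >> 9) & 0x7FFu);
    p->release = (uint16_t)((bits >> 20) & 0x7FFu);
    p->sustain = (uint8_t)((bits >> 31) & 0x0Fu);
}
```

On a 6502 the same layout would be handled with shifts and masks per byte rather than a 64-bit intermediate; the wide type is only for readability here.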


  • Super Administrators
On 1/28/2022 at 9:06 AM, rje said:

I have code that can set up a voice -- that's the easy part.  What I lack is the thing that "curates" the sound through an envelope in a "fire and forget" manner.

The ADSR manager is the piece that requires working off of an interrupt to "curate" a played note, so to speak.  In my mind, it would be handled by assembly code, since as an interrupt process it should be as efficient as possible.

Yes, you need every cycle you can get, and honestly, this isn't hard to do. It's all simple integer math and loops. 

So obviously you need a Stage flag, along with the Attack Rate, Decay Rate, Sustain Level, and Release Rate. You will also need "Current Level", "Channel Volume", and "Expression Volume" parameters. 

I'm making some assumptions:

  • Stage is a bit mask
  • Attack, Decay, and Release are byte values. 
    • 0=slowest, 255=fastest. Max time is 4.25 seconds (or 1092 seconds with 16-bit counters)
    • Value is change per tick.  Larger number = faster attack/decay/release
  • Sustain and Current Level are byte values. 0=silent, 255=loudest
  • Channel Volume and Expression are inverse byte values. So 0=loudest, 255=silent

Stage:
0: Not active
1: Note start
2: Attack
4: Decay
8: Sustain
16: Release

So you can check the current stage with an LSR/BCS cycle, like this: 

LSR
BCS StartNote
LSR
BCS Attack
LSR 
BCS Decay
LSR 
BCS Sustain
LSR
BCS Release
JMP SetNoteVolume

  • StartNote would initialize the sound generator to the right frequency and set the volume to the starting level.
  • Attack would increment the volume by AttackRate each tick. So if AttackRate is 1, it takes 4.25 seconds to reach full volume. So higher values attack faster, with 255 being the shortest attack rate (one tick.) 
    • Add Level and AttackRate 
    • Is Volume = 255 (or 65535)?
      • Yes, set Stage to Decay
    • Jump to SetNoteVolume
  • Decay subtracts DecayRate from volume
    • If volume <= SustainLevel
      • Set volume to SustainLevel
      • Set Stage to Sustain
    • Jumps to SetNoteVolume.
  • Sustain is just a test against "note on". When this changes to "note off", advance Stage to Release. 
  • Release subtracts ReleaseRate.
    • If Volume <= 0 set Stage to 0

And SetNoteVolume is where you set the final output levels, after accounting for channel volume and expression volume. I don't know how the PSG handles channel volume, but the formula is: Output = Oscillator_Level * Expression * Volume, where all values are real numbers between 0 and 1, inclusive. 

In other words, if your oscillator level is currently 0.8, your volume is 0.75, and your expression is 0.75, the final output level is actually 0.45. 

Obviously, you don't want to do floating point math to make this work, so I suggest storing channel volume and expression as attenuation levels. Attenuation is a fancy word for "subtraction", so your formula would be Output= Oscillator_Level - Expression - Volume

So Channel Volume and Expression would be loudest at 0 and quiet at 255. 
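
For comparison, here are both forms as a small C sketch: the subtractive one suggested above, clamped at zero, and an integer version of the multiplicative formula. Function names are hypothetical. The two only agree exactly when the levels are stored on a logarithmic scale, which this sketch does not assume.

```c
#include <stdint.h>

/* Subtractive form: level 255 = loudest; attenuations 0 = loudest. */
uint8_t attenuate(uint8_t level, uint8_t expression, uint8_t channel_volume)
{
    int out = (int)level - expression - channel_volume;
    return (uint8_t)(out < 0 ? 0 : out);     /* never go below silence */
}

/* Multiplicative form with all three values as gains, 255 = 1.0. */
uint8_t scale_multiply(uint8_t level, uint8_t expression_gain, uint8_t volume_gain)
{
    uint32_t product = (uint32_t)level * expression_gain * volume_gain;
    return (uint8_t)(product / (255u * 255u));
}
```

With 204, 191 and 191 standing in for 0.8, 0.75 and 0.75, `scale_multiply` gives 114, close to 0.45 of full scale, at the cost of two multiplies the 6502 has no instruction for. `attenuate` needs only two subtractions and a clamp.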

This sounds like a lot, but each parameter is important:

  • Channel Volume is like the big fader on a mixer. This adjusts the instrument's overall value in a mix. This is the relative volume of one instrument compared to the others, so channel volume is usually fixed for the duration of the passage.
  • Expression is the volume for that note. Expression will change over time as the dynamics of the music change. It will also go up when an instrument needs to be out front for a solo and back down when an instrument needs to blend for harmonies.
  • Program Volume (or patch volume) is the volume for a specific patch. This is based on how loud this sound is, relative to the other patches in your sound bank. As one of the final stages of editing a patch, you'll adjust this so all of your patches have a similar volume level. This means a track won't get louder or quieter when you change instruments. 

Obviously, this could be simplified for simple sound effects. By starting with 0 for both Volume and Expression, the sound engine would play all sounds at max level. Then the application programmer can decide whether to reduce the sounds for mixing audio by setting those values to a non-zero number for mixing and dynamics purposes... or, perhaps, to have loud sounds be closer to the player character and quieter sounds be further away. 

 


On 1/28/2022 at 7:36 PM, TomXP411 said:
  • Attack would increment the volume by AttackRate each tick. So if AttackRate is 1, it takes 4.25 seconds to reach full volume. So higher values attack faster, with 255 being the shortest attack rate (one tick.) 
    • Add Level and AttackRate
    • Is Volume < 255? 
      • Yes, set Stage to Decay
    • Jump to SetNoteVolume
  • Decay subtracts DecayRate from volume. It then tests the volume against the SustainLevel, then jumps to SetNoteVolume
  • Sustain is just a test against "note on". When this changes to "note off", advance Stage to Release
  • Release is the same as Decay, except the final level is 0. When level is 0, you can set the stage back to 0. 

I presume that in

    • Is Volume < 255? 
      • Yes, set Stage to Decay

either the "<" is an "=" or the "Yes" is actually an "if no".

Note that linear attack settings can readily be "n+1" by simply using SEC rather than CLC when performing the addition, and then it is "Is volume = 256", which is BCS. Then the slowest attack is "0" rather than "1".
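
The SEC trick can be mimicked in C to see the effect: the carry-in adds one to every step, and the carry-out is the "volume reached 256" test that BCS would branch on. `attack_step` is a hypothetical name for this illustration.

```c
#include <stdint.h>

/* Emulates SEC followed by ADC rate: adds rate + 1 to *level.
   Returns 1 when the addition carries out of 8 bits (attack finished). */
uint8_t attack_step(uint8_t *level, uint8_t rate)
{
    unsigned sum = (unsigned)*level + rate + 1u;   /* +1 is the carry set by SEC */
    *level = (uint8_t)sum;                         /* low 8 bits, as left in A */
    return (uint8_t)(sum > 0xFFu);                 /* carry out: what BCS tests */
}
```

With a rate of 0 the level climbs by one per tick and carries on the 256th step, so 0 becomes the slowest usable setting. When the carry is reported the low byte has wrapped, so the caller would store the peak level itself.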

 


  • Super Administrators
On 1/28/2022 at 4:31 PM, BruceMcF said:

Sustain volume can easily be a four bit value which is a fraction relative to maximum volume, as in the SID

The PSG volume is 6 bits, so the sustain should be a minimum of 6 bits. However, to simplify programming, it's probably simpler to start with an 8-bit volume and shift that right 2 bits to get the output volume. 

Yeah, there's enough room for 16-bit math. With my example above, you'd just extend the values to 16 bits by adding 8 bits to the right, then using the high byte for the final volume calculation. 

Assume A holds the final level, after volume attenuation:
AND #%00111111
ORA ChannelMask,Y  ; the top 2 bits of the volume register are the stereo channels.
STA [PSG register 2]
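
The same steps in C, starting from a 16-bit level: take the high byte, shift it down to 6 bits, and merge in the stereo bits. The function name is hypothetical, and the `PSG_LEFT` / `PSG_RIGHT` bit assignments are assumptions to check against the VERA documentation.

```c
#include <stdint.h>

#define PSG_LEFT  0x40u   /* assumed: bit 6 enables the left channel  */
#define PSG_RIGHT 0x80u   /* assumed: bit 7 enables the right channel */

/* Build the byte for a voice's volume register from a 16-bit level. */
uint8_t psg_volume_byte(uint16_t level16, uint8_t channel_mask)
{
    uint8_t level8 = (uint8_t)(level16 >> 8);   /* high byte of the 16-bit level */
    uint8_t vol6   = (uint8_t)(level8 >> 2);    /* 8-bit level down to 6 bits */
    return (uint8_t)((channel_mask & 0xC0u) | vol6);
}
```

For example, `psg_volume_byte(0x8000, PSG_LEFT)` yields `0x60`: half volume on the left channel only.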


 


  • Super Administrators
On 1/28/2022 at 5:19 PM, BruceMcF said:

I presume that in

  •  

That either the "<" is an "=" or the if "Yes" is actually an "if no".

Note that linear attack settings can readily be "n+1" by simply using SEC rather than CLC when performing the addition, and then it is "Is volume = 256", which is BCS. Then the slowest attack is "0" rather than "1".

 

You're right. I still haven't internalized 6502 assembly enough to remember some of the finer points. 


On 1/28/2022 at 5:36 PM, TomXP411 said:

LSR
BCS StartNote
LSR
BCS Attack
LSR 
BCS Decay
LSR 
BCS Sustain
LSR
BCS Release
JMP SetNoteVolume

Alternatively, suppose the Stage byte is in zero page. I'll use address $7F for this example:

BBS0 $7F, StartNote
BBS1 $7F, Attack
BBS2 $7F, Decay
BBS3 $7F, Sustain
BBS4 $7F, Release
JMP SetNoteVolume

 

This saves 6 cycles.

Edited by Ed Minchau

I think a good way to go about it might be:

Store a set of 6.2 fixed-point values: volume, delta-volume
Store a set of 6-bit (6 LSB) sustain_level values

Set VERA address = $1F9C2 (volume register of voice 0), stride = 4
Cycle through all 16 voices:
if delta=0: read VERA data (to advance the stride) and loop to next voice
volume += delta
volume >> 2 (we'll call it 'level' now to distinguish from the 6.2 fixed-point value 'volume')
if level = sustain_level, delta = 0
if level = 0, delta = 0
if level = $3F then delta = voice's decay rate

write_vera: (label)
level |= $C0 (to enable L+R bits)
store level to VERA data (note that we do this even if the integer part didn't change - it's faster to just do it than to check first)
next voice

Triggering a note sets delta = voice's attack rate, sets volume = 0
Releasing a note sets delta = voice's release rate

This may need a little touching up to handle overflow in case the rates aren't ones that would evenly result in levels of $3F or $00.....

EDIT:
instead of comparing level to 0 and $3F, sta ZP_tmp (assuming A holds the computed 'level' after >>2) and then BIT ZP_tmp
BMI -> set delta=0, level=0, volume=0
BVC -> set delta=decay_rate, level=$3F, volume=$FC
Then do BIT delta and BPL to write_vera
else cmp level $3f... BCS to write_vera
else level=sustain_level, volume = level<<2, delta=0
write_vera:
... (same as initial code above)
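
A C sketch of the same idea for a single voice, with the overflow handling mentioned above done by clamping. It departs from the pseudocode in two labeled ways: the delta is stored signed, and a `releasing` flag (not in the original) lets a release pass through the sustain level instead of stopping there. All names are hypothetical, and rates are assumed to be in the range 1..127.

```c
#include <stdint.h>

#define FULL_VOLUME 0xFC   /* level $3F expressed in 6.2 fixed point */

typedef struct {
    uint8_t volume;         /* 6.2 fixed point; level = volume >> 2 */
    int8_t  delta;          /* signed 6.2 change per tick; 0 = idle */
    uint8_t sustain_level;  /* 6-bit level */
    uint8_t attack_rate;    /* 6.2 magnitudes, assumed 1..127 */
    uint8_t decay_rate;
    uint8_t release_rate;
    uint8_t releasing;      /* assumption: set once the key is released */
} FxVoice;

/* Advance one tick and return the byte for the PSG volume register. */
uint8_t fx_tick(FxVoice *v)
{
    if (v->delta != 0) {
        int next = (int)v->volume + v->delta;
        int sustain = (int)v->sustain_level << 2;
        if (next >= FULL_VOLUME) {            /* attack reached or overshot the top */
            next = FULL_VOLUME;
            v->delta = (int8_t)(-(int)v->decay_rate);
        } else if (v->delta < 0 && !v->releasing && next <= sustain) {
            next = sustain;                   /* decay reached or passed sustain */
            v->delta = 0;
        } else if (next <= 0) {               /* release reached or passed zero */
            next = 0;
            v->delta = 0;
        }
        v->volume = (uint8_t)next;
    }
    return (uint8_t)(0xC0u | (v->volume >> 2));   /* L+R bits plus 6-bit level */
}

void fx_trigger(FxVoice *v)
{
    v->volume = 0;
    v->releasing = 0;
    v->delta = (int8_t)v->attack_rate;
}

void fx_release(FxVoice *v)
{
    v->releasing = 1;
    v->delta = (int8_t)(-(int)v->release_rate);
}
```

Looping `fx_tick` over 16 such structs and writing each result to the VERA data port, with the address stride set to 4, would follow the outline above.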

Edited by ZeroByte

On 1/28/2022 at 9:46 PM, Ed Minchau said:

Alternatively, suppose the Stage byte is in zero page. I'll use address 7F for this example:

BBS0 7F, StartNote 

BBS1 7F, Attack

BBS2 7F, Decay

BBS3 7F, Sustain

BBS4 7F, Release

JMP SetNoteVolume

 

This saves 6 cycles.

Alternatively, suppose it is stored at any specific address, with Bit5 as the key-on flag and the stage number 0-3 in Bit7/Bit6. With Bit5=1, the four Key-On stages are Note On, Attack, Decay, Sustain. With Bit5=0, the four Key-Off stages are No Note, Attack, Decay, Release. So Bit7=Bit6=1 with Bit5=0 is Release, Bit5=Bit6=Bit7=0 is note off, and Attack and Decay run to completion even after key-off.

LDA #%00100000
BIT ADSRSTAGE ; N = Bit7, V = Bit6, Z clear if Bit5 (key on) is set
BMI DECAY
BVS ATTACK
BNE NOTEON
RTS ; All bits zero, note off
NOTEON:
; Note on code here
LDA ADSRSTAGE
ORA #%01000000
STA ADSRSTAGE
; first Attack
ATTACK:
; Attack Code here
; when advance to decay, "LDA ADSRSTAGE : CLC : ADC #$40 : STA ADSRSTAGE"
RTS
DECAY: BVS RELEASE
; DECAY code here
; when advance to sustain, "LDA ADSRSTAGE : ORA #%01000000 : STA ADSRSTAGE"
; also when advance to sustain, set the sustain level (since test is PASSING the level)
RTS
RELEASE:
BNE SUSTAIN ; Sustain if Key is still on
; Release code here
; Advance to note off by "LDA ADSRSTAGE : AND #%00011111 : STA ADSRSTAGE"
SUSTAIN: ; sustain code is "no, you're good"
RTS

_______________________

Edit: Note that if you prefer "Note On/Off" in the high bit, then the stage in bits 5 and 6 work equivalently ...

LDA #%00100000
BIT ADSRSTAGE
BVS DECAY
BNE ATTACK
BMI NOTEON
...
DECAY: BNE RELEASE
...
RELEASE: BMI SUSTAIN
...

 

Edited by BruceMcF
