Lorin Millsap

DMA guidelines


Now that I’ve let out some information about the expansion bus, I’m sure there will be hardware enthusiasts eager to develop expansion cards, so I figured some guidelines need to be laid out.

 

Before we get into how to do DMA, I want to define it, as other systems may do it differently or apply a slightly different meaning to it. DMA stands for Direct Memory Access and here refers to any operation where the CPU yields control of the bus to another device. It is not the same as wait states, and it is not the same as bus sharing. Also, any device that passively watches the bus but does not interfere with or alter it in any way is not DMA.

 

So for starters I’ll explain the means by which DMA can be implemented. Several lines on the expansion bus are central to DMA: RDY on pin 8 and BE on pin 12, plus _ML (MLB) on pin 5 and SYNC on pin 10.

 

So what do these pins do? Starting with RDY, this line is normally high, and when it is high it allows the CPU to run. When this line is pulled low the CPU will halt in its current state. On its own this is how wait states can be implemented, but it must be used in conjunction with the BE line in order to act as DMA. The BE or Bus Enable line is also normally high, and when it is high it means the CPU is driving the address lines and, during a write operation, the data lines as well. Pulling the BE line low will cause the CPU’s output lines to go into a high impedance state, or in simple terms, it turns them off so that the CPU no longer influences the state of the address or data lines (and a few other outputs as well). So if you pull both the RDY and BE lines low, the CPU effectively becomes disabled and thus allows another device to take full control of the system bus. _ML (MLB) is an active low signal and stands for Memory Lock; when this line is low it means it’s not safe to start a DMA operation. The SYNC pin is normally low but goes high when the CPU is fetching an opcode. This pin could be used for certain DMA operations, but will usually be left unused.
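
The RDY/BE combinations described above can be summarized as a small truth table. This is only an illustrative Python model, not part of the X16 design; the function name `cpu_bus_state` and the returned labels are made up for the example. The post does not cover BE low with RDY high, so the model just marks that case as not covered.

```python
def cpu_bus_state(rdy: bool, be: bool) -> str:
    """Classify the CPU's state from the RDY and BE levels (True = high)."""
    if rdy and be:
        return "running"      # both high: CPU runs and drives the bus
    if not rdy and be:
        return "wait state"   # CPU halted but still driving address/data
    if not rdy and not be:
        return "dma"          # CPU halted and its outputs are high impedance
    return "not covered"      # BE low while the CPU keeps running


for rdy in (True, False):
    for be in (True, False):
        print(f"RDY={int(rdy)} BE={int(be)} -> {cpu_bus_state(rdy, be)}")
```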

 

So just some examples of what you could use this for include debugger hardware, block copy functions, high speed disk controllers, high speed network controllers, coprocessors, compatibility cards, freeze/replay cards, etc.

 

Now, while it may seem like you can do just about anything with this, there are some things to be aware of, and that is a large part of why I’m writing this guide. So let’s get into the things to be considered before DMA can be implemented.

 

1. There is no real onboard DMA arbitration. What does this mean? It means the X16 has no onboard means to prevent DMA conflicts. As an example, if a Z80 card happens to be performing a DMA access to VERA and in the middle of that another DMA card tries to perform an operation, the two will conflict and garbled data may be sent to unintended addresses. My recommendation is that all DMA capable cards check the state of the _ML, RDY, and BE lines before attempting a DMA operation. If any of these is low, it means that memory is not clear or another DMA operation is in progress, and the card should wait until all three are high on the next high phase of the system clock. This solution is probably not perfect but it should help avoid most conflicts. While a card is performing a DMA operation it should pull _ML (MLB) low to indicate that an operation is in progress.

 

2. DMA devices should be disabled by default and only assert control when enabled. To clarify, with limited exceptions DMA cards should never perform DMA operations without first being granted permission by the host, typically by the host setting a flag. An example is a disk controller that can inject data directly into system memory. Such a controller should not do so until the host indicates that it is safe by setting an execute flag. If a card needs to get the attention of the host it can do so using the _IRQ line.

 

3. DMA devices need to respect the RESET line. A reset should restore the devices back to their default state and these devices should not get stuck in a DMA state. If a RESET occurs during a DMA operation the operation must be aborted.

 

4. DMA devices need to respect the speed ratings of all main board components that they can access. These specifications have yet to be more clearly defined and we will release timing charts as we get closer to a finished design.

 

5. While DMA devices do not necessarily have to operate on the host clock, they must not interfere with the system clock and in most cases this clock needs to be involved in timing the RDY and BE signals. Also read and write operations do need to meet the system clock requirements.
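
Guidelines 1 to 3 can be sketched as a small behavioral model. This is a rough Python sketch under stated assumptions, not a hardware implementation: the `Bus` and `DmaCard` classes and their method names are hypothetical, clock phase and signal timing are ignored, and treating the host's enable flag as one-shot is an assumption of the example rather than something the guidelines require.

```python
from dataclasses import dataclass


@dataclass
class Bus:
    """Shared lines as seen by every card; True = high (idle)."""
    rdy: bool = True
    be: bool = True
    ml: bool = True   # _ML (MLB), active low


class DmaCard:
    def __init__(self) -> None:
        self.enabled = False   # guideline 2: disabled until the host sets a flag
        self.active = False    # True while this card owns the bus

    def host_enable(self) -> None:
        self.enabled = True

    def try_start(self, bus: Bus) -> bool:
        """Guideline 1: only start when _ML, RDY and BE are all high."""
        if not self.enabled or self.active:
            return False
        if not (bus.ml and bus.rdy and bus.be):
            return False                      # memory locked or DMA in progress
        bus.rdy = bus.be = bus.ml = False     # halt CPU, tri-state it, flag busy
        self.active = True
        return True

    def finish(self, bus: Bus) -> None:
        if self.active:
            bus.rdy = bus.be = bus.ml = True  # hand the bus back to the CPU
            self.active = False
            self.enabled = False              # assumption: flag is one-shot

    def reset(self, bus: Bus) -> None:
        """Guideline 3: RESET aborts any transfer and restores defaults."""
        if self.active:
            bus.rdy = bus.be = bus.ml = True
        self.active = False
        self.enabled = False


bus = Bus()
card_a, card_b = DmaCard(), DmaCard()
print(card_a.try_start(bus))   # False: not enabled by the host yet
card_a.host_enable()
card_b.host_enable()
print(card_a.try_start(bus))   # True: all three lines were high
print(card_b.try_start(bus))   # False: the lines are now low
card_a.reset(bus)              # a reset mid-transfer releases the bus
print(bus)
```

The model checks and asserts the lines in one step, which real cards sampling the bus independently cannot do, so it says nothing about two cards that sample on the same clock edge.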

 

I will edit this post with relevant information and look forward to discussion and questions regarding it.

 

 

8 hours ago, Lorin Millsap said:

My recommendation is that all DMA capable cards check the state of the RDY and BE lines before attempting a DMA operation. If either is low the card should wait until they are both high on the next high phase system clock. This solution is probably not perfect but it should help avoid most conflicts.

This will not work reliably.  If multiple cards can take the bus at the same time they definitely will.

Without hardware arbitration there isn’t much choice except for software to enable only one DMA capable card at a time.

I can think of a few other possible hazards but I want to mull over some details before bringing them up.
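
The race can be shown with a toy model: if two cards sample the lines on the same clock edge, both see them high and both decide to drive the bus. This is only a sketch; the function name `cards_granted` and the card names are hypothetical.

```python
from typing import List


def cards_granted(lines_idle: bool, requesting: List[str]) -> List[str]:
    """Return the cards that decide to take the bus on this clock edge.

    Each card only sees the shared line state sampled at the edge, so every
    requester reaches the same decision independently.
    """
    return list(requesting) if lines_idle else []


winners = cards_granted(lines_idle=True, requesting=["z80", "network"])
print(winners)            # both cards sampled "idle" on the same edge
print(len(winners) > 1)   # more than one bus master means a conflict
```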

This will not work reliably.  If multiple cards can take the bus at the same time they definitely will. Without hardware arbitration there isn’t much choice except for software to enable only one DMA capable card at a time.

I can think of a few other possible hazards but I want to mull over some details before bringing them up.

 

Hence why DMA cards need to provide at least some safeguards on their own. There is a reason I made this point number 1. It wasn’t to say hey, we’d like your feedback on how to accommodate this. It is a big warning that the X16 doesn’t provide arbitration, so heads up. We aren’t going to provide DMA arbitration. Not happening. DMA capable cards should make at least some effort by checking first. Is a clash still possible? Yes. But no DMA card should work autonomously. This method should work well enough; many multi master type setups do the same.

 

Your second paragraph is literally what my second rule states. So in a hypothetical, let’s say you have a Z80 card that needs access to the host IO. It has several approaches. It could put a request in some set of registers or shared memory and then trigger an IRQ on the host so the host handles it, or it could use DMA. If you then had, say, a network card, it would stand to reason that the Z80 card would not know how to use the network card anyway. So the network card should be disabled anyway. The same goes for a coprocessor card or a blitter engine, etc. All of those would need to be activated by software that knows how to use them. None of these would ever act autonomously.

 

I’ll hear you out but reality is these hazards need to be considered by the card designer, it is not in our scope to accommodate them.

 

 

2 hours ago, Lorin Millsap said:

I’ll hear you out but reality is these hazards need to be considered by the card designer, it is not in our scope to accommodate them.

But what good is a retrocomputer without PCI Express and Thunderbolt?😜

9 hours ago, Lorin Millsap said:

I'll hear you out but reality is these hazards need to be considered by the card designer, it is not in our scope to accommodate them.

Of course.  I guess my comments were not clear because I did not mean to imply you should change any hardware.  But you should be more explicit that software must enable DMA cards only one at a time.  I inferred from your comments that you think sampling RDY/BE is good enough to prevent (most) conflicts between competing requestors.  It is not.

Now, on to other hazards for expansion card designers to consider:

Read-modify-write operations are meant to be atomic.  Expansion cards should not take the bus during these.  The 6502 marks RMW cycles with MLB=0.

It is probably not safe to take the bus during cycles to certain IO spaces.  Writes to Vera auto-increment addresses come to mind.  Might two increments end up happening if one of these cycles is paused?  Maybe the safe route is to take the bus only during opcode fetch (when SYNC=1).  When doing this MLB can be ignored.
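
As a rough sketch of the two policies mentioned here (avoid read-modify-write cycles, or take the bus only on opcode fetch), written in Python for illustration. The function name `may_take_bus` and the `fetch_only` parameter are hypothetical, and the check covers only SYNC and MLB, not RDY or BE.

```python
def may_take_bus(sync_high: bool, mlb_high: bool, fetch_only: bool = True) -> bool:
    """Decide whether a card may request the bus on this cycle.

    fetch_only=True  -> conservative policy: only during opcode fetch (SYNC=1),
                        in which case MLB can be ignored.
    fetch_only=False -> any cycle that is not marked as read-modify-write
                        (MLB high).
    """
    if fetch_only:
        return sync_high
    return mlb_high


print(may_take_bus(sync_high=True, mlb_high=False))                     # opcode fetch
print(may_take_bus(sync_high=False, mlb_high=True))                     # not a fetch
print(may_take_bus(sync_high=False, mlb_high=False, fetch_only=False))  # RMW cycle
```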

Of course.  I guess my comments were not clear because I did not mean to imply you should change any hardware.  But you should be more explicit that software must enable DMA cards only one at a time.  I inferred from your comments that you think sampling RDY/BE is good enough to prevent (most) conflicts between competing requestors.  It is not.

Now, on to other hazards for expansion card designers to consider:

Read-modify-write operations are meant to be atomic.  Expansion cards should not take the bus during these.  The 6502 marks RMW cycles with MLB=0.

It is probably not safe to take the bus during cycles to certain IO spaces.  Writes to Vera auto-increment addresses come to mind.  Might two increments end up happening if one of these cycles is paused?  Maybe the safe route is to take the bus only during opcode fetch (when SYNC=1).  When doing this MLB can be ignored.

Good points. I will try to change the wording to clarify that DMA cards must be enabled one at a time. That part will largely be a software side issue, not a hardware issue. This is a good point to discuss and clarify. Perhaps a good way to specify it is that all DMA operations need to be activated by setting a flag host side. This would mean DMA devices would never just start unannounced. That should minimize the potential for conflicts.

 

Good point on read-modify-write operations. I will update the guide accordingly. It also occurred to me that a card can pull _ML low during DMA to let the rest of the system know that an operation is in progress.

 

As to accessing IO space there are times when you would want DMA to alter IO. Avoiding conflicts there is going to largely be a software side issue, if you don’t want a VERA auto increment being disrupted, don’t invoke a DMA operation during a VERA access operation. If DMA based systems are designed so they have to be activated host side that makes it easy to avoid certain conflicts in software.

 

 


So one exception case I can bring up, for a DMA capable card that could take control without the CPU asking, is again a disk controller card. See, a disk controller that’s more advanced than the onboard SD card would be useless if the KERNAL doesn’t know about it. That means to use it you either need to reflash the KERNAL with one that supports the card, or you need to load the KERNAL patch from the onboard SD with each restart, or the card needs a way to inject the relevant code into the host.

 

The first two options do have their own respective drawbacks, but are viable. The third option is the one I want to focus on, however, as it would require DMA. See, the default IO ranges are only 32 bytes in size and this is simply too small for a ROM. Now you could do a type of indirectly addressable ROM, but the KERNAL would have to know it’s there and how to use it, and while it could be a viable option, it’s probably not a very good one. But an option that circumvents all memory limitations and can support fully automated loading is DMA injection.

 

So the method here is something like a disk controller would have a microcontroller on it to handle something like SATA anyway. This microcontroller could take over the system bus following a reset and inject the proper patches into RAM. There is obviously a risk that these patches could overwrite something, but that can be addressed in software and these patches would redirect the vector table to the new routines before returning to the stock routines. So the code would probably get placed in an area reserved for KERNAL use anyway.

 

Since this method would not get triggered by the CPU it would need to have a way to be disabled, likely a jumper or switch. As the KERNAL matures we can come up with a clean method for KERNAL extensions to load without conflicting with each other.

 

Also, since in theory you could have more than one such card (one that auto injects code), such a card would probably need something that adjusts how long it waits after a reset before attempting DMA, so that you don’t get two cards trying on the same cycle. In this case, if the delay (priority) is different, the second card will detect that the _ML line is low and will not attempt DMA until it is clear. So the first card would be able to complete its injection before the second card would be allowed to operate.
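
The staggered delay idea can be sketched as a small cycle-by-cycle simulation. This is a toy Python model with assumed names and numbers: `injection_order`, the card names, the delays and the transfer length are all hypothetical, and each card is assumed to hold _ML low for the same number of cycles.

```python
from typing import Dict, List, Tuple


def injection_order(delays: Dict[str, int],
                    transfer_cycles: int) -> List[Tuple[int, List[str]]]:
    """Simulate cards that wait delays[name] cycles after reset, then start
    DMA on the first cycle where _ML is high.

    Returns (start_cycle, [card names]) entries; more than one name in an
    entry means those cards collided.
    """
    waiting = dict(delays)
    busy_until = 0                 # cycle at which the current owner releases _ML
    starts = []
    cycle = 0
    while waiting:
        if cycle >= busy_until:    # _ML is high on this cycle
            ready = sorted(n for n, d in waiting.items() if d <= cycle)
            if ready:
                starts.append((cycle, ready))
                for name in ready:
                    del waiting[name]
                busy_until = cycle + transfer_cycles
        cycle += 1
    return starts


print(injection_order({"disk": 10, "net": 20}, transfer_cycles=50))
print(injection_order({"disk": 10, "net": 10}, transfer_cycles=50))
```

With different delays the second card starts only after the first has released _ML; with equal delays both cards start on the same cycle, which is the case the delay setting is meant to avoid.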

 

Now I’m not saying we are designing such a card, I’m just throwing out a use case for the sake of discussion, because it’s in use cases that we can figure out if it’s even practical. Because if we are honest, loading a few files off the SD card to do the same thing is not a deal breaker. The drawback of loading the patches off the SD card is that it would require the appropriate SD card with the files to be inserted. By using DMA code injection the expansion card becomes bootable on its own. You could also NetBoot from a network card.

 

 

Before it gets brought up again, the expansion slots are not meant to be used as cartridges, as anyone can see using DMA is not exactly simple. Software should be distributed on full sized SD cards. Think of this as a really high capacity floppy drive. A disk controller card would be the equivalent to installing a hard drive on your Amiga and the SD card like the floppy. It essentially becomes an install media at that point instead of the everyday use drive.

 

On 7/9/2020 at 10:53 AM, Lorin Millsap said:

As to accessing IO space there are times when you would want DMA to alter IO

There is a small point here that I have not seen discussed yet.  Expansion card DMA can address only zero wait-state targets.  There is no way to extend a DMA cycle because RDY is already negated to halt the 65C02.

This doesn't seem like a big restriction.  Assuming it is still true that nothing on the system board adds wait states with RDY, the only concern is DMA to slow expansion cards.  That's very niche.

There is a small point here that I have not seen discussed yet.  Expansion card DMA can address only zero wait-state targets.  There is no way to extend a DMA cycle because RDY is already negated to halt the 65C02.
This doesn't seem like a big restriction.  Assuming it is still true that nothing on the system board adds wait states with RDY, the only concern is DMA to slow expansion cards.  That's very niche.

That’s a valid point and I don’t think we have a way to handle that. But I agree it’s a very niche circumstance. The main solution I guess is that DMA cards don’t do that for the most part.

A Z80 card would be a case where an expansion is more likely to access more IO. But Z80 has much longer access cycles anyway, which means unless the Z80 is running at really high speeds, like 20 MHz it’s probably not going to outpace most cards.

Most other DMA cards probably won’t be doing that kind of stuff unless you specifically ask them to.


1 hour ago, Lorin Millsap said:

But Z80 has much longer access cycles anyway, which means unless the Z80 is running at really high speeds, like 20 MHz it’s probably not going to outpace most cards

None of this matters.  The bus cycles are locked to your 8MHz system clock and DMA controllers are required to match the 65C02 protocol.  The internals of the DMA card have nothing to do with what a bus cycle looks like.

None of this matters.  The bus cycles are locked to your 8MHz system clock and DMA controllers are required to match the 65C02 protocol.  The internals of the DMA card have nothing to do with what a bus cycle looks like.

Actually that doesn’t matter. Yes, they are locked to a degree, but once a DMA access is started the system clock is not entirely relevant. The initial setup and the final release matter of course, but the actual access doesn’t, so long as it doesn’t violate component timing. So the fact that a Z80 takes multiple cycles to complete an access is not necessarily a problem. Granted, we haven’t tested it. But it should still work. However, I will agree that problems will be much less likely if they run on some type of synchronization.


1 hour ago, Lorin Millsap said:

The initial setup and the final release matter of course, but the actual access doesn’t, so long as it doesn’t violate component timing.

Well that's the real trick, isn't it?  I think we are mostly saying the same thing but I will elaborate anyway.

The 6502 bus is fully synchronous and there is no concept of an idle cycle.  Every cycle does something.  There are four "regular" cycles: read, write, read but discard the result, and wait.  Read/discard serves as an idle cycle from the 6502's point of view, but the addressed target has no idea of this.

So while a DMA controller can take as many bus clock cycles as it wants to do its business, every signal needs to follow legal 6502 timing on every cycle.  Whatever target happens to be addressed will do as it is told on every cycle.  Mostly this means address and RWB need to be set to safe values with correct timing on every bus clock.  So a hypothetical Z80 doesn't have to run synchronously with the system bus, but some logic between the Z80 and the system bus definitely does.
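
One way to picture that synchronizing logic is an adapter that emits exactly one cycle per bus clock: the slow master's request when one is ready, and a harmless dummy read otherwise. This is a Python sketch of the idea only; `bus_cycles`, `SAFE_ADDRESS` and the example addresses are hypothetical, and which address is actually safe to read depends on the system.

```python
from typing import Dict, List, Optional, Tuple

# (address, rwb, data): rwb=1 is a read, rwb=0 is a write.
Cycle = Tuple[int, int, Optional[int]]

SAFE_ADDRESS = 0x0000  # assumption: reading here has no side effects


def bus_cycles(ready_at: Dict[int, Cycle], n_clocks: int) -> List[Cycle]:
    """Produce one cycle for every bus clock while the card owns the bus.

    ready_at maps a bus clock number to the access the slow master has
    ready on that clock. On every other clock a dummy read of SAFE_ADDRESS
    is issued, so address and RWB always hold safe values.
    """
    idle: Cycle = (SAFE_ADDRESS, 1, None)
    return [ready_at.get(clk, idle) for clk in range(n_clocks)]


# A slow master that only has a write ready on the third bus clock.
for addr, rwb, data in bus_cycles({2: (0x4000, 0, 0x42)}, n_clocks=4):
    kind = "read " if rwb else "write"
    print(f"{kind} ${addr:04X} data={data}")
```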
