  • 0
DoubleA

Direct vMem manipulation?

Question

Wouldn't it be nice/helpful to directly manipulate vMem (copy/move/clear) with the help of a small but fast custom VERA routine, without burdening the CPU too much?

One could even call it a blitter :-).


11 answers to this question


  • 0
Posted (edited)
27 minutes ago, StephenHorn said:

DMA (direct memory access). You're asking for DMA.

Maybe, but when I think of "Blitter", I think of a system that's purpose-designed to move video pixels; it might use DMA to do that, but it's not just DMA. Aside from basic memory copies, the minimum a blitter needs is the ability to repeat an operation over multiple rows, so the simple add-copy-compare operation needs to be wrapped in an additional add-compare-add? step.

I can picture a series of gates in my head that will do that in one clock tick, but there's probably something I'm missing.
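To make the "copy wrapped in a row loop" idea concrete, here is a rough software sketch in C. It is only an illustration of the nesting, not anything VERA actually provides: `blit_rect`, its parameters, and the flat byte buffer standing in for video memory are all made up for this example, and overlapping source and destination rectangles are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rectangular blit over a flat byte buffer that stands in for
   video memory. The name and signature are assumptions for illustration. */
void blit_rect(uint8_t *mem, size_t src, size_t dst,
               size_t width, size_t height, size_t stride)
{
    assert(width <= stride);  /* a row of the rectangle must fit in one line */

    for (size_t row = 0; row < height; row++) {     /* outer step: once per row */
        for (size_t col = 0; col < width; col++) {  /* inner add-copy-compare */
            mem[dst + col] = mem[src + col];
        }
        src += stride;  /* advance both pointers to the start of the next row */
        dst += stride;
    }
}
```

A plain memory copy is the special case `height == 1`; the outer loop is the only extra piece a rectangle needs.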

All this talk of VERA and her capabilities has made me really want to dig into FPGA design. I remember being really fascinated with the logic gate simulator in my digital design class, and FPGAs seem like a natural extension of that.

Edited by TomXP411
  • 0
40 minutes ago, TomXP411 said:

I can picture a series of gates in my head that will do that in one clock tick

That's the basic idea, one tick is as fast as it gets. 

The other "Blitter" I had in mind was somewhat crippled, because it only had access to slow (chip) RAM. But with the X16, it's the other way round ...

And there's DMA and then the other DMA. This one would be "internal". Could be triggered "immediate" or "IRQ-controlled" or something else, as long as it stays strictly VERA internal.

 

  • 0
18 minutes ago, DoubleA said:

And there's DMA and then the other DMA. This one would be "internal". Could be triggered "immediate" or "IRQ-controlled" or something else, as long as it stays strictly VERA internal.

The idea I get from your description is a sort of programmable action performed inside the VERA chip, kind of like shaders on a GPU?

  • 0
45 minutes ago, VincentF said:

The idea I get from your description is a sort of programmable action performed inside the VERA chip, kind of like shaders on a GPU?

Kinda 🙂 (but much simpler).

  • 0
15 hours ago, StephenHorn said:

DMA (direct memory access). You're asking for DMA.

DMA is available to hardware connected to the expansion slots, right? Does that hardware have direct access to video RAM, and if so, could that be done without interrupting the CPU?

  • 0
6 hours ago, lamb-duh said:

could that be done without interrupting the CPU?

What I suggested could be done without the CPU or "external" DMA. It would be the same mechanism as initialising sprites, but for "blits". Strictly VERA internal.

BTW: How does VERA handle HSCROLL and VSCROLL? Does it just change a pointer, or does it move vMem content?

 

  • 0
Posted (edited)
3 hours ago, DoubleA said:

What I suggested could be done without the CPU or "external" DMA. It would be the same mechanism as initialising sprites, but for "blits". Strictly VERA internal.

BTW: How does VERA handle HSCROLL and VSCROLL? Does it just change a pointer, or does it move vMem content?

 

As far as I'm aware, it does not move VRAM content; that has never been a part of the emulator's implementation of HSCROLL and VSCROLL.

If I were to guess, it shifts HSCROLL and VSCROLL left by 7 bits and copies them into internal 16-bit accumulators (representing 9.7 fixed-point numbers). Then, as it draws, it increments the accumulators by HSCALE for each column and VSCALE for each row (both 1.7 fixed-point values), and derives the "real X, Y" by reading the top 9 bits of the accumulators (the truncated integer portion).
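For anyone who wants to play with that guess, here is a small C sketch of the horizontal half of it. To be clear, this models the guess described above and not confirmed VERA internals; `source_x` and `FRAC_BITS` are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define FRAC_BITS 7  /* fractional bits of the guessed 9.7 accumulator */

/* Source X coordinate sampled for a given output column, under the guessed
   scheme: start from HSCROLL as a 9.7 value, add HSCALE (1.7) per column,
   then keep only the top 9 bits. */
uint16_t source_x(uint16_t hscroll, uint8_t hscale, unsigned column)
{
    assert(hscroll < 512);  /* only 9 integer bits are available */

    uint16_t acc = (uint16_t)(hscroll << FRAC_BITS);
    for (unsigned i = 0; i < column; i++) {
        acc = (uint16_t)(acc + hscale);  /* wraps modulo 2^16, like 16-bit hardware */
    }
    return (uint16_t)(acc >> FRAC_BITS);  /* truncated integer portion */
}
```

With `hscale` at 128 (1.0 in 1.7 fixed point) every output column advances one source pixel; at 64 (0.5) each source pixel is sampled twice, which doubles the image width. Nothing in video memory moves in either case, only the starting value and step of the accumulator change.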

Edit: Actually, this may be more than a guess, I seem to recall Frank has previously discussed this on Facebook, on a comment thread where someone wanted to change the basis value for scaling from 128 to 240, so as to provide finer-grained scaling, and suggested a relatively fast division-by-5 hardware design to try and make it possible. This seems to have been ultimately rejected. In fairness, I believe the HSCALE and VSCALE were originally only meant to provide power-of-2 scaling, and the VERA had already grown somewhat beyond its original design by taking on features like the PSG, so LUTs and Frank's own time may be at an extreme premium at this point.

Edited by StephenHorn

  • 0

Also remember that what finally killed the Gameduino V1.0 modification approach was contention between the 65C02 and the J1 coprocessor. A DMA blitter on an expansion card that reads via Port A and then writes back the modified data via Port B doesn't have that problem because it would be bus mastering ... when it controls the bus, it leaves the 65Cxx address and data lines in a high impedance tri-state and when it surrenders the bus, the 65Cxx is entirely in control.

So not only is it highly unlikely that they WANT to make the classic 32-bit+ system move of adding a Blitter, but there are also reasons to expect it might introduce problems in getting the system up and running.

  • 0

The future will tell how much of a constraint it is to have no direct CPU access to video memory on the one hand, and the requirement to do everything in VRAM with the CPU on the other. Copying larger areas of VRAM will be painful (as it is a CPU task in any case), but especially so with the VERA implementation, since only certain structured copies work efficiently with the auto-increment. If you need to copy blocks of memory in an X/Y fashion, it will require frequent writes to the VERA ADDR registers, and that will slow down the process. So I assume tile mode is the way to go for anything that needs frequent movement or copying, as it can be done with a very small amount of data movement from CPU to VRAM.
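A toy model in C shows where the extra cost of an X/Y copy comes from. This is a simplified stand-in and not the real VERA interface: `ToyPort`, `port_set_addr`, and `port_write` are hypothetical, the VRAM size is arbitrary, and a single `port_set_addr` call represents what on real hardware would be writes to several ADDR registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRAM_SIZE 4096  /* toy size, not VERA's real capacity */

/* Hypothetical model of a data port with auto-increment. */
typedef struct {
    uint8_t  vram[VRAM_SIZE];
    size_t   addr;         /* current address behind the data port */
    size_t   step;         /* auto-increment applied after each data write */
    unsigned addr_writes;  /* how often the address had to be set up */
    unsigned data_writes;  /* how many bytes went through the data port */
} ToyPort;

void port_set_addr(ToyPort *p, size_t addr, size_t step)
{
    assert(addr < VRAM_SIZE);
    p->addr = addr;
    p->step = step;
    p->addr_writes++;
}

void port_write(ToyPort *p, uint8_t value)
{
    p->vram[p->addr % VRAM_SIZE] = value;
    p->addr = (p->addr + p->step) % VRAM_SIZE;
    p->data_writes++;
}

/* Linear copy: one address setup, then the auto-increment does the rest. */
void copy_linear(ToyPort *p, size_t dst, const uint8_t *src, size_t n)
{
    port_set_addr(p, dst, 1);
    for (size_t i = 0; i < n; i++) {
        port_write(p, src[i]);
    }
}

/* X/Y copy: the address must be set up again at the start of every row. */
void copy_rect(ToyPort *p, size_t dst, size_t pitch,
               const uint8_t *src, size_t w, size_t h)
{
    for (size_t row = 0; row < h; row++) {
        port_set_addr(p, dst + row * pitch, 1);
        for (size_t col = 0; col < w; col++) {
            port_write(p, src[row * w + col]);
        }
    }
}
```

Both routines push the same number of data bytes, but `copy_rect` pays one address setup per row where `copy_linear` pays one in total, so narrow, tall blocks are the worst case.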

  • 0

Exactly, that is the design goal. They already had a perfectly capable stack machine co-processor in the Gameduino, and not having a coprocessor in the "Video Chip" seemed like part of the design criteria when the call went out for fresh designs on more recent, non-obsoleted FPGAs.

Ingenuity is the key. You can have a tile based starship command deck, with "display portals" that are animated bitmaps done by reserving a set of font bitmaps for the bitmap graphics. The layout on the screen does not determine how easy it is to update them with the auto-increment ... the layout of the chosen tiles in the font bitmap does.
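As a sketch of that layout point in C: if the tiles reserved for a portal are numbered consecutively, their bitmap data forms one contiguous run, whatever the portal's position on screen. The helper names are made up, and 8x8 tiles at 1 bit per pixel (8 bytes per tile) are an assumption chosen to keep the arithmetic simple.

```c
#include <assert.h>
#include <stddef.h>

#define BYTES_PER_TILE 8  /* assumption: 8x8 tiles at 1 bit per pixel */

/* Tile index to place in the tile map at cell (col, row) of the portal,
   when the portal's tiles are reserved as one consecutive range. */
unsigned portal_tile_index(unsigned first_tile, unsigned cols,
                           unsigned col, unsigned row)
{
    assert(col < cols);
    return first_tile + row * cols + col;
}

/* Byte offset of a tile's bitmap within the tile data. */
size_t tile_data_offset(unsigned tile_index)
{
    return (size_t)tile_index * BYTES_PER_TILE;
}

/* Size of the contiguous run of tile data that backs the whole portal. */
size_t portal_data_bytes(unsigned cols, unsigned rows)
{
    return (size_t)cols * rows * BYTES_PER_TILE;
}
```

The tile map is written once with these indices. After that, animating the portal means streaming `portal_data_bytes(cols, rows)` bytes starting at `tile_data_offset(first_tile)`, which is a single linear run that suits the auto-increment.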
