
Re: Proposal for a hardware-agnostic math accel API

Posted: Sat Feb 24, 2024 1:18 am
by yock1960
A math coprocessor...😲🤤

Re: Proposal for a hardware-agnostic math accel API

Posted: Sat Feb 24, 2024 1:26 am
by ahenry3068
yock1960 wrote: ↑Sat Feb 24, 2024 1:18 am A math coprocessor...😲🤤
We have VERA-FX now.

Re: Proposal for a hardware-agnostic math accel API

Posted: Sat Feb 24, 2024 1:38 am
by m00dawg
Of course! And it's great! Though it's limited to certain math functions (I think basically just multiply?). Of note, I'm not trying to compete with FX since it's built in and nails the cases that matter for video, obviously. VERA is also, appropriately, heavily scrutinized, since every X16 gets one and putting in sketchy code would be bad for every single user. It's also a (very) limited resource (FX already eats up quite a few LUTs, making certain things, like the XOR addition for the triangle wave, harder to justify).

Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed. My end game is to see if I can make SIMD solutions using Verilog vs just writing code on the core. That would truly allow the S in SIMD at that point.

FX has a specific goal. I was more looking at a general math API / APU sort of approach. The main reason was having something to work on that wasn't just hello world, as a means to help me learn both FPGA stuff and RISCV assembly. Those are the main goals. I think there are some cases where the APU could be useful. Enough to put a card in an X16? Perhaps not, but I think it's fun to work on just the same.
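To make that concrete, here's a rough sketch (in C) of what a call into such an APU might look like from the X16 side over MMIO. The base address, register offsets, and opcode are all placeholders I made up for illustration (I'm just borrowing one of the expansion I/O windows), not anything that exists yet:

    /* Hypothetical MMIO register map for the APU. The $9F60 base and all
       offsets/opcodes below are placeholders, purely to illustrate the idea. */
    #define APU_BASE   ((volatile unsigned char *)0x9F60)
    #define APU_OP     (APU_BASE[0])  /* write an opcode to start, e.g. 0x01 = 16x16 multiply */
    #define APU_A_LO   (APU_BASE[1])  /* operand A, low byte  */
    #define APU_A_HI   (APU_BASE[2])  /* operand A, high byte */
    #define APU_B_LO   (APU_BASE[3])  /* operand B, low byte  */
    #define APU_B_HI   (APU_BASE[4])  /* operand B, high byte */
    #define APU_STATUS (APU_BASE[5])  /* bit 0 = busy         */
    #define APU_R_LO   (APU_BASE[6])  /* result, low byte     */
    #define APU_R_HI   (APU_BASE[7])  /* result, high byte    */

    /* 16x16 -> 16-bit multiply routed through the card. */
    unsigned mul16(unsigned a, unsigned b)
    {
        APU_A_LO = a & 0xFF;  APU_A_HI = a >> 8;
        APU_B_LO = b & 0xFF;  APU_B_HI = b >> 8;
        APU_OP   = 0x01;                 /* kick off the operation     */
        while (APU_STATUS & 0x01) { }    /* spin until the APU is done */
        return APU_R_LO | ((unsigned)APU_R_HI << 8);
    }

The point being the 6502 side only ever sees a handful of registers; whether a RISCV core or hand-written Verilog does the work behind them wouldn't matter to the software.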

Anyways I took your comment as confrontational but that may not have been your intent. Either way though, I'mna keep at it as there's value for me in the journey as a minimum. As a maximum, it could bring about some interesting thoughts and ideas for future hardware or software.

Re: Proposal for a hardware-agnostic math accel API

Posted: Sat Feb 24, 2024 1:48 am
by ahenry3068
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am Anyways I took your comment as confrontational but that may not have been your intent. Either way though, I'mna keep at it as there's value for me in the journey as a minimum. As a maximum, it could bring about some interesting thoughts and ideas for future hardware or software.
It certainly wasn't my intent.

Re: Proposal for a hardware-agnostic math accel API

Posted: Sat Feb 24, 2024 1:49 am
by ahenry3068
I'm all for whatever capabilities we can bring to the system. I was just saying we kind of have a limited math co-processor right now.

Re: Proposal for a hardware-agnostic math accel API

Posted: Tue Feb 27, 2024 4:15 am
by m00dawg
My apologies, I didn't get the notification that you replied, doh.

We do, and honestly I'm not sure how much more useful this may end up being. Even with SIMD, there's still the time it takes to load the registers and move things around, though if I don't do hidden registers (striding) then the data doesn't have to move around as much, depending on the use case. Division is a big one that I think might be useful. It really depends on how fast the RISCV or discrete FPGA logic arrangements end up being.
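To put a rough number on that overhead, using an imaginary byte-wide register window: a single 16-bit operation is something like 4 operand writes + 1 opcode write + 1 status read + 2 result reads = 8 bus accesses minimum, before the card has done any actual math. Those counts are made up for a hypothetical layout, but they show where the per-operation cost comes from, and why batched/strided work or expensive operations like division are where a card would most clearly pay off.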

As a general update, the Upduino, the bus transceivers, and Kevin (Texelec)'s prototype expansion cards came in (plus some pin headers and random accessories I'll need). I likely won't have a ton of time to work on it for a bit, but I did program the FPGA with the FemtoRV core and mess around with demo LED programs.

Once I get it on the expansion card, I think the next test is to light up an LED when writing to the configured MMIO address and build it up from there. FemtoRV includes some serial code, so I can output debugging over USB; that's my plan for figuring out what's going on inside. That'll just take quite a bit more glue than the example program I made has.
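For that first test, the X16 side can be as dumb as a single store into the card's window; in C that's just (with $9F60 standing in for whichever expansion slot the card ends up decoding):

    /* Minimal smoke test: one write to the card's (placeholder) MMIO window.
       The FPGA side only has to latch a write at this address and drive the LED. */
    #define CARD_REG (*(volatile unsigned char *)0x9F60)

    void led_poke(void)
    {
        CARD_REG = 0x01;   /* any value works; we only care that the write gets decoded */
    }

Or equivalently a single POKE to that address from the BASIC prompt.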

Re: Proposal for a hardware-agnostic math accel API

Posted: Tue Feb 27, 2024 5:46 am
by Ser Olmy
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed.
I actually think it would be super-useful on any platform.

You seem to be proposing a standard similar to the Tube interface on the Acorn BBC 8-bit micros. I'd be all for it.

Re: Proposal for a hardware-agnostic math accel API

Posted: Tue Feb 27, 2024 4:53 pm
by m00dawg
Ah, I was unaware of the Tube, but in its final form, essentially yep, it would be similar. The X16 bus includes RDY, which allows for suspending the main CPU as well. That can make for interesting solutions there. That's a ways out, though; the APU idea can be a workable stepping stone. I'll be sharing how I wired everything up, along with the schematics and such, if I end up making a purpose-built card, so others could use this or a similar design to make FPGA-based solutions for different things as well. The X16 bus seems like it can be a little particular, so it'll be interesting to see how things go once I start wiring things to the bus directly.

Re: Proposal for a hardware-agnostic math accel API

Posted: Thu Mar 07, 2024 12:00 am
by yock1960
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am
Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed. My end game is to see if I can make SIMD solutions using Verilog vs just writing code on the core. That would truly allow the S in SIMD at that point.
Well, yes floating point on the X16 is pointless....but I still do it and I'll take any speed improvement I can get! :D

Re: Proposal for a hardware-agnostic math accel API

Posted: Fri Mar 08, 2024 2:32 pm
by m00dawg
yock1960 wrote: ↑Thu Mar 07, 2024 12:00 am
m00dawg wrote: ↑Sat Feb 24, 2024 1:38 am
Just looking at RISCV soft-core (so not considering optimized constructs on the FPGA directly), I can do lots of the maths, even floating-point if I wanted (not sure that'd be super useful on X16 though) and can simulate SIMD instructions given the higher internal clock speed. My end game is to see if I can make SIMD solutions using Verilog vs just writing code on the core. That would truly allow the S in SIMD at that point.
Well, yes floating point on the X16 is pointless....but I still do it and I'll take any speed improvement I can get! :D
That would be easy enough to do, I think, though it would need a larger FPGA. FemtoRV's largest core (petitbateau, which implements RV32IMFC) supports single precision. I have no idea how fast it would be, but surely faster than BASIC. Alas, I haven't found an open source core that supports the vector extensions yet, at least not one that isn't fairly specialized. I read up on the RISCV vector extension while on vacation (weird thing to read on vacation maybe, but I enjoyed it :P) and it has some clever design principles that would work well with what I was trying to do with the SIMD-style API calls.

Their approach limits how many instructions need to be added to the ISA to accommodate larger or smaller vectors. These details wouldn't be visible on the X16 side, so using the vector extension might not be strictly necessary, but I nonetheless found it interesting, and it may change how I implement SIMD in the API.
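The part that stuck with me is the vsetvl-style strip-mining pattern: software asks for N elements, the hardware answers with how many it will actually handle this pass, and the caller just loops. Translated to an MMIO-style API, a sketch might look like the following; everything here (base address, ports, opcode) is hypothetical, purely to show the shape of it:

    /* Vector-length-agnostic call pattern, borrowing the idea behind RVV's vsetvl.
       The register map is made up for illustration; none of this is implemented. */
    #define APU_BASE ((volatile unsigned char *)0x9F60)
    #define APU_VLEN (APU_BASE[0])  /* write: elements requested; read: elements granted this pass */
    #define APU_DATA (APU_BASE[1])  /* streaming data port (auto-increments internally)            */
    #define APU_OP   (APU_BASE[2])  /* e.g. 0x10 = elementwise 8-bit add                           */

    /* dst[i] = a[i] + b[i] for n elements, in whatever chunk size the card grants. */
    void vec_add8(unsigned char *dst, const unsigned char *a,
                  const unsigned char *b, unsigned char n)
    {
        unsigned char i, chunk;
        while (n) {
            APU_VLEN = n;             /* ask for everything that's left          */
            chunk = APU_VLEN;         /* card replies with what it will process  */
                                      /* (a real card must grant at least 1)     */
            for (i = 0; i < chunk; ++i) APU_DATA = *a++;   /* stream operand A   */
            for (i = 0; i < chunk; ++i) APU_DATA = *b++;   /* stream operand B   */
            APU_OP = 0x10;                                 /* run the add        */
            for (i = 0; i < chunk; ++i) *dst++ = APU_DATA; /* read results back  */
            n -= chunk;
        }
    }

The nice property is exactly the one the vector extension is going for: the caller doesn't care whether the card chews through 8, 16, or 64 elements per pass, so a bigger FPGA just makes the same software faster.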