X8 memory layout


pzembrod
@The 8-Bit Guy @Frank van den Hoef

Hi David, hi Frank, (taking this topic off the "Change of product direction, good and bad news!" thread as @TomXP411 rightly suggested)

it's cc64's author here. Can I ask a question about the X8's RAM layout? Where do the 64K live? Is there RAM underlying the ROM, C64-style? Is the 24K of RAM that isn't needed for $0000-$9EFF mapped in as 3 RAM banks at $A000-$BFFF? Or is there just a regular 8K mapped in at $A000-$BFFF, as in: just 1 bank?

I'm asking because the base RAM from $0800-$9EFF turns out to be the limiting factor for further development of cc64, and I'm thinking about how to use banked RAM for code to overcome that limit; if banked RAM isn't available, then that would cap cc64 development on the X8. And even now, just compiling cc64 on the X16 needs one bank at $A000 to be present.

Cheers
/Philip


Yes, I have a similar issue for xForth ... as a best case, if it is simply three 8K banks, one of them used by the system, that scratches one of the uses of the CX16 banks I have sketched out (where I am waiting for the official release of the emulator to be updated to the Proto#3 bank registers), but leaves the other two live.


I'm not in any fashion related to the dev team, so take with a grain of salt, but going by the Verilog design for X8 published last December:

The 64K of RAM lives at address $0000-$FFFF, with a few exceptions. RAM banking only applies to VRAM which has 256 pages of 256 bytes mapped in at $0400. Sprite, PSG, and Palette RAM live in the $0500 page; and $0600-$0617 (24 bytes) are used for general hardware control and IO (e.g. VRAM bank select is at $0605). The only "ROM" is a "Boot ROM" overlay at the top two pages ($fe00-$ffff) and can be disabled / turned into normal RAM by, I think, clearing bit 3 of $0602 (can't quite tell which due to the lateness of the hour, but it looks like the logic is inverted in the design where it is labeled "bootrom_dis", so I would imagine that bit is functionally "bootrom overlay enabled").

$0000-$03ff and $0618-$fdff are always RAM. Writes to $fe00-$ffff always go to RAM, but reads from that range depend on the value of the bootrom overlay bit. Presumably the bootrom's job is to fetch the OS kernel from SPI and write it into RAM; it should then disable the bootrom overlay and execute entirely out of RAM.
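To make the layout above easier to check against, here is a minimal Python sketch of it as a lookup table. The ranges come straight from the unofficial description in this post; the region names and the `region_of` helper are made up for illustration and are not official X8 terms.

```python
# Hypothetical region table built from the unofficial description above.
# Ranges are inclusive; the names are illustrative only.
REGIONS = [
    (0x0000, 0x03FF, "RAM"),
    (0x0400, 0x04FF, "VRAM window"),            # 256-byte page, bank chosen via $0605
    (0x0500, 0x05FF, "sprite/PSG/palette RAM"),
    (0x0600, 0x0617, "hardware control and IO"),
    (0x0618, 0xFDFF, "RAM"),
    (0xFE00, 0xFFFF, "RAM with boot ROM read overlay"),
]


def region_of(addr):
    """Return the name of the region a 16-bit CPU address falls into."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address out of 16-bit range")
    for first, last, name in REGIONS:
        if first <= addr <= last:
            return name
    raise AssertionError("region table has a gap")


def size_of(name):
    """Total number of bytes covered by all regions with the given name."""
    return sum(last - first + 1 for first, last, n in REGIONS if n == name)
```

For example, `region_of(0x0605)` lands in the IO block, and `size_of("hardware control and IO")` gives the 24 bytes mentioned above.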


So it's like a 6502 version of a classical 64K CP/M 2.2 system.

So the Kernel will be in RAM at the top (don't overwrite it if you want to use it), with Basic below it. The Basic program and variable space runs from the Basic load point up to the bottom of the Basic interpreter code. If you have a program that doesn't use Basic, use the RAM holding the Basic interpreter if you want. Calling the system cold start, or cycling power, reloads Basic again.

That also answers the question of how much simulated ROM: 512 bytes. 

That all sounds like fitting everything into the same logic resource as Vera has available plus the existing 128kB internal RAM. So it sounds like it's the thing.

Edited by BruceMcF

1 hour ago, Wavicle said:

I'm not in any fashion related to the dev team, so take with a grain of salt, but going by the Verilog design for X8 published last December:

The 64K of RAM lives at address $0000-$FFFF, with a few exceptions. RAM banking only applies to VRAM which has 256 pages of 256 bytes mapped in at $0400. Sprite, PSG, and Palette RAM live in the $0500 page; and $0600-$0617 (24 bytes) are used for general hardware control and IO (e.g. VRAM bank select is at $0605). The only "ROM" is a "Boot ROM" overlay at the top two pages ($fe00-$ffff) and can be disabled / turned into normal RAM by, I think, clearing bit 3 of $0602 (can't quite tell which due to the lateness of the hour, but it looks like the logic is inverted in the design where it is labeled "bootrom_dis", so I would imagine that bit is functionally "bootrom overlay enabled").

$0000-$03ff and $0618-$fdff are always RAM. Writes to $fe00-$ffff always go to RAM, but reads from that range depend on the value of the bootrom overlay bit. Presumably the bootrom's job is to fetch the OS kernel from SPI and write it into RAM; it should then disable the bootrom overlay and execute entirely out of RAM.

Thank you for this, but I'm still baffled why we couldn't have gotten this on the first day.

But anyway, $0000 to $00ff is of course zero page, $0100-$01ff is stack, $0200-$03ff is going to be basic/kernal variables, the rest as you mentioned.  I'm gonna guess basic program storage still starts at $0800, leaving $0618-$07ff as a free/ML area, so $0800 to $9fff (38912 bytes), unless that 512 bytes of Boot ROM pushes the Kernal and BASIC down to $9e00, but even so, that's 38,400 bytes for BASIC, which is what you'd expect.  Not using BASIC would then give you a contiguous block $0618 - $bdff, 47080 bytes.
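The byte counts in this guess are easy to double-check. A quick Python sketch follows; the `span` helper is just an illustrative name, and the address ranges are the speculative ones from the post, not confirmed X8 facts.

```python
def span(first, last):
    """Number of bytes in the inclusive address range first..last."""
    return last - first + 1


print(span(0x0800, 0x9FFF))  # 38912: BASIC RAM if it runs from $0800 to $9fff
print(span(0x0800, 0x9DFF))  # 38400: BASIC RAM if BASIC is pushed down to $9e00
print(span(0x0618, 0xBDFF))  # 47080: contiguous block when BASIC is not used
print(span(0x0618, 0x07FF))  # 488: the free/ML area below $0800
```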


People will be working out how far down the Kernel functions they actually use reside, to squeeze out some more space.

Since the bootrom is a read-only overlay, it is likely it loads the basic/kernel code all the way to $FFFF before it turns itself off. Just make sure that the instruction waiting to be executed when the bootloader turns off is the call to start Basic.

Edited by BruceMcF

10 minutes ago, x16tial said:

Thank you for this, but I'm still baffled why we couldn't have gotten this on the first day.

I feel you may want to dial it down a bit. Give them a break!

10 minutes ago, x16tial said:

But anyway, $0000 to $00ff is of course zero page, $0100-$01ff is stack, $0200-$03ff is going to be basic/kernal variables, the rest as you mentioned.  I'm gonna guess basic program storage still starts at $0800, leaving $0618-$07ff as a free/ML area, so $0800 to $9fff (38912 bytes), unless that 512 bytes of Boot ROM pushes the Kernal and BASIC down to $9e00, but even so, that's 38,400 bytes for BASIC, which is what you'd expect.  Not using BASIC would then give you a contiguous block $0618 - $bdff, 47080 bytes.

Let's wait for Frank or David to chime in, but if Wavicle is right, then I'd rather expect the BASIC interpreter to live from $8000-$BFFF - unless I'm mistaken the extended X16 BASIC is 16k; at least it lives in a 16k ROM bank.


15 minutes ago, pzembrod said:

Let's wait for Frank or David to chime in, but if Wavicle is right, then I'd rather expect the BASIC interpreter to live from $8000-$BFFF - unless I'm mistaken the extended X16 BASIC is 16k; at least it lives in a 16k ROM bank.

Of course what I said isn't authoritative, and should be confirmed or corrected. 
Edit: yes, if BASIC is 16K, that would put it at $8000, or $7e00, adjusting for the Boot ROM.  Leaving then 30208 bytes for BASIC RAM.

Edited by x16tial

Yes, the Kernel may not have all of the CX16 code, so it may be less than a full 16KB. The original C64 KERNAL was under 7KB, after all, with the first 1KB+ of the "KERNAL ROM" holding spillover code for Basic.

For instance, I don't think there is an IEC port, so all the IEC Kernal code (and a number of IEC specific kernel calls) might be omitted.

Edited by BruceMcF

(Insert same non-authoritative source disclaimer here.)

Expansion options for the X8 are very limited. There are no VIAs in the design. USB seems to be the only general purpose IO interface, but I'm not certain that it is reasonable to use for anything beyond a keyboard, mouse, or joystick. USB host controller software is not an area I have much experience in.

There is one SPI interface and two chip select lines: one for the flash ROM; and one for the SD card. The SPI CS lines can be controlled by the CPU through two bits in one of the IO registers. Setting the register bits to 'b01 asserts the SD select, 'b10 asserts the flash select, 'b00 and 'b11 will deassert both. It should be possible to attach an external SPI GPIO expander (e.g. MAX7301AAX) by ORing the two CS lines together and using that as the GPIO expander's CS wire -- provided the SPI and CS wires are accessible.
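As a rough illustration of the chip-select behaviour just described, here is a small Python sketch. It reads the two values as binary 01 (SD) and binary 10 (flash); the function name and the dictionary keys are invented for the example, and nothing here comes from official X8 documentation.

```python
def spi_selects(bits):
    """Decode the two chip-select register bits as described in the post.

    Returns which device is asserted (True) for a given 2-bit value.
    Binary 00 and 11 leave both devices deselected.
    """
    bits &= 0b11  # only the two CS bits matter
    return {
        "sd": bits == 0b01,
        "flash": bits == 0b10,
    }


for value in range(4):
    print(format(value, "02b"), spi_selects(value))
```

At most one device is ever selected, which is why an extra device needs either a repurposed code or an extra pin.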

Ideally, we'd have two available pins and could either bit-bang I2C or use the FPGA's built-in I2C hard IP. Some possibilities for pins to repurpose:

  • There is one possibly unused pin; it is not routed to anything internally and cannot be used by software.
  • One pin drives an LED and is internally connected to the SPI busy signal, it effectively acts as a disk activity light.
  • One pin is connected to a button and appears to be used as a "soft reset" which resets the designed components in FPGA but not the FPGA itself (i.e. the FPGA doesn't have to bootstrap its configuration from flash after pressing this button).
  • Two pins are dedicated to a debug serial interface which is internally connected to a UART driver and accessible to software via IO registers. This seems like an attractive candidate for dual-purposing as both a debug UART and I2C, mode selectable with a bit in an IO config register.

 


11 hours ago, x16tial said:

unless that 512 bytes of Boot ROM pushes the Kernal and BASIC down to $9e00

The boot ROM "disappears" at the flip of a bit in the IO configuration register. Writes to the boot ROM area are written to RAM regardless of the configuration. I suspect that the boot ROM only "exists" for long enough to copy the kernel from flash into RAM. It should not displace anything.
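A toy Python model of that overlay behaviour, for anyone who wants to play with the idea. The class name, the `overlay_enabled` flag, and the assumption that the overlay is on at power-up are all illustrative; the real control bit and its polarity are not confirmed here.

```python
BASE = 0xFE00   # first address covered by the boot ROM overlay
SIZE = 0x0200   # two pages, 512 bytes


class TopPages:
    """Toy model of $fe00-$ffff: writes always reach RAM, reads depend on the overlay."""

    def __init__(self, boot_rom):
        if len(boot_rom) != SIZE:
            raise ValueError("boot ROM image must be exactly 512 bytes")
        self.rom = bytes(boot_rom)
        self.ram = bytearray(SIZE)
        self.overlay_enabled = True  # assumed power-on state

    def write(self, addr, value):
        # Writes go to the underlying RAM regardless of the overlay bit.
        self.ram[addr - BASE] = value & 0xFF

    def read(self, addr):
        # Reads see the ROM while the overlay is on, the RAM once it is off.
        source = self.rom if self.overlay_enabled else self.ram
        return source[addr - BASE]
```

Usage would look like: write the kernel's reset vector to $fffc while the overlay is still on, then clear `overlay_enabled`, and the vector written earlier becomes visible.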


36 minutes ago, pzembrod said:

@Wavicle, did you see any indication where the DOS for the SDCard might be living? On the X16 it occupies an entire ROM bank, IIUC.

Just realized that this would be tricky with only 64K of memory overall ...

That's a software/firmware implementation detail not reflected in the hardware design. I don't have access to the X8 ROM code as of the end of last year. Even if I did, firmware generally changes more rapidly and far later than hardware, so it'd come with much larger caveats about the reliability of any analysis I do. Because the X8 ROM likely contained Cloanto IP, it was probably not BSD 2-clause licensed which might be why it's much harder to find.

Edited by Wavicle

9 hours ago, Wavicle said:

(Insert same non-authoritative source disclaimer here.)

Expansion options for the X8 are very limited. There are no VIAs in the design. USB seems to be the only general purpose IO interface, but I'm not certain that it is reasonable to use for anything beyond a keyboard, mouse, or joystick. USB host controller software is not an area I have much experience in.

There is one SPI interface and two chip select lines: one for the flash ROM; and one for the SD card. The SPI CS lines can be controlled by the CPU through two bits in one of the IO registers. Setting the register bits to 'b01 asserts the SD select, 'b10 asserts the flash select, 'b00 and 'b11 will deassert both. It should be possible to attach an external SPI GPIO expander (e.g. MAX7301AAX) by ORing the two CS lines together and using that as the GPIO expander's CS wire -- provided the SPI and CS wires are accessible.

Ideally, we'd have two available pins and could either bit-bang I2C or use the FPGA's built-in I2C hard IP. Some possibilities for pins to repurpose:

  • There is one possibly unused pin; it is not routed to anything internally and cannot be used by software.
  • One pin drives an LED and is internally connected to the SPI busy signal, it effectively acts as a disk activity light.
  • One pin is connected to a button and appears to be used as a "soft reset" which resets the designed components in FPGA but not the FPGA itself (i.e. the FPGA doesn't have to bootstrap its configuration from flash after pressing this button).
  • Two pins are dedicated to a debug serial interface which is internally connected to a UART driver and accessible to software via IO registers. This seems like an attractive candidate for dual-purposing as both a debug UART and I2C, mode selectable with a bit in an IO config register.

 

But OR'ing the two lines and putting the result through a free pin uses up a pin.

Instead, implement the lines so that b00 does not de-select both CS lines, but simply sends the state through to the pins. Then an external active-low decoder can generate three CS lines from the 01, 10 and 00 outputs; 11 is not attached, so that is all devices deselected.

If there is no free pin, then use the third CS line to select an I/O expander; that's the effective User port.

If there is a free pin, there's an option for a four "slot" external SPI bus through a block pin header.

The decoder would select a serial latch. You write a byte to the serial latch. It could even be a serial latch with a carry, so you get back the previous setting of the serial latch on the MISO line.

Then the free pin selects the output enable of the serial latch, so that you write xxxx1110, xxxx1101, xxxx1011, or xxxx0111 to select one of four SPI devices that can be placed on the SPI bus.

Put the SPI bus and the select latch outputs on a block pin header with power and ground, with the select lines pulled high by a resistor block, and there's your hat connector for SPI based expansion boards.

EDIT: After looking more closely at the MAX7301 datasheet, if there is a free pin, one might still use a MAX7301 20pin or 28pin GPIO I/O expander and use the free pin for the interrupt line, which can be set up for a subset of the ports when they are in transition detect mode.

9 hours ago, pzembrod said:

@Wavicle, did you see any indication where the DOS for the SDCard might be living? On the X16 it occupies an entire ROM bank, IIUC.

Just realized that this would be tricky with only 64K of memory overall ...

If DOS is boot loaded from the SD card, the simplest implementation would be to just load the BASIC and DOS as a single block, and have a magic address the bootloader jumps to that executes the code to turn off the bootloader and start the regular reset process. Then the space for DOS would be completely flexible.

In that implementation, the hardwired part would be the base load address for the Basic/Dos block. The boundary between Basic and DOS would be irrelevant to the bootloader.

Edited by BruceMcF

37 minutes ago, Fabio said:

if there's room for more spi devices what about a spi ram module?

It's single line SPI, so an SPI Ram module might not be tremendously faster than an SD card.

But if there is an SPI bus, it's an option for X8 hats, along with GPIO, a serial port, RTC, etc.

If there IS a free pin, an external 3>8 decoder on three select bits that just have state go straight through to the pins could have %000 as deselect, %001 as SD select, %010 as flash ROM select, %011 as a built in gpio "User Port" select, and %1xx as four SPI selects on an SPI block pin header for X8 hats.
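The select map proposed above can be written out as a small Python sketch of an active-low 3-to-8 decoder. The `decode_3to8` and `role` helpers are illustrative names, and the mapping is only the speculative one from this post.

```python
def decode_3to8(code):
    """Active-low 3-to-8 decode: the selected output is 0, all others are 1."""
    code &= 0b111
    return [0 if i == code else 1 for i in range(8)]


def role(code):
    """Role of each select code in the proposed (speculative) scheme."""
    code &= 0b111
    if code >= 4:
        # %1xx: four selects routed to the SPI block pin header for hats
        return "hat SPI select %d" % (code - 4)
    return ["all deselected", "SD card", "flash ROM", "GPIO user port"][code]


for code in range(8):
    print(format(code, "03b"), decode_3to8(code), role(code))
```

Output 0 of the decoder would simply be left unconnected, so code %000 selects nothing.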

This is all parallel to the idea of overlaying an I2C function on the two debug pins.

Edited by BruceMcF

This probably belongs here rather than the MegaThread:

15 hours ago, BruceMcF said:

Heck, I see that the SD card SPI CS and the serial ROM SPI CS is side by side in a control register in the X8, and there might be one spare pin (if it isn't a pin stranded by lack of logic resources), and I'm like, "hey, that's a job for a 74x138, just send three bits out on pins undecoded, and you get 7 alternative SPI selects."

...

... OOH! WAIT! It might be possible to finesse away the need for a spare pin!

Use the next decoder! A 74x139 dual active low 2>4 decoder ... that's the ticket!

See, if you include a GPIO extender, you can use some of that GPIO for system uses. So you have one decoder, /EN tied to ground so it's always on, tied to the two pins that were originally the SD and serial flash ROM CS's. So %01 SD card, %10 serial flash ROM, %11 GPIO extender ... %00, SPI expansion bus. ...

Oh, sticker price shock ... I looked at Mouser and one of those SPI 28-port port extenders is like $6+ Q1 ... as a rule of thumb, double for built cost and add 20% operating margin, maybe $15 added to the price of the board.

OK, that locks in the 74x139 decoder approach even more.

I believe the following works, but I don't have any hardware to test it, so it's all speculative. Also, the X8 specifics it is based on come from unofficial sources, so they may be incorrect.

As above, don't decode the SD and flash ROM selects internally, just send the state out to the pins, and put those two pins into one half of the 2>4 decoder. The other half is used to filter the SPI serial clock so that the shift register only receives the serial clock when the serial shift register is "enabled". Both decoders have their output enable tied on.

The first decoder outputs are %01 to the SD card CS, %10 to the flash ROM CS, %11 to the serial shift register's register clock and on to be decoded by the second decoder, and %00 to the output enable of the serial shift register.

The low enable from %11 goes into the register clock line of the shift register. It also goes to the second decoder as the b1 input, with the SPI SCLK as the b0 input. The serial shift register clock is the %00 output of the second decoder, so that it only goes low when both serial shift register select and SCLK are low. MOSI goes to the serial input of the serial shift register, MISO goes to the carry output of the serial shift register.

So when you write a byte into the serial shift register, those are the 8 CS on the SPI expansion port that are selected when the SPI select code is %00. To disable the expansion port, you write $00 to the serial shift register and then %00 leaves everything at rest. For the 74x596, pull-up resistors are on the X8 board, since the register outputs are open collector.
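The dual-decoder wiring described above can be sanity-checked with a small logic model. This is a Python sketch under the same speculative assumptions as the post; the signal names in the returned dictionary are invented for the example, and all levels are active-low as on a 74x139.

```python
def decode_2to4(b1, b0):
    """One half of a 74x139 with its enable tied active: active-low outputs Y0..Y3."""
    code = ((b1 & 1) << 1) | (b0 & 1)
    return [0 if i == code else 1 for i in range(4)]


def signals(select_bits, sclk):
    """Levels on the decoded lines for a 2-bit select code and an SCLK level."""
    first = decode_2to4(select_bits >> 1, select_bits)
    # Second half: b1 is the %11 output of the first half, b0 is SCLK.
    second = decode_2to4(first[3], sclk)
    return {
        "sd_cs": first[1],           # low for %01
        "flash_cs": first[2],        # low for %10
        "latch_rclk": first[3],      # low for %11, rises when the code changes
        "latch_oe": first[0],        # low for %00: latch drives the hat selects
        "latch_srclk": second[0],    # low only when %11 is selected and SCLK is low
    }
```

With this wiring the latch's shift clock follows SCLK only while code %11 is selected, and stays high for every other code.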

Hats shouldn't stack too high, and there are for practical purposes chip selects to spare, so the SPI hat board specification is as simple as could be. The block pin header has SCLK, MOSI, MISO, +3.3V, and CS0-CS7.

If there truly is a spare pin, and it is not just stranded by lack of logical resources to put it to use, a common Alert line input to the X8 from the boards can be provided. In the lowest logic resource implementation, it is simply a bit in some register somewhere that must be polled. If more logic is available, a second bit might set whether it triggers an IRQ.

There is no protocol on how to find out which of up to 8 SPI devices sent the alert ... while some SPI devices can send alerts (UARTs, I2C bus masters, RTC alarms, GPIO edge detection, etc.) there is no universal standard for how to poll an SPI device to see if it has sent an alert, so that is necessarily left "up to the specific SPI part interface".

  • A "top hat" board, without a pass through block header, uses chip select 0 optionally up to 3 and ignores chip selects 4-7.
  • A "pass through" board uses chip selects 4 optionally up to 7, and passes chip selects 0-3 through to its own block pin header, while CS4-CS7 are just NC, pulled high with pull-up resistors.

______________________

Notes

When this 74x139 decoder is based on the SCLK from a Mode0 SPI master, IIUC the result is not true Mode0 synchronization, since the rising of the output register clock line acting as CS will be accompanied by an "extra" rising serial clock. According to the datasheet, the correct value will be in the output register, but the serial shift register itself will be one shift ahead. So the "echo" of the old contents received on MISO might be seven of the prior bits, shifted one bit. However, this wouldn't affect the functioning of the SPI expansion port CS lines.

With a Mode3 SPI bus master, which is the "natural" mode for the VIA serial shift registers, IIUC, this may be true Mode3 synchronization.

I always thought an interesting upgrade on the bit banged approach for a C64 SPI bus was to set up one serial shift register to receive the MISO data, which just requires wiring the GPIO used to bit bang the serial clock (d0) with the line that can drive the serial shift register from an external clock. Then the bit banging is only handling select, SCLK and MOSI, so you store to Port A with the clock line at d0 set to zero and INC the register to raise the clock. Then clock phase is entirely in software control. The next quite substantial step up in speed (to 500kbps) is to use both serial shift registers, for MOSI and MISO, but without circuitry above my head, that ends up with a Mode3 interface, so it cannot talk to an SD card in SPI mode. A few decades back I was assuming that a 16v8 SPLD could implement a bridge to take a Mode3 master in and generate a Mode0 bus, but that assumption was as far as my early 2000's explorations got, as I was never in a position to get a real C64 while I was in Oz.

But I always thought that the "half bit banged" loop was just about as efficient as serial bit banging can get: "SPISTATUS" has the six CS, which go out on d1-d6, except shifted up one, with the low two bits 0:
LDX #8 : LDY SPISTATUS : - TYA : ASL SPIDATA : ROR : STA USERPA : INC USERPA : DEX : BNE -
... so around 25 clocks per bit, or 40kbps on the C64's User Port. There is no need for a "slow mode" of course, since that is well under the 400kbps ceiling during part of some SD card's initialization (though the dual serial shift register is JUST over this threshold, and would have to take this into account).
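To see what the loop above puts on the port, here is a Python sketch that steps through it by hand. It assumes MOSI ends up on d7, the clock on d0 and the chip selects on d1-d6, as the post describes; the cycle table assumes absolute addressing for SPIDATA and USERPA, a taken branch with no page crossing, and a clock of roughly 1 MHz. The function and table names are illustrative.

```python
# Cycles per instruction in the loop body, under the assumptions stated above.
CYCLES = {"TYA": 2, "ASL abs": 6, "ROR A": 2, "STA abs": 4,
          "INC abs": 6, "DEX": 2, "BNE taken": 3}


def bitbang_byte(spistatus, spidata):
    """Return the 16 values written to the port while shifting out one byte."""
    if spistatus & 0b11:
        raise ValueError("low two bits of SPISTATUS must be 0")
    writes = []
    for _ in range(8):
        a = spistatus                      # TYA
        carry = (spidata >> 7) & 1         # ASL SPIDATA: top bit goes to carry
        spidata = (spidata << 1) & 0xFF
        a = (carry << 7) | (a >> 1)        # ROR A: data bit to d7, selects to d1-d6
        writes.append(a)                   # STA USERPA: clock (d0) low
        writes.append((a + 1) & 0xFF)      # INC USERPA: clock (d0) high
    return writes


per_bit = sum(CYCLES.values())
print(per_bit, "cycles per bit, about", 1_000_000 // per_bit, "bits per second at 1 MHz")
```

Each bit produces one write with the clock low and one with the clock high, and the data bits appear on d7 most significant bit first.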

Edited by BruceMcF