
BruceMcF

Members
  • Posts

    1072
  • Days Won

    29

BruceMcF last won the day on November 12 2021

BruceMcF had the most liked content!

3 Followers


BruceMcF's Achievements

  1. Since he organized the instruction set for ease of hand-assembly, with only CPR having the opcode it has for functional reasons, I do think that saving odd/even in a zero page byte, and cutting the size of the two vector tables in half is the most useful decode.
  2. I didn't have a node, but there was a time that I had a FIDOnet email address as my email address, via a local BBS.
  3. If it was sold at the same price or less than a 6502 at the time of 6502 introduction, possibly not ... the 6809 is a fine instruction set. Regarding the original topic, I've been looking at something I mentioned in another thread. After looking more closely, the same routine can handle byte and word indirect load, the same routine can handle byte and word indirect store, and with an entry stub they also cover byte pop and byte store-pop (before SIGN is examined to see whether to run load or store). So that's six operations with two routines. The same routine can handle add and subtract, and with an entry stub compare. A single routine can handle direct load or store, and a single routine can handle increment and decrement. So that's seven more operations with three routines. Among the "embedded register" operations, only word pop (POPD) and SET are singletons: POPD because it decrements twice in the process, which cannot be handled by a prefix to indirect load or store (which post-increment), and SET because setting a value with the contents of the accumulator doesn't make any sense. Even though each routine is longer than the SweetCX16 routines, cutting the number of routines to seven to cover 15 operations makes the code size smaller. In the dispatch, after branching off to handle the "Branch & etc." ($0n) ops, the bit4 value is used to set SIGN to $00 or $FF, then clearing bit4 and doing LSR four times gives one of eight index values from 0 to 14. Storing that in X allows a JMP (REGOPS,X) based on a 16 byte (rather than 30 byte) vector table, saving 14 more bytes. Handily, the index (an even number from 0 to 14) is in both A and X on dispatch, so if the index is used (as in indirect load and indirect store, to tell whether it's a byte or a word operation), you can do "TYX: TAY" to save the index where it can be tested directly with "CPY #n".
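As a rough illustration of the dispatch arithmetic in this post, here is a minimal Python model (not 6502 code). The function name `decode_register_op` is made up for the sketch, and it assumes the $0n ops have already been branched away, as the post describes.

```python
def decode_register_op(opcode):
    """Split a register-op byte ($10-$FF) into (sign, index, reg).

    Hypothetical helper: models the bit handling only, not the 6502 code.
    """
    if not 0x10 <= opcode <= 0xFF:
        raise ValueError("expected a register op in the range $10-$FF")
    reg = opcode & 0x0F                      # low nybble: register R0-R15
    sign = 0xFF if opcode & 0x10 else 0x00   # bit 4 selects which op of a pair
    index = (opcode & 0xE0) >> 4             # clear bit 4, then shift right four times
    return sign, index, reg


# The eight possible index values, each selecting a two-byte vector:
INDEXES = sorted({decode_register_op(op)[1] for op in range(0x10, 0x100)})
TABLE_BYTES = 2 * len(INDEXES)
```

In this model $A5 and $B5 land on the same index (10) and differ only in the SIGN value, which is what lets one vector serve a pair of operations.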
I haven't tackled the Branch operations, but I am thinking a similar process can be used with the low bit of the operand, since 8 of 13 are by pairs: Branch No Carry / Branch Carry; Branch Plus / Branch Minus; Branch Zero / Branch Nonzero; and Branch if Minus 1 / Branch if not Minus One. If Carry, Minus, Nonzero, and non-Minus 1 are each tested with a result of $00 if the condition is met and $FF if the condition is not met, then jumping to BRANCH with EOR SIGN will invert the status for the "odd" operands (Carry, Minus, Nonzero, Not-minus one), and leave the status alone for the "even" operands. Then a branch is performed if the result after EOR SIGN is $FF. Then Branch Always simply calls BRANCH with a status of $00, since Branch Always is an "odd" op. So that handles 9 of 13 ops. RTN is easy, since it is op $00, "CMP #0 : BEQ RTN". BK, RS and BS are all singletons, but the dispatch can use the "SIGN" value to distinguish between BK and RS and jump to BS on its own. So filter out RTN, extract SIGN based on the low bit, clear the low bit, transfer to X and do an X-indexed jump on a 14 byte index table ... rather than 26 bytes in SweetCX16 ... which crunches the size even more. The hope would be to get smaller than the original Sweet16, so that there is a "faster, large footprint" version and a "slower, smaller footprint" version.
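The EOR SIGN trick for the paired branches can be checked with a small Python model. This is only a sketch of the logic as described in the post. The opcode values in `BRANCH_OPS` come from the standard Sweet16 opcode table, and `should_branch` is a hypothetical name.

```python
# Standard Sweet16 branch opcodes: the odd opcode of each pair is the
# condition that is actually tested.
BRANCH_OPS = {"BR": 0x01, "BNC": 0x02, "BC": 0x03, "BP": 0x04, "BM": 0x05,
              "BZ": 0x06, "BNZ": 0x07, "BM1": 0x08, "BNM1": 0x09}


def should_branch(opcode, odd_condition_met):
    """Return True if the branch is taken.

    odd_condition_met: whether the tested condition (Carry, Minus, Nonzero,
    not-Minus-1) holds. For BR (Branch Always) pass True, which models
    calling BRANCH with a status of $00.
    """
    sign = 0xFF if opcode & 0x01 else 0x00        # SIGN taken from the low bit
    status = 0x00 if odd_condition_met else 0xFF  # $00 = met, $FF = not met
    return (status ^ sign) == 0xFF                # EOR SIGN, branch on $FF
```

With carry set, `should_branch(BRANCH_OPS["BC"], True)` is taken and `should_branch(BRANCH_OPS["BNC"], True)` is not, so one test routine serves both halves of the pair.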
  4. Yes, that's the basic idea ... routines used for system initialization. The original Sweet16 saved more space in the Apple ROM(s?) than Sweet16 used, so it was a "free" resource in a space consumption sense, at a time when ROM cost much more per KB than it does today. In the context of the X16, the most appealing aspect may be the ability to conserve relatively scarce Low RAM if a Sweet16 VM is available.
  5. There ought to be differences throughout ... it is not attempting to be a port of Woz's code, it is attempting to be an open source VM that executes Sweet16 source, and beyond that it is explicitly focusing on a faster approach to routine dispatch. And in part it is explicitly pursuing a different speed / codesize tradeoff than Woz's Sweet16 because I doubt I could pursue Woz's specific goals and do any better than he did.
  6. Because the purpose of Sweet16 is most often to write very compact "setup" type code ... you are not intended to use Sweet16 inside inner loops executed many times, but in one-off startup processes that would take much more space in 6502 binary code, and since it's only executed once, using Sweet16 doesn't have much runtime impact. If you are USING the result of the subtraction, you will typically need the result in the accumulator ... even if you want the result somewhere else, you will need it in the accumulator before "putting it somewhere else", so you will have to follow, eg, CPR R5 with LD R13, wasting a byte in your Sweet16 code for every subtract operation. Plus, what other SINGLE "register" operation do you need? There are already three spare non-register ops, and any of those COULD be implemented with an operand that refers to one register ... or even two registers. For instance, you could SWAP two registers with the source register index in the low nybble of the operand byte and the destination register index in the high nybble, or multiply two registers with the 32bit result replacing the operands in a similar way. Or you could have shift left and shift right with the low nybble giving the register and the high nybble giving the number of shifts, from 0 to 15. Edit: Actually, I may have convinced myself to substitute the extensions I have in the current source WITH those three ... a register swap and binary shift left and right.
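To make the operand-byte packing concrete, here is a small Python sketch of how such extension ops might read their operand. All names here (`split_operand`, `swap_registers`, `shift_left`) are hypothetical, and the register file is modelled as a plain list of sixteen 16bit values rather than zero page bytes.

```python
def split_operand(operand):
    """Return (high nybble, low nybble) of an operand byte."""
    return (operand >> 4) & 0x0F, operand & 0x0F


def swap_registers(regs, operand):
    """SWAP: destination index in the high nybble, source in the low nybble."""
    dst, src = split_operand(operand)
    regs[src], regs[dst] = regs[dst], regs[src]


def shift_left(regs, operand):
    """Shift left: count (0-15) in the high nybble, register in the low nybble."""
    count, reg = split_operand(operand)
    regs[reg] = (regs[reg] << count) & 0xFFFF   # keep the result to 16 bits
```

An operand of $3A would swap R3 and R10 under this reading, and $42 would shift R2 left by four bits.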
  7. It's also what you will get from a 6502 assembler that supports computed values if you do, eg, "0-512". Back in the 50s and 60s, there was a greater diversity of ways to represent negative values. The other signed representations used back in the day were one's complement, which is just inverting each bit (and which makes $FFFF the 16bit "negative zero"), and signed magnitude, where one bit represents the sign and the rest is the absolute value. One's complement is still found in some dedicated signal processing hardware, but two's complement basically took over for most purposes starting in the 70s and is what is assumed as "normal" today. The 6502 is a bit funny in that it does subtraction as a one's complement machine would do ... invert all bits of the operand and add to the accumulator ... but by using "SEC" as the "clear borrow" instruction and "CLC" as the "set borrow" instruction, it WORKS as a two's complement subtraction.
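The "invert and add, with carry as an inverted borrow" behaviour is easy to check with a Python model of binary-mode SBC. This is a sketch only: decimal mode and the other status flags are ignored, and `sbc8` and `sub16` are names invented for the example.

```python
def sbc8(a, m, carry):
    """Model binary-mode SBC: add the inverted operand plus the carry flag.

    Returns (result byte, carry out). Carry out of 1 means "no borrow".
    """
    total = a + (m ^ 0xFF) + carry
    return total & 0xFF, total >> 8


def sub16(x, y):
    """16bit subtract built from two SBCs, as a 6502 routine would do it."""
    lo, carry = sbc8(x & 0xFF, y & 0xFF, 1)   # SEC first: carry = 1 clears the borrow
    hi, carry = sbc8(x >> 8, y >> 8, carry)   # borrow propagates via the carry
    return (hi << 8) | lo, carry
```

Here `sub16(0, 512)` gives $FE00 with the carry clear (a borrow happened), which is the same 16bit value an assembler would compute for "0-512".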
  8. Old school (1970s era) assemblers often required a location label to end with a ":" to distinguish it from a value defined by an equate ... it seems like most newer assemblers make the ":" optional and work out whether it's an equate or a location from the context. However, standardization of assembler syntax is more about what people are used to than having an explicit standard to follow, so YMMV. I always include the ":" ... that's more from habit than from any view on whether it is "best practice".
  9. Yes, this is the expensiveness of a general stack frame stack in the 6502 family. If it was a 256 deep integer stack implemented as a split byte pushdown X-stack, and i++ is item #4 (zero base) on the stack, it's just:
         LDX TOS
         INC STLO+4,X
         BNE +
         INC STHI+4,X
     + ...
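For readers who don't read 6502 assembly, the same increment can be modelled in Python. This is a sketch under assumptions: the two byte arrays stand in for the STLO and STHI pages, `x` plays the role of the X register loaded from TOS, and `inc_split_cell` is a hypothetical name.

```python
def inc_split_cell(stlo, sthi, x, item=4):
    """Increment the 16bit cell `item` entries from the top of a split stack."""
    i = x + item
    stlo[i] = (stlo[i] + 1) & 0xFF          # INC STLO+4,X
    if stlo[i] == 0:                        # BNE + skips this unless the low byte wrapped
        sthi[i] = (sthi[i] + 1) & 0xFF      # INC STHI+4,X
```

Because the low and high bytes live in separate arrays with the same index, no pointer arithmetic is needed to reach the high byte, which is where the saving over a general stack frame comes from.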
  10. Yes ... a retro youtubers chat, a members/retro programming chat, and a "hobbies other than those with a specific chat" chat would be good headings. If the main forum categories are Commander and non-Commander, it's an open question whether to add a members Commander programming chat to the Commander chat section and a members/retro non-Commander programming section to the non-Commander section, but I'd lean in favor.
  11. It's not all at random, though it's definitely not like a microcoded processor instruction set ... more like the 6502, which feels free to take an opcode that doesn't make sense for one type of operation and use it for another. That is:
      aaa d rrrr: address-mode, direction, register
        rrrr is the 16bit pseudo register, R0-R15
        d=0: operand to ACC; d=1: ACC to operand
        aaa is the operand address mode:
          aaa=000, immediate (followed by 16bit immediate value)
          aaa=001, register direct
          aaa=010, register indirect post-increment (lower 8bits, upper 8bits cleared)
          aaa=011, register double indirect post-increment
          aaa=100, pre-decrement register indirect
          aaa=110, pre-decrement register double indirect
      ... but "0000 rrrr" is a nonsense action (eg, you cannot store the accumulator to the number 768), so instead "rrrr" is a non-register operation. With all of the indirect loads and stores being post-increment, you only need one direction of pre-decrement to make a stack. HOWEVER, the single byte pre-decrement needs load AND store, so together they can do a move of a block of data from "back to front", if source is below destination and they overlap. So the single byte "POP" has both directions but the double byte one (to allow 16bit value stacks) only needs one direction. Then there is arithmetic:
      aaa s rrrr: arithmetic-op, sign, register
        s = sign, 0=+ (plus), 1=- (minus)
        aaa=101, sum, ACC = ACC +/- register, set branch carry, zero, negative conditions
        aaa=110, sum value = ACC +/- register, set branch carry, zero, negative conditions, discard value
        aaa=111, inc/decrement, register = register +/- 1
      Of course, 6 load/store operations and 3 arithmetic operations do not fit into 3bits, except the comparison operation only needs to subtract, and double byte pre-decrement only needs to work in one direction, so that lets it fit together like a jigsaw puzzle.
Edit: Note that while the register in the bottom and the instruction at the top is for functional reasons, there is ONE instruction that is almost implied by the design, which is the CPR Rn, since when beginning execution, the four operation bits end up in bits 1-4 of the Y register (for the instruction table look-up), and CPR uses that to give the index of the target for the subtraction, which the CPR instruction places in R13 rather than R0 (the accumulator). So the CPR opcode has to be $Dn, unless the CPR result register is relocated. And then that implies that the two-byte POP instruction is at $Cn, by the "jigsaw puzzle" logic above. Since I was attempting a re-implementation, I focused on the description of the functioning of the operations rather than Woz's implementation. However, even with a different dispatch model, if trying to squeeze object size in a "Sweet 16 replacement", rather than optimizing for speed, I could imagine having a single indirect load and a single indirect store routine, which works out from the bits of the opcode and the status of the carry flag whether it is pre-decrement or post-increment and whether it is a single or double byte operation, covering 7 operations in two routines. Direct register moves could be handled by putting source in Y and destination in X, at the cost of using absolute rather than direct addressing for the Y-indexed operation, giving one routine for the two direct ones. One could imagine the immediate register load being run by the two-byte accumulator load, setting the indirect source register to R15, the PC register, and using Y-indexed store, so the immediate load is taken over by the single indirect load routine as well. Then at the cost of three more zero page bytes ... two more bytes in a dedicated "register 17" initialized to $0001, and one set to either $80 or $00 based on whether adding or subtracting ... setting up the correct target and operand index in X and Y would allow all five arithmetic operations to be done in a single routine. If that was done by shifting the instruction one bit to the left and using the carry flag and sign flag to split the code set into quarters, you might restrict the jump table to the $0n instructions, making it only 26-32 bytes long.
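The "aaa d rrrr" field split described in this post can be sketched in Python as a lookup from (aaa, d) to the op. The mnemonics are the standard Sweet16 ones. The table layout and the `decode` function are just one illustrative way to show how the 15 register ops fit the "jigsaw", not how any Sweet16 implementation does its decoding.

```python
# (aaa, d-or-s bit) -> standard Sweet16 mnemonic
MNEMONICS = {
    (0b000, 1): "SET",
    (0b001, 0): "LD",     (0b001, 1): "ST",
    (0b010, 0): "LD @",   (0b010, 1): "ST @",
    (0b011, 0): "LDD @",  (0b011, 1): "STD @",
    (0b100, 0): "POP @",  (0b100, 1): "STP @",
    (0b101, 0): "ADD",    (0b101, 1): "SUB",
    (0b110, 0): "POPD @", (0b110, 1): "CPR",   # the shared "jigsaw" slot
    (0b111, 0): "INR",    (0b111, 1): "DCR",
}


def decode(opcode):
    """Return (mnemonic, low nybble) for a one-byte Sweet16 opcode."""
    aaa = (opcode >> 5) & 0b111   # top three bits: address mode or arithmetic op
    d = (opcode >> 4) & 1         # bit 4: direction or sign
    rrrr = opcode & 0x0F          # low nybble: register, or op number for $0n
    if (aaa, d) == (0, 0):
        return "non-register op", rrrr
    return MNEMONICS[(aaa, d)], rrrr
```

In this table `decode(0xD5)` is CPR R5 and `decode(0xC3)` is POPD @R3, the two ops that share aaa=110.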
  12. Yes. That looks like a regression. Back then, I was primarily developing in VICE emulating a 65C02 in a C64, so the $CC00 branch saw more testing.
  13. One thing to be careful about is that somebody successfully interfacing with the PS/2 keyboard they have doesn't guarantee that the same code will interface with every keyboard that obeys the PS/2 spec. One possibility, which is waiting on Michael Steil having time for the X16 project to open up again, is that the timeout in the 65C02/6522 code is just too short when the code runs at 8MHz, and adjusting the timeout will fix the issue.
  14. I'm not going to say that at my age, it's getting toward a 10 inch 720p portable TV being a "retina" display for me ... ... but I ain't going to deny, either. After Christmas for the grandkids, things are tight enough now that buying a tin of Altoid mints for a "system in an Altoids tin" build pretty much exhausted my disposable income ... ... that is, not the parts of the "system in an Altoids tin" ... just the Altoids themselves ... ... but hopefully by March or April, there will be breathing room again.