geek504

Members
  • Content Count

    94
Community Reputation

44 Excellent

  1. @desertfish awesome work! I sometimes wonder how you are able to dedicate so much time to your project! I have been away for a month or so due to graduate studies, but I'm finally done for the semester and can now do some fun work on my compiler!
  2. My all-time favorite was the Apple ][+ with the 16K Language Card for 80-columns and a 20MB Apple Cider external hard disk. Most of my friends had a C64 and I loved the games, but the overall user experience was definitely better with the Apple. I was then saving for a Commodore 128 but ended up having enough for an Amiga 500. I loved it, but now, in retrospect, I'd have preferred the Apple Macintosh IIfx if money were not an issue! But of course, the Commander X16 will trump them all... maybe not the IIfx, but I like my memory map small... 64KB for the win!
  3. @desertfish curious... which PRNG algorithm did you use?
  4. Thanks for looking at the code @Ender. A speaker beep shouldn't be too hard to do. I suppose a TAB should be every 8 positions (Apple BASIC)... or 10 (C64), as the "," does in the PRINT statement.
  5. I've been scratching my head over these new control characters, more specifically TAB ($09) and BELL ($07). How does one use them in a PRINT CHR$() command? I even tried "activating" them by enabling CHARSET ISO ON/OFF ($0F). I am trying to figure out whether to implement the "tab" or "," by actually printing 10 spaces in my PRINT function, or whether I need to make use of the KERNAL PLOT routine.
  6. @desertfish I started coding my compiler without a stack and while I can say it was efficient, it was just too slow (edit: to code and finish the compiler) and prone to bugs... I decided to implement the stack midway just to get the compiler ready and then re-implement the non-stack improvements later, if at all. I'm guessing a rough 10-15% speed improvement and am not sure if it is worth the effort right now. 6502 assembly is inefficient by nature (but very simple to implement), especially if we write proper assembly code to preserve A, X, and Y, wasting bytes and cycles with PHA, PHX, PHY, and PLY, PLX, PLA around every subroutine. I do feel that this is still more efficient than cc65's C-stack implementation though. I am not worrying too much about maximum efficiency (I don't think we will ever get close to super tight assembly code) because I hope one day in the future we will be able to crank up the X16's MHz to the GHz range! At least in the emulator scene, we can implement the following 6502 JIT-core to x64, which could bring us to a realistic 12GHz! https://scarybeastsecurity.blogspot.com/2020/04/clocking-6502-to-15ghz.html?m=1
  7. Hi! Is this "software-eval-stack" you mention a software-based stack used for mathematical computations, like an RPN-type stack? E.g. 2+3 becomes 2, PUSH, 3, PUSH, +? This is what I am using for my BASIC compiler. If so, it does a lot of function calling and can be greatly optimized if one bypasses the stack entirely, but that involves major compiler modifications, as you mentioned. This is sample code from my compiler using macros that greatly improve readability: As you can see, line 20 does a PUSH and a PULL to/from the stack for a simple AX=3. It could have simply copied the INT 3 directly into VAR AX. I'm planning on writing a post-compiler optimizer much later. Note that I couldn't use VAR A since A is a reserved keyword in ca65!
  8. Out of curiosity, where does ".zeropage" start reserving space? Hopefully not starting from address $0000 since it is prone to corruption of important areas. For X16 it should start at $0022. I assumed it did but now am worried I may have ZP data corruption.
  9. That's what I figured too... but that would only make sense if we could do LSR X or LSR Y
  10. Which one is correct, LSR or LSR A? They both seem to be the same, i.e. they operate on the accumulator. Is the first one used in 6502-proper and the latter in 65C02?
  11. I am trying to set up a few variables in ZEROPAGE using ca65 and the following does not seem to work... It seems multiple ".org" directives don't work?
  12. For those who are curious: e = exponent, s = sign, m = mantissa, i = integer (used in fixed-point).
      Microsoft (5 bytes): eeee.eeee | smmm.mmmm | mmmm.mmmm | mmmm.mmmm | mmmm.mmmm
      Woz (4 bytes): seee.eeee | smmm.mmmm | mmmm.mmmm | mmmm.mmmm
      Bishop (4 bytes): siii.iiii | mmmm.mmmm | mmmm.mmmm | mmmm.mmmm
      My version (4 bytes): siii.iiii | iiii.iiii | mmmm.mmmm | mmmm.mmmm
      A 32-bit float has 7 significant digits of precision. A 16-bit float has 3 significant digits of precision. A 16-bit fixed mantissa has 5 significant digits of precision, plus the significant digits from the INT.
      The only major drawback of 16.16 fixed-point: it cannot represent very, very large or very, very small numbers. Use ROM float-math for that!
      Bishop's version is very interesting because the INT part is only -128 to +127, but for any number larger than +127 the fractional part starts to become negligible, i.e. an error margin of ~0.78%. Taking advantage of that requires constantly checking the INT value to determine when to use the Bishop float and when to use a normal 16-bit INT. Bishop's fixed-point was used to generate fast Mandelbrots on the Apple ][.
  13. The compiler is for Integer BASIC using 16-bit SIGNED INTs. After checking Microsoft's implementation as well as Woz's and third-party ones, I decided to use 16.16 Fixed-Point since most of the INTEGER part is done. I just have to integrate the 16-bit MANTISSA part into my FP routines. This "integration" might cost a few more execution cycles, but it should work well. Considering that fixed-point is faster to compute than floating-point, it's a small sacrifice. GPUs work like this as well, i.e. they use integer math along with fixed-point math with just the fractional part. When the compiler is done, I'll add the use of the ROM's floating-point math for serious work! My goal is to have a compiler that produces fast math at the expense of accuracy. It also uses 8-bit sine/cosine tables with a 6% error margin. PI will be just 22/7 for fast computations. Let's see what this Frankenstein will look like in the end! In true Woz spirit, it is designed for game creation!
  14. After spending the afternoon going over binary division just for fun, I probably re-invented the wheel. In any case, here is the algorithm in Python form: Feel free to comment! Now I have to convert this to 6502 assembly. Sample run on 1/7 (which happens to be the remainder of the cheap PI value 22/7 = 3.14):
  15. Faster? A statement like var++ would be a simple matter of using INC, and ADC for large numbers. BCD would require checking for the 9 digit and moving to the 0 digit instead of A. Of course, the 6502 does native BCD and that might be just as fast? BCD also uses more memory, but I can see that BCD to string is much easier! On another topic, does anyone have a handy algorithm to convert a fraction into a floating-point mantissa (binary form)? For example: N=7, D=3 ==> N/D = 7/3 = 2R1. I want to convert R/D (always less than 1) to the fractional part (mantissa) of a floating/fixed-point number: R/D = 1/3 = 0.3333 ==> result in binary: 0.01010101
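
One possible answer to the R/D question in post 15 is plain shift-and-subtract long division: double the remainder, and emit a 1 bit whenever it reaches the divisor. The sketch below is not the Python listing from post 14 (that code is not reproduced here); the names fraction_bits and to_fixed_16_16 are made up for illustration, and the 16.16 packing assumes a non-negative value whose integer part fits in 16 bits.

```python
def fraction_bits(r, d, nbits=8):
    """Return the first nbits binary fraction digits of r/d as an integer.

    Hypothetical helper; assumes 0 <= r < d, so r/d is always below 1.
    """
    if not 0 <= r < d:
        raise ValueError("expected 0 <= r < d")
    bits = 0
    for _ in range(nbits):
        r <<= 1          # double the remainder (one ASL/ROL step on a 6502)
        bits <<= 1       # make room for the next fraction bit
        if r >= d:       # the divisor fits, so this bit is 1
            r -= d
            bits |= 1
    return bits


def to_fixed_16_16(n, d):
    """Pack non-negative n/d as 16.16: integer part high, fraction low.

    Hypothetical helper; assumes the quotient fits in 16 bits.
    """
    q, r = divmod(n, d)
    return (q << 16) | fraction_bits(r, d, 16)


print(format(fraction_bits(1, 3), "08b"))  # 01010101
print(format(fraction_bits(1, 7), "08b"))  # 00100100
print(hex(to_fixed_16_16(7, 3)))           # 0x25555
```

The loop only shifts, compares, and subtracts, so it should map fairly directly onto 6502 instructions. The result is truncated rather than rounded, which matches the 0.01010101 shown for 1/3.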