geek504

Assembly BUG! I am stuck...

Question

I am writing support assembly code for my BASIC compiler and am stuck on a BUG in the routine (PrDec16) that converts an UNSIGNED INT into a DECIMAL STRING and prints it. (There is also a routine, PrSgnDec16, that prints a SIGNED INT and relies on it.)

When you compile and run the program, it seems to work, BUT the side effect of the bug is that when you type RUN again, it just clears the screen (it actually prints something random before it clears). So in a larger program, it ends up clearing the screen and preventing anything else from being printed (or is it crashing?).

I spent two full days hunting for this bug but cannot find it! Can anyone see where the mistake is?

Quote

        .org $0801
    
        .byte $0C, $08
        .byte $0A, $00
        .byte $9E
        .byte $20
        .byte $32, $30, $36, $34
        .byte $00
        .byte $00, $00
        .byte $00, $00              ; Padding so code starts at $0810

CHROUT        = $FFD2
PLOT        = $FFF0
NEWLINE        = $0D
UPPERCASE    = $8E

r0    = $02
r0L    = $02
r0H    = $03
r1    = $04
r1L    = $04
r1H    = $05
r2    = $06
r2L    = $06
r2H    = $07
r3    = $08
r3L    = $08
r3H    = $09
r4    = $0a
r4L    = $0a
r4H    = $0b
r5    = $0c
r5L    = $0c
r5H    = $0d
r6    = $0e
r6L    = $0e
r6H    = $0f
r7    = $10
r7L    = $10
r7H    = $11
r8    = $12
r8L    = $12
r8H    = $13
r9    = $14
r9L    = $14
r9H    = $15
r10    = $16
r10L    = $16
r10H    = $17
r11    = $18
r11L    = $18
r11H    = $19
r12    = $1a
r12L    = $1a
r12H    = $1b
r13    = $1c
r13L    = $1c
r13H    = $1d
r14    = $1e
r14L    = $1e
r14H    = $1f
r15    = $20
r15L    = $20
r15H    = $21

; Loads a 16-bit Word (immediate) to .A (lo-byte) and .X (hi-byte)
;
.macro    LoadWordAX value
    lda #<value
    ldx #>value
.endmacro

; Loads the 16-bit Word value at address to .A (lo-byte) and .X (hi-byte)
;
.macro    LoadAX address
    lda address
    ldx address+1
.endmacro

; Moves the 8-bit Byte from source to dest
;
.macro MoveB source, dest
    lda source
    sta dest
.endmacro

; Moves the 16-bit Word from source (lo,hi) to dest (lo,hi)
;
.macro MoveW source, dest
    MoveB source+0, dest+0
    MoveB source+1, dest+1
.endmacro

; Store the 16-bit Word in AX to address (lo,hi)
;
.macro    StoreAX address
    sta address
    stx address+1
.endmacro

; Store a 16-bit Word (immediate) to address (lo,hi)
;
.macro    StoreImm value, address
    LoadWordAX value
    StoreAX address
.endmacro

; Print a NEWLINE char
;
.macro    PrintNewline
    lda #NEWLINE
    jsr CHROUT
.endmacro

START:        
        StoreImm $1234, r0    ; decimal = 4660
        jsr PrHex16        ; input at r0
        PrintNewline
        jsr NEG16        ; input at r0, output at r2
        MoveW r2, r0
        jsr PrHex16        ; prints EDCC (decimal = 60876)
        PrintNewline

        ldy #0            ; pad char, 0=none
        jsr PrSgnDec16        ; ### BUG, prints -4660 OK, but prog corrupted
;        jsr PrDec16        ; ### BUG, prints 60876 OK, but prog corrupted

END:        rts

;-----------------------------------------------------------------
; FUNCTION: 16-bit Binary Negation (two's complement)
; INPUT:    r0
;
; OUTPUT:   r2 = 0 - r0 = ~r0 + 1 
;-----------------------------------------------------------------
NEG16:      ; 16 bit Binary Negation: r2 = 0 - r0 = ~r0 + 1 (two's complement)
            SEC           ;Ensure carry is set
            LDA #0        ;Load constant zero
            SBC r0L       ;... subtract the least significant byte
            STA r2L       ;... and store the result
            LDA #0        ;Load constant zero again
            SBC r0H       ;... subtract the most significant byte
            STA r2H       ;... and store the result
            RTS

;-----------------------------------------------------------------
; FUNCTION: Print signed 16-bit decimal number
; INPUT:    r0 = value to print, copied to scratch r11
;           .Y  = pad character
;           (e.g. '0' #48 or ' ' #32 or #0 for none)
;-----------------------------------------------------------------
PrSgnDec16:
        phy            ; save .Y = padding char
        lda r0H            ; load MSB
            and #$80        ; check for sign bit of MSB
            beq LL1            ; positive, print normally
            jsr NEG16        ; negative, NEG16 result at r2
            lda #'-'        ; print '-' sign
            jsr CHROUT
        MoveW r2, r0        ; load new positive value
LL1:
        ply            ; restore .Y

;-----------------------------------------------------------------
; FUNCTION: Print 16-bit decimal number
; INPUT:    r0 = value to print, copied to scratch r11
;           .Y  = pad character
;           (e.g. '0' #48 or ' ' #32 or #0 for none)
;
; INPUT:    at PrDec16Lp1
;           Y=(number of digits)*2-2, e.g. 8 for 5 digits
;
; OUTPUT:   A,X,Y corrupted
;-----------------------------------------------------------------
PrDec16:    
        STY pad            ; Save new padding character
        MoveW r0, r11
        LDY #8            ; Offset to powers of ten
PrDec16Lp1:
        LDX #$FF            ; Start with digit=-1
        SEC
PrDec16Lp2:
        LDA r11L            ; Subtract current tens
        SBC PrDec16Tens+0,Y
            STA r11L
            LDA r11H
            SBC PrDec16Tens+1,Y
            STA r11H
            INX                ; Loop until <0
            BCS PrDec16Lp2
            LDA r11L  ; Add current tens back in
            ADC PrDec16Tens+0,Y
            STA r11L
            LDA r11H
            ADC PrDec16Tens+1,Y
            STA r11H
            TXA                ; Not zero, print it
            BNE PrDec16Digit
            LDA pad            ; pad<>0, use it
            BNE PrDec16Print
            BEQ PrDec16Next
PrDec16Digit:
            LDX #48                     ; ASC"0", No more zero padding
            STX pad
            ORA #48                     ; ASC"0", Print this digit
PrDec16Print:
            JSR CHROUT
PrDec16Next:
            DEY                ; Loop for next digit
            DEY
            BPL PrDec16Lp1
            RTS
PrDec16Tens:
            .word 1
            .word 10
            .word 100
            .word 1000
            .word 10000
pad:        .res 0            ; default 0 = no padding

;-----------------------------------------------------------------
; FUNCTION: Print 8-bit hexadecimal number
; INPUT:    .A = value to print
;
; OUTPUT:   
;-----------------------------------------------------------------
PrHex8:     ; Print value in A in hexadecimal padded with zeros, 
            PHA      ; Save A
            LSR A    ; Move top nybble to bottom nybble
            LSR A
            LSR A
            LSR A
            JSR PrNybble               ; Print this nybble
            PLA                        ; Get A back and print bottom nybble
PrNybble:
            AND #15                    ; Keep bottom four bits
            CMP #10                    ; If 0-9, jump to print
            BCC PrDigit
            ADC #6                     ; Convert ':' to 'A'
PrDigit:
            ADC #48        ; ASC"0"
            JSR CHROUT     ; Convert to character and print
            RTS

;-----------------------------------------------------------------
; FUNCTION: Print 16-bit hexadecimal number
; INPUT:    r0 = Word to print
;           
; OUTPUT:   
;-----------------------------------------------------------------
PrHex16:
        ;pha
        ;txa
            lda r0H
            jsr PrHex8
            lda r0L
        ;pla
            jsr PrHex8
            rts
 

C:\dev\src>rm test.prg

C:\dev\src>cl65 -o test.prg -t cx16 -C cx16-asm.cfg test.asm

C:\dev\src>\x16-r38\x16emu.exe -prg test.prg -run

test.asm

19 answers to this question

1 hour ago, Ender said:

I don't think you can declare variables in the middle of your code like that, at least not with cc65, unless you use segments.  The "PrDec16Tens" and "pad" declarations should be at the end.  From what I can tell, when pad is written to, it's overwriting the first command of PrHex8, changing it from PHA to "BMI $8fd".  Therefore, when you PLA later on the stack gets screwed up.  If you move those declarations to the end it works.

It's perfectly OK to declare variables in the code segment, as long as it's outside the path of execution, as it appears here. RAM is RAM, and everything has to go somewhere.

But indeed, doing a ".res 0" for "pad" is not ok. You need to allocate the number of bytes you plan on writing there.
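
To see why a zero-byte reservation bites, here is a toy Python model of an assembler's location counter. It is only a sketch: the origin address and the helper names `assign_addresses` and `simulate` are made up for illustration. The only things taken from the thread are that `pad` sits directly before `PrHex8`, that PrHex8 begins with PHA, and that the routine stores 48 (ASCII '0') into `pad`.

```python
# Toy model of how an assembler's location counter hands out label addresses.
# Assumption: the origin and sizes are illustrative, not taken from the real build.

PHA = 0x48  # 65C02 opcode for PHA, the first instruction of PrHex8
BMI = 0x30  # 65C02 opcode for BMI; 0x30 is also ASCII '0' (decimal 48)


def assign_addresses(items, origin):
    """items is a list of (label, size_in_bytes); returns {label: address}."""
    addresses = {}
    pc = origin
    for label, size in items:
        addresses[label] = pc  # a label takes the current location counter
        pc += size             # reserving N bytes advances it by N
    return addresses


def simulate(pad_size):
    """Return (layout, first opcode of PrHex8 after the routine wrote to pad)."""
    layout = assign_addresses(
        [("PrDec16Tens", 10), ("pad", pad_size), ("PrHex8", 1)],
        origin=0x0900,  # hypothetical address
    )
    memory = {layout["PrHex8"]: PHA}
    memory[layout["pad"]] = 0x30  # models STX pad with X = 48
    return layout, memory[layout["PrHex8"]]


for size in (0, 1):
    layout, opcode = simulate(size)
    print(f".res {size}: pad=${layout['pad']:04X} "
          f"PrHex8=${layout['PrHex8']:04X} first opcode=${opcode:02X}")
```

With zero bytes reserved, `pad` and `PrHex8` share an address, so the store turns the PHA into a BMI and the later PLA pulls a byte that was never pushed. With one byte reserved, the store stays inside the variable.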


I don't think you can declare variables in the middle of your code like that, at least not with cc65, unless you use segments.  The "PrDec16Tens" and "pad" declarations should be at the end.  From what I can tell, when pad is written to, it's overwriting the first command of PrHex8, changing it from PHA to "BMI $8fd".  Therefore, when you PLA later on the stack gets screwed up.  If you move those declarations to the end it works.

1 hour ago, Ender said:

I don't think you can declare variables in the middle of your code like that, at least not with cc65, unless you use segments.  The "PrDec16Tens" and "pad" declarations should be at the end.  From what I can tell, when pad is written to, it's overwriting the first command of PrHex8, changing it from PHA to "BMI $8fd".  Therefore, when you PLA later on the stack gets screwed up.  If you move those declarations to the end it works.

Ender, thanks a million for your fresh view on the code analysis! You actually found the spot where the problem was and it was totally off my radar! It is possible to declare variable space within one's code from what I've experienced (one can really make a mess if one really wanted to!) The bug was...

1 hour ago, geek504 said:

PrDec16Tens:
            .word 1
            .word 10
            .word 100
            .word 1000
            .word 10000
pad:        .res 0            ; default 0 = no padding

... pad, like you said, BUT it was because I was reserving ZERO bytes and thus overwriting PrHex8, as you mentioned. It is supposed to be .res 1, and with that change it runs repeatedly without any problems. The ZERO was a remnant of my previous .byte 0, from when I was thinking about having a default value for pad. The routine always writes over this space, so reserving was more accurate, but I forgot to change the 0 to 1.

I wasted so much time thinking that I was corrupting .A, .X, .Y, or the many r0-r15 or simply bad assembly code. *sigh*

3 minutes ago, geek504 said:

change the 0 to 1

I see you figured this out right as I was typing the same thing! Another thing you can do is change the .res to .byte. You should only use .res if you have some odd number or macro-defined amount of space to reserve. If it's just a byte, use .byte and give it a default load value.

1 hour ago, SlithyMatt said:

It's perfectly OK to declare variables in the code segment, as long as it's outside the path of execution, as it appears here. RAM is RAM, and everything has to go somewhere.

LOL! As I was reading more webpages on algorithms for integer math and fixed-point math, I came across a text that mentioned that keeping data inside your code is in the realms of "self-modifying code"... woooo... makes us look like uber virus/worm programmers! Not that I ever wrote one during the 80s... 😈


Well, it only classifies as self-modifying code if the code is modifying actual code.

If you keep a byte in between instructions just as a storage byte, it doesn't qualify IMO 🙂 Now, if you're changing opcodes or (more often) the operand values for certain opcodes, then we're talking 🙂


I see it as no different than having a static variable inside a function in C. If you want to be evil, you can put executable instructions in there and call them, but it's not the greatest idea.

2 minutes ago, SlithyMatt said:

I see it as no different than having a static variable inside a function in C. If you want to be evil, you can put executable instructions in there and call them, but it's not the greatest idea.

I remember putting shellcode inside a string array... 😈

4 hours ago, geek504 said:

... pad, like you said, BUT it was because I was reserving ZERO bytes and thus overwriting PrHex8, as you mentioned. It is supposed to be .res 1, and with that change it runs repeatedly without any problems. The ZERO was a remnant of my previous .byte 0, from when I was thinking about having a default value for pad. The routine always writes over this space, so reserving was more accurate, but I forgot to change the 0 to 1.

Ah right, I was thinking of .byte when I read it and didn't notice it, the same as you 😅  All I knew was that when I stepped through it I could see that that's what was happening so I figured it must be that it was in the middle, based on the coding practices of other stuff I've looked at (I haven't been doing cc65 assembly for long).

7 hours ago, Stefan said:

If I understand your code it's a repeated subtraction method.

I've found some other interesting methods, which have worked nicely.

Thanks for the links... I am always looking for cool 6502 algorithms to include in my support library!

But, I just wanted to print Hex2Dec for the PRINT function... you're suggesting to first convert hex to BCD and then use BCD to print each digit? Would that be substantially faster than repeated subtraction?
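
For reference, the repeated-subtraction idea in PrDec16 can be modelled in a few lines of Python. This is a sketch of the logic in the listing from the first post, not a translation of it: the names `dec16` and `sgn_dec16` are hypothetical, and an empty string stands in for pad character 0.

```python
# Model of printing a 16-bit value by repeated subtraction of powers of ten.
# Assumption: pad="" plays the role of pad character 0 (no padding).

POWERS_OF_TEN = [10000, 1000, 100, 10, 1]


def dec16(value, pad=""):
    """Format an unsigned 16-bit value; pad replaces leading zeros."""
    value &= 0xFFFF
    out = []
    for power in POWERS_OF_TEN:
        digit = 0
        while value >= power:  # subtract until the value would go negative
            value -= power
            digit += 1
        if digit != 0:
            pad = "0"          # after the first non-zero digit, zeros are printed
            out.append(chr(ord("0") + digit))
        elif pad:
            out.append(pad)    # leading zero: emit the pad character, if any
    return "".join(out)


def sgn_dec16(value, pad=""):
    """Format a 16-bit two's complement value, negating it first if bit 15 is set."""
    value &= 0xFFFF
    if value & 0x8000:
        return "-" + dec16((-value) & 0xFFFF, pad)
    return dec16(value, pad)


print(dec16(0xEDCC))      # 60876
print(sgn_dec16(0xEDCC))  # -4660
print(dec16(42, " "))     # three spaces, then 42
```

Note that in this model a value of 0 with no padding produces an empty string, because every digit is treated as a suppressed leading zero.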

I confess, I never used BCD in my life. Can anyone illuminate the usefulness of BCD?


I haven't done the math to tell which method is faster.

The methods described in the articles are general and scale well. My gut feeling is that they are more effective when dealing with large numbers (more than 16 bits).

I'm using those functions on 32 bit numbers. First and foremost because of the beauty of the design 🙂
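
One well-known conversion that scales to wide inputs is shift-and-add-3 ("double dabble"). The thread does not say which methods the linked articles use, so treat the following Python sketch as an illustration of that one technique only. The function name `double_dabble` and its `bits` parameter are chosen here.

```python
# Binary to packed BCD via shift-and-add-3 ("double dabble").
# Assumption: this is one common technique, not necessarily the one in the linked articles.

def double_dabble(value, bits=32):
    """Return the packed-BCD form (one decimal digit per nibble) of an unsigned value."""
    digits = len(str((1 << bits) - 1))  # decimal digits needed for the widest input
    bcd = 0
    for i in range(bits - 1, -1, -1):
        # Before each shift, add 3 to every BCD digit that is 5 or more,
        # so that doubling it carries correctly into the next digit.
        for d in range(digits):
            if ((bcd >> (4 * d)) & 0xF) >= 5:
                bcd += 3 << (4 * d)
        # Shift in the next input bit, most significant first.
        bcd = (bcd << 1) | ((value >> i) & 1)
    return bcd


# Each nibble is a decimal digit, so hex formatting yields the decimal text.
print(format(double_dabble(60876, bits=16), "x"))
print(format(double_dabble(4294967295), "x"))
```

The loop count depends only on the bit width, not on the value, which is one reason this approach is attractive for 32-bit numbers.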

2 hours ago, geek504 said:

Can anyone illuminate the usefulness of BCD?

BCD is generally faster, and a more scalable process. Also, you can have a running counter that stays in BCD all the time, like a score, that can be quickly rendered as text on screen.

48 minutes ago, SlithyMatt said:

BCD is generally faster, and a more scalable process. Also, you can have a running counter that stays in BCD all the time, like a score, that can be quickly rendered as text on screen.

Faster? A statement like var++ would simply be a matter of using INC, plus ADC for large numbers. BCD would require checking for the digit 9 and wrapping to 0 instead of moving on to A. Of course, 6502 does native BCD and that might be just as fast? BCD also uses more memory, but I can see that BCD to string is much easier!

On another topic, does anyone have a handy algorithm to convert a fraction into a floating-point mantissa (binary form), for example:

N=7, D=3 ==> N/D = 7/3 = 2R1

I want to convert R/D (always less than 1) to the fractional part (mantissa) of a floating/fixed point number:

R/D = 1/3 = 0.3333 ==> Result in Binary : 0.01010101

6 minutes ago, geek504 said:

6502 does native BCD and that might be just as fast?

Yes, that's the point. Native hardware BCD support is what makes it fast. Generating BCD values with binary arithmetic would be a lot slower.
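
As a rough illustration of the "score that stays in BCD" idea, the Python sketch below models what a decimal-mode add does to packed-BCD bytes and how little work is left to turn the result into text. It only models valid BCD inputs and the carry, not the real CPU's other flags. The helper names `adc_decimal`, `add_score` and `score_to_text` are hypothetical.

```python
# Model of adding packed-BCD bytes (two decimal digits per byte) with carry.
# Assumption: inputs are valid BCD; CPU flag behaviour other than carry is ignored.

def adc_decimal(a, b, carry=0):
    """Add two packed-BCD bytes plus carry; return (result_byte, carry_out)."""
    lo = (a & 0x0F) + (b & 0x0F) + carry
    half_carry = 0
    if lo > 9:            # low digit overflowed past 9
        lo -= 10
        half_carry = 1
    hi = (a >> 4) + (b >> 4) + half_carry
    carry_out = 0
    if hi > 9:            # high digit overflowed past 9
        hi -= 10
        carry_out = 1
    return (hi << 4) | lo, carry_out


def add_score(score, points):
    """Add two multi-byte BCD numbers stored least significant byte first."""
    carry = 0
    result = []
    for s, p in zip(score, points):
        byte, carry = adc_decimal(s, p, carry)
        result.append(byte)
    return result


def score_to_text(score):
    """Each nibble is already a decimal digit, so rendering is a per-nibble lookup."""
    return "".join(f"{byte:02X}" for byte in reversed(score))


score = add_score([0x99, 0x09], [0x01, 0x00])  # 0999 + 0001
print(score_to_text(score))
```

Rendering needs no division or subtraction loop at all, which is where the speed advantage for on-screen counters comes from.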

2 hours ago, geek504 said:

On another topic, does anyone have a handy algorithm to convert a fraction into a floating-point mantissa (binary form), for example:

N=7, D=3 ==> N/D = 7/3 = 2R1

I want to convert R/D (always less than 1) to the fractional part (mantissa) of a floating/fixed point number:

R/D = 1/3 = 0.3333 ==> Result in Binary : 0.01010101

After spending the afternoon going over binary division just for fun, I probably re-invented the wheel. In any case, here is the algorithm in Python form:

Quote

# Q16 factor for mantissa

N = 1
D = 7
Q = []

print("Binary Division: (N) " + "{0:b}".format(N) + "/" + "{0:b}".format(D) + " (D)")

for i in range(16):

    N = N<<1

    print("Can D=" + "{0:b}".format(D) + " go into N=" + "{0:b}".format(N) + "?")

    if D>N:
        print("No")
        Q.append("0")
    else:
        print("Yes")
        Q.append("1")
        N = N - D
        if N == 0:
            break

    print("N={0:b}".format(N))

print("Result: ", *Q)

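F = 0.0

# Sum the weight of each set bit. Iterating over Q (instead of popping 16 times)
# also works when the division loop stopped early and Q holds fewer than 16 bits.
for i, bit in enumerate(Q):
    if bit == '1':
        F = F + (1/(2**(i+1)))

print("Fraction: ", F)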
 

Feel free to comment! Now I have to convert this to 6502 assembly 😵

Sample run on 1/7 (which happens to be the fractional part of the cheap PI value 22/7 ≈ 3.14):

Quote

Binary Division: (N) 1/111 (D)
Can D=111 go into N=10?
No
N=10
Can D=111 go into N=100?
No
N=100
Can D=111 go into N=1000?
Yes
N=1
Can D=111 go into N=10?
No
N=10
Can D=111 go into N=100?
No
N=100
Can D=111 go into N=1000?
Yes
N=1
Can D=111 go into N=10?
No
N=10
Can D=111 go into N=100?
No
N=100
Can D=111 go into N=1000?
Yes
N=1
Can D=111 go into N=10?
No
N=10
Can D=111 go into N=100?
No
N=100
Can D=111 go into N=1000?
Yes
N=1
Can D=111 go into N=10?
No
N=10
Can D=111 go into N=100?
No
N=100
Can D=111 go into N=1000?
Yes
N=1
Can D=111 go into N=10?
No
N=10
Result:  0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0
Fraction:  0.142852783203125
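
As a cross-check on the bit string in the sample run, here is a compact integer-only variant in Python that keeps the quotient in a 16-bit accumulator, which is closer to what a 6502 shift/compare/subtract loop would do. It is a sketch under assumptions: the name `fraction_q16` is made up, and it requires 0 <= r < d.

```python
# Integer-only long division producing a 16-bit binary fraction (Q16).
# Assumption: 0 <= r < d, so the result is always below 1.0 (0x10000).

def fraction_q16(r, d):
    """Return floor(r * 65536 / d) using only shifts, compares and subtracts."""
    assert 0 <= r < d
    q = 0
    for _ in range(16):
        r <<= 1          # bring down the next (zero) bit of the dividend
        q <<= 1          # make room for the next quotient bit
        if r >= d:       # does the divisor fit?
            r -= d
            q |= 1
    return q


q = fraction_q16(1, 7)
print(f"{q:016b}")       # same bits as the "Result:" line, without spaces
print(q / 65536)         # same value as the "Fraction:" line
```

The result equals `(r << 16) // d`, so that expression makes a handy reference when testing an assembly version.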

 

1 hour ago, geek504 said:

convert a fraction into a floating-point mantissa

Can't you just perform the division using FDIV rom routine?

10 minutes ago, desertfish said:

Can't you just perform the division using FDIV rom routine?

The compiler is for Integer BASIC using 16-bit SIGNED INTs. After checking Microsoft's implementation as well as Woz's and third-party, I decided to use 16.16 Fixed-Point since most of the INTEGER part is done. I just have to integrate the 16-bit MANTISSA part into my FP routines. This "integration" might cost a few more execution cycles, but it should work well. Considering that fixed-point is faster to compute than floating-point, it's a small sacrifice. GPUs work like this as well, i.e. they use integer math along with fixed-point math with just the fractional part.

When the compiler is done, I'll add the use of the ROM's floating-point math for serious work!

My goal is to have a compiler that produces fast math at the expense of accuracy. It also uses 8-bit sine/cosine tables with 6% error margin. PI will be just 22/7 for fast computations. Let's see what this Frankenstein will look like in the end! In true Woz spirit, it is designed for game creation!

On 10/28/2020 at 5:57 PM, geek504 said:

After checking Microsoft's implementation as well as Woz's and third-party, I decided to use 16.16 Fixed-Point since most of the INTEGER part is done.

For those who are curious:

e = exponent, s = sign, m=mantissa, i = integer (used in fixed-point)

Microsoft (5-bytes): eeee.eeee | smmm.mmmm | mmmm.mmmm | mmmm.mmmm | mmmm.mmmm

Woz (4-bytes): seee.eeee | smmm.mmmm | mmmm.mmmm | mmmm.mmmm

Bishop (4-bytes): siii.iiii | mmmm.mmmm | mmmm.mmmm | mmmm.mmmm

My version (4-bytes): siii.iiii | iiii.iiii | mmmm.mmmm | mmmm.mmmm

A 32-bit float has about 7 significant digits of precision.

A 16-bit float has about 3 significant digits of precision.

A 16-bit fixed mantissa resolves steps of 1/65536 (about 0.000015, i.e. 4 to 5 decimal places), plus the significant digits from the INT part.

The only major drawback of 16.16 fixed-point: it cannot do very very large or very very small numbers. Use ROM float-math for that!
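
A minimal Python sketch of 16.16 arithmetic may make the trade-off concrete. It assumes a two's complement interpretation of the 32-bit value, which the layout above does not confirm, and all helper names (`to_fixed`, `fx_mul`, `fx_div`, `PI_FX`) are hypothetical.

```python
# 16.16 fixed-point helpers: the value is stored as an integer scaled by 65536.
# Assumption: two's complement signed values; the thread's layout may differ.

FRAC_BITS = 16
ONE = 1 << FRAC_BITS               # 1.0 in 16.16


def to_fixed(x):
    """Convert a Python number to 16.16 (rounded to the nearest step)."""
    return int(round(x * ONE))


def to_float(f):
    """Convert a 16.16 value back to a float for display."""
    return f / ONE


def fx_mul(a, b):
    """Multiply two 16.16 values; the raw product has 32 fraction bits."""
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    """Divide two 16.16 values; pre-shift the dividend to keep 16 fraction bits."""
    return (a << FRAC_BITS) // b


PI_FX = fx_div(to_fixed(22), to_fixed(7))   # the "cheap PI" 22/7

print(hex(PI_FX), to_float(PI_FX))
print(to_float(fx_mul(to_fixed(1.5), to_fixed(2.0))))
print("smallest step:", 1 / ONE)
```

Addition and subtraction are plain integer operations in this representation, which is why the existing 16-bit integer routines carry over. Only multiply and divide need the extra shift.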

Bishop's version is very interesting because the INT part only covers -128 to +127, BUT for any number larger than +127 the fractional part starts to become negligible, i.e. a ~0.78% error margin. Taking advantage of that requires constantly checking the INT value to decide when to use the Bishop float and when to use a normal 16-bit INT. Bishop's fixed-point format was used to generate fast Mandelbrots on the Apple ][.

 
