
How to implement 16-bit operations in assembly language


Johan Kårlin

When I started programming in assembly language for the X16, I began by implementing all 16-bit operations by hand. It is good for learning, but after a while it gets tiresome, so I started to write macros to make things easier. Since then I have been thinking about how this is best done. (I use ACME syntax in the following.)

Alternative 1 - what I use today
A library of macros like these (I leave out the implementation):

+Add16A .addr1, .addr2        ;OUT: result in addr1 (A stands for absolute addressing mode)
+Add16I .addr, value          ; OUT: result in addr (I stands for immediate addressing mode)
+Inc16 .addr 
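
The post leaves the implementations out. As a rough sketch only, and not the author's actual code, the three macros could look something like this in ACME syntax. It assumes little-endian 16-bit values (low byte at the given address, high byte at the next one) and that it is acceptable to clobber the accumulator and flags; the local label .done is a name made up for this example.

```asm
; Sketch only: assumes low byte at .addr, high byte at .addr+1.
; All three macros clobber A and the flags.

!macro Add16A .addr1, .addr2 {  ; .addr1 = .addr1 + .addr2
        clc                     ; no carry into the low byte
        lda .addr1
        adc .addr2
        sta .addr1
        lda .addr1+1
        adc .addr2+1            ; carry from the low byte is added here
        sta .addr1+1
}

!macro Add16I .addr, .value {   ; .addr = .addr + constant
        clc
        lda .addr
        adc #<.value            ; low byte of the constant
        sta .addr
        lda .addr+1
        adc #>.value            ; high byte of the constant plus carry
        sta .addr+1
}

!macro Inc16 .addr {            ; .addr = .addr + 1
        inc .addr
        bne .done               ; low byte did not wrap to 0: finished
        inc .addr+1             ; low byte wrapped: bump the high byte
.done
}
```

The increment avoids the accumulator entirely, which is why it is usually worth having as its own macro instead of calling the immediate add with 1.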
                        
Alternative 2 - three virtual 16-bit registers
A library of macros that implement three (or more) virtual registers located in the zeropage.
I think this is an elegant solution. There is a direct equivalence between 8-bit and 16-bit operations. First I define this:

a = $70    ;zeropage location for virtual 16-bit register
x = $72
y = $74

!macro ldai .val {          ;(I stands for immediate)
        lda #<.val
        sta a
        lda #>.val
        sta a+1
}

!macro sbci .val { 
        lda a
        sec
        sbc #<.val
        sta a
        lda a+1
        sbc #>.val
        sta a+1  
}

(...lots of more macros ...)

Then I can code like I do for 8-bit operations; I just replace the ordinary instructions with the corresponding macros, which have almost the same names. For example, to do a 16-bit subtraction:

        +ldai 1200
        +sbci 650
        +sta .myvariable

.myvariable  !word 0

Of course, the drawback is that the generated code is inefficient: the macros for loading and storing registers add a lot of instructions.
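
To make that overhead concrete, here is a hedged sketch in the same ACME syntax. The store macro is not shown in the post, so the version below is a guess at its shape; it is named sta16 here purely to avoid any clash with the sta mnemonic (the post calls it +sta). The second half is the same subtraction written by hand, going straight to the variable.

```asm
; Assumed shape of the store macro (hypothetical, not from the post):
; copy virtual register a to a 16-bit variable.
!macro sta16 .addr {
        lda a
        sta .addr
        lda a+1
        sta .addr+1
}

; Macro version: +ldai (4 instructions) + +sbci (7) + +sta16 (4)
; expands to 15 instructions.
        +ldai 1200
        +sbci 650
        +sta16 .myvariable

; Hand-written equivalent: 7 instructions, no trip through $70/$71.
        sec
        lda #<1200
        sbc #<650
        sta .myvariable
        lda #>1200
        sbc #>650               ; borrow from the low byte is applied here
        sta .myvariable+1

.myvariable  !word 0
```

So under these assumptions the virtual-register style roughly doubles the instruction count for a single operation. The gap narrows when several operations are chained on the same register, since the load and store are then paid only once.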

Alternative 3 - your solution
What is your way of doing this? Have I missed something that all experienced assembly language programmers know? 
I have heard about Steve Wozniak's Sweet16, but as I understand it that is something else: a tiny interpreted language that executes code very slowly.


My personal solution so far has been to write progressively smarter macros. For instance, see the add/sub macros in the math header in my current "toy" branch: math.inc.

Never mind the "interleaved array" and "multi-byte array" stuff at the bottom. That has never been tested, and while I haven't entirely abandoned the ideas, they've been sidelined because I haven't made regular use of the memory layouts they're intended to optimize. Also, I haven't gone back and rewritten the multiplication macros to be smarter yet, either.

I'm writing for cc65/ca65, so I have the benefit of some pseudo-ops that allow me to detect whether the macro has been passed an argument that is intended to be an immediate value (in ca65, these are prefixed with "#"). These allow me to encapsulate the logic without creating as many distinct macros with subtle, coded name differences.
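
math.inc itself is not reproduced in this thread, so the following is only a sketch of the general technique and not that file's code. It is modeled on the immediate-detection idiom from the ca65 documentation (.match, .left, .right and .tcount); the macro name add16 and the symbols counter and step are made up for the example.

```asm
; Sketch: one ca65 macro that accepts either "#constant" or an address.
; dest = dest + arg, 16 bits, low byte first. Clobbers A and flags.
.macro  add16   dest, arg
        clc
        lda     dest
    .if (.match (.left (1, {arg}), #))
        ; first token is "#": immediate operand, strip it and split the value
        adc     #<(.right (.tcount ({arg})-1, {arg}))
        sta     dest
        lda     1+(dest)
        adc     #>(.right (.tcount ({arg})-1, {arg}))
    .else
        ; otherwise assume an absolute or zeropage address
        adc     arg
        sta     dest
        lda     1+(dest)
        adc     1+(arg)
    .endif
        sta     1+(dest)
.endmacro

; Usage (counter and step are hypothetical 16-bit variables):
        add16   counter, #1000
        add16   counter, step
```

The high-byte addresses are written as 1+(dest) so that the operand does not start with a parenthesis, which the assembler could otherwise read as indirect addressing.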


There isn't an answer.  You can improve things,  but either you concede space or you concede CPU cycles. Sweet16 is a simple 16 bit VM that has a 90% hit on CPU speed. I'd probably go for space with a reasonably intelligent cross page call and stick it all in A000-BFFF, and use main space for data and runtime. Loading doesn't work yet though.


in Prog8 it's a bit of a hodge podge, sometimes I use inline operations to work on 16 bits if they're trivial, sometimes I simply JSR into a library subroutine...  In the end it seems to work but the resulting assembly code is not very consistent.   That's not really a problem because usually you're not supposed to read it 🙂      

I think what I meant is the same what Paul said above; there is not a single best solution ?


Thanks for your answers! 

@StephenHorn I took a look at math.inc. It is clearly a better way of doing it compared to what I am doing now. I guess I have reached the limit for what ACME can do. Both for this and other macro implementations it would really be a relief to not have to write multiple macros just to cope with the fact that parameters can be passed by value or reference. After my current project I will probably start developing with ca65 instead. By the way - why not finish math.inc and upload it as a development tool : )?


  • 2 weeks later...
On 12/8/2020 at 8:18 AM, Johan Kårlin said:

Alternative 3 - your solution
What is your way of doing this? Have I missed something that all experienced assembly language programmers know? 

I don't know how common this is, but I've been doing pretty much all my arithmetic on a stack in zeropage using zp,x addressing. I've written a bunch of macros to do 16-bit arithmetic, mostly operating only on the top of the stack. Here's a sample (written for 64tass):

Quote

; add two 16 bit numbers from the stack
; ( A B -- A+B )
d_adc16: .macro

    lda 2,x
    adc 0,x
    sta 2,x

    lda 3,x
    adc 1,x
    sta 3,x

    inx
    inx
.endm

; shift 16-bit number on the stack to the left
; ( X -- X<<1 )
d_asl16: .macro
    asl 0,x
    rol 1,x
.endm

; push the accumulator onto the stack
; ( -- a )
d_push: .macro
    dex
    sta 0,x
.endm

; push 16-bit constant onto the stack
d_push16c: .macro
    lda #>\1
    d_push
    lda #<\1
    d_push
.endm

As long as you're careful, you can freely mix 8 and 16 bit values on the stack. Unlike using some fixed registers, you never have to worry about whether calling a subroutine will clobber registers. On the other hand though, you have to save and restore the x register anytime you need to use it for something other than a stack pointer.

If you're comfortable with RPN and want to do all  your arithmetic on a stack, it's a really good solution. It also works really well for passing parameters into and out of subroutines.
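
As an illustration of how the same pattern extends, here is a subtraction counterpart written in the style of the macros above. It is a sketch, not part of the original library, and the name d_sub16 is made up. It assumes the same layout as d_adc16: the top entry's low byte at 0,x and high byte at 1,x, with the entry below it at 2,x and 3,x.

```asm
; subtract the top 16-bit number from the one below it
; ( A B -- A-B )
; hypothetical companion to d_adc16, same stack layout assumed
d_sub16: .macro
    sec             ; start with no borrow
    lda 2,x         ; low byte of A
    sbc 0,x         ; minus low byte of B
    sta 2,x
    lda 3,x         ; high byte of A
    sbc 1,x         ; minus high byte of B and any borrow
    sta 3,x
    inx             ; drop B, leaving the result on top
    inx
.endm
```

Unlike this sketch, d_adc16 as posted does not set up the carry itself, so the caller presumably does a clc first unless a carry-in is wanted. For the point about saving X: on the 65C02 used by the X16, phx and plx can push and pull the stack pointer around any code that needs X for something else.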


On 12/8/2020 at 3:27 PM, desertfish said:

in Prog8 it's a bit of a hodge podge, sometimes I use inline operations to work on 16 bits if they're trivial, sometimes I simply JSR into a library subroutine...  In the end it seems to work but the resulting assembly code is not very consistent.   That's not really a problem because usually you're not supposed to read it 🙂      

I think what I meant is the same what Paul said above; there is not a single best solution ?

That's what my compilers tend to do. Add/Sub are done in code. Div/Mul/Mod are obviously going to be done as a routine on a 65C02, as they take shedloads of cycles. Or/And/Xor and shifts, well, it depends 🙂 You can make a case for both. The problem is the same one: this computer has the wrong processor.


@lamb-duh That is an interesting solution. RPN brings memories to life: in high school, the coolest of my classmates had Texas Instruments calculators using this. I have to dive deeper into this. Interesting that you mention passing of parameters too. I once tried using inline parameters and then accessing them from the subroutine through the ordinary stack. Clever, but it adds extra code. But if you use your own stack, of course this is suddenly a lot easier.

@paulscottrobson Well, right or wrong processor, I don't know. The only real experience of assembly language I have before this is from the Amiga and I can admit that I sometimes miss the 68000. I guess the discussion of how limited a retrocomputer should be will continue forever : ).


23 hours ago, Johan Kårlin said:

 

@paulscottrobson Well, right or wrong processor, I don't know. The only real experience of assembly language I have before this is from the Amiga and I can admit that I sometimes miss the 68000. I guess the discussion of how limited a retrocomputer should be will continue forever : ).

It's the balance. I've programmed for the RCA Studio 2 (mad) and the Microvision (madder) but everything was at the same level, you had really weak hardware and a weak CPU. Like an Atari VCS the first thing you ask is "is this actually possible ?". The processor and hardware are way too out of balance here.


  • 2 weeks later...
On 12/8/2020 at 1:22 PM, Johan Kårlin said:

Thanks for your answers! 

@StephenHorn I took a look at math.inc. It is clearly a better way of doing it compared to what I am doing now. I guess I have reached the limit for what ACME can do. Both for this and other macro implementations it would really be a relief to not have to write multiple macros just to cope with the fact that parameters can be passed by value or reference. After my current project I will probably start developing with ca65 instead. By the way - why not finish math.inc and upload it as a development tool : )?

Bump.

There is a drawback, though, to trying to use smart macros. So I have tried to go ahead and expand on the smart macros for my math.inc, and things mostly worked... mostly. I'm having some issues with certain cases being mis-identified, resulting in bugs. For instance, an easy way to foil the smart macros seems to be anything in the form of this:

Quote

.repeat 3, i
   MY_MACRO #(arg+i)
.endrep

This should be an immediate (prefixed with "#"), but the assembler macro logic to detect that fails.

I'm not sure if this is a bug with CA65, but more generally, you can force syntax errors with statements like this:

Quote

.out .string(#1)

...which I would expect would cause the assembler to output "#1".

So. Smart macros. A clever idea, but if there are bugs like this then maybe I'll end up leaving them be while *also* implementing dumber, more explicit versions.

(This is the second assembler with which I've managed to hoist myself on my own petard, trying to make a smart macro library. At least this time I'm able to diagnose where the problem lies.)


On 12/8/2020 at 8:22 PM, Johan Kårlin said:

I guess I have reached the limit for what ACME can do. Both for this and other macro implementations it would really be a relief to not have to write multiple macros just to cope with the fact that parameters can be passed by value or reference.

What I have done in the past is add an optional parameter to my ACME macros to indicate whether the value is an absolute address or an immediate, like this.

Quote

!macro LOADVAL .val {

lda .val  ; load absolute value into .A register

}

!macro LOADVAL .val, .immed {

lda #.val  ; load immediate value into .A register

}

+LOADVAL $10  ; Load value stored at address $0010 into .A register

+LOADVAL $10,1 ; Load the value #$10 into the .A register

You still have to write separate macros to handle the different scenarios, but since a macro with an optional parameter can use the macro without the optional parameter, you can just "add on" the code needed to handle the optional parameters.

For a few more examples of macros with optional parameters, have a look at the VERA_SET_ADDR macro in my VERA include file:

https://github.com/JimmyDansbo/cx16stuff/blob/master/vera0.9.inc

Another way of doing it could be:

Quote

!macro LOADVAL .val, .immed {

!if .immed = 0 {

    lda .val  ; Load absolute value into .A register

} else {

    lda #.val  ; Load immediate value into .A register

}

}

+LOADVAL $10, 0  ; Load value stored at address $0010 into .A register

+LOADVAL $10, 1  ; Load the value #$10 into the .A register
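
Tying this back to the original question, the same flag idea could be applied to a 16-bit add so that one macro covers both the absolute and the immediate case. This is only a sketch in ACME syntax; the macro name Add16 and the labels .counter and .step are hypothetical, and it assumes low byte first and that clobbering A and the flags is fine.

```asm
; Sketch: .addr = .addr + .val, where .immed selects how .val is read.
;   .immed = 0 : .val is the address of a 16-bit variable
;   .immed = 1 : .val is a 16-bit constant
!macro Add16 .addr, .val, .immed {
        clc
        lda .addr
!if .immed = 0 {
        adc .val                ; low byte from memory
} else {
        adc #<.val              ; low byte of the constant
}
        sta .addr
        lda .addr+1
!if .immed = 0 {
        adc .val+1              ; high byte from memory, plus carry
} else {
        adc #>.val              ; high byte of the constant, plus carry
}
        sta .addr+1
}

        +Add16 .counter, .step, 0   ; .counter = .counter + contents of .step
        +Add16 .counter, 1000, 1    ; .counter = .counter + 1000

.counter  !word 0
.step     !word 5
```

The !if is evaluated at assembly time, so each call expands to exactly one of the two variants and the flag costs nothing at runtime.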

