
In CA65, what does .SEGMENT do exactly?


C3c

Question

Hi Everyone!

Like many other people around here, I'm using the CA65 assembler to develop for the X16. One thing I don't really understand is what the .SEGMENT command does exactly. I find the CA65 documentation a bit terse on this matter:

Quote

Switch to another segment. Code and data is always emitted into a segment, that is, a named section of data. The default segment is "CODE".

Could anyone tell me exactly what happens if I use .SEGMENT "DATA" inside my code?

Thanks!

Cedric


4 answers to this question

In reply to C3c's question of 11/6/2021, 10:01 AM:

A program is a sequence of bytes and you determine how they are used. You can interleave code and data however you want.

A segment is a way to organize the program more effectively. I'm not a ca65 expert, so this isn't perfect, but imagine you have something like this (useless) example:

.data
bytes:  .byte 1, 2, 3           ; three data bytes
.code
        lda bytes+0             ; absolute addressing: 3 bytes each
        ldx bytes+1
        ldy bytes+2
.data
msg:    .byte "this is a test"  ; 14 more data bytes
.code
        nop                     ; 1 byte each
        nop
        nop

The first line says "stuff that comes after goes into the data segment". The data segment is just a sequence of bytes.

The third line says "stuff that comes after goes into the code segment". The code segment is also just a sequence of bytes.

The seventh line switches back to the data segment. Anything that comes after that is appended to whatever is already in the data segment.

The ninth line switches back to the code segment. Anything that comes after that is appended to whatever is already in the code segment.

So after assembling that file, you have two segments, CODE and DATA.

The data segment will have 17 bytes: $01 $02 $03 and 14 bytes representing the text "this is a test".

The code segment will have 12 bytes: 3 bytes for the LDA instruction, 3 for LDX, 3 for LDY, and 3 for the three one-byte NOP instructions.
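To tie this back to the original question: in ca65, .data and .code are shorthands for .segment "DATA" and .segment "CODE". As a minimal sketch, the same example can be written with explicit .segment directives and should produce the same two segments:

```asm
.segment "DATA"                 ; same effect as .data
bytes:  .byte 1, 2, 3           ; DATA offsets 0..2

.segment "CODE"                 ; same effect as .code
        lda bytes+0
        ldx bytes+1
        ldy bytes+2

.segment "DATA"                 ; continue DATA where it left off
msg:    .byte "this is a test"  ; DATA offsets 3..16

.segment "CODE"                 ; continue CODE after the LDY
        nop
        nop
        nop
```

So .SEGMENT "DATA" by itself emits nothing. It only changes which segment the following bytes are appended to.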

You can have multiple source files that all use the same segments. When the linker creates your final program image, it will put all the stuff in CODE together, and all the stuff in DATA together, and the linker configuration will determine which segments come first and which come last (or in the middle if you are using even more segments).
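As a rough sketch of that, here are two separate source files that both contribute to CODE. The file names and the label get_first are made up for illustration. Each file would be assembled on its own and the two object files linked together:

```asm
; ---- main.s (hypothetical file) ----
.import get_first               ; defined in text.s, resolved by the linker

.code
start:  jsr get_first           ; A = first byte of msg
        rts

; ---- text.s (hypothetical file) ----
.export get_first               ; make the label visible to other files

.code
get_first:
        lda msg
        rts

.data
msg:    .byte "hi"
```

After linking, the CODE bytes from both files sit next to each other, and the DATA bytes from text.s end up wherever the linker configuration places DATA.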

In the end, segments allow you to write your code in a way that is organized for humans to read, while the linker arranges it in a way that suits the computer. This is especially useful when you are writing code that will eventually be burned into a ROM. In a ROM you can't intermingle code with data that has to be writable, so in that case you can have CODE that is read-only and suitable for a ROM, and DATA that is writable and sits at some RAM address.
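A small sketch of that split, using the RODATA and BSS segments that the stock cc65 configurations usually provide. The labels greeting and counter are made up for illustration:

```asm
.rodata                         ; shorthand for .segment "RODATA"
greeting: .byte "hello", 0      ; constant, never written: can live in ROM

.bss                            ; shorthand for .segment "BSS"
counter:  .res 1                ; reserve 1 uninitialized byte, meant for RAM

.code
        inc counter             ; this write needs RAM
        lda greeting            ; this read can come from ROM
        rts
```

The source keeps the constant, the variable, and the code that uses them side by side, and the linker configuration decides which addresses each segment actually gets.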

You can do all of this without segment support in the assembler, but segments help you do less bookkeeping.

I hope that helps.


If you create a file ca65test.asm that contains the code from above (here with the label bytes renamed to values), assemble it, and generate a listing, it shows:

Quote

000000r 1               .data
000000r 1  01 02 03     values: .byte 1,2,3
000003r 1               .code
000000r 1  AD rr rr     lda values+0
000003r 1  AE rr rr     ldx values+1
000006r 1  AC rr rr     ldy values+2
000009r 1               .data
000003r 1  74 68 69 73  msg: .byte "this is a test"
000007r 1  20 69 73 20
00000Br 1  61 20 74 65
000011r 1               .code
000009r 1  EA           nop
00000Ar 1  EA           nop
00000Br 1  EA           nop

Note that the rr bytes are placeholders: their final values are not known until the linker turns the object file into a program. The first column is the offset into the current segment at the start of each line. On line 1 the offset is 000000, and since the ".data" directive doesn't generate any bytes, it is still 000000 on line 2. Line 2 emits three bytes, so on line 3 the offset is 000003. There ".code" changes the segment, so on line 4 the offset is back to 000000 because it now counts within a different segment.

The listing shows only the first 12 bytes of msg because by default ca65 lists at most 12 bytes per source line (see .LISTBYTES). All 14 bytes are still emitted, which is why the data offset has advanced from 000003 to 000011 at the following ".code" line.

The linker will take that object file (and other information) and create a binary file that could look something like this:

Quote

AD rr rr     ;lda values+0
AE rr rr     ;ldx values+1
AC rr rr     ;ldy values+2
EA           ;nop
EA           ;nop
EA           ;nop
01 02 03     ;values: .byte 1,2,3
74 68 69 73  ;msg: .byte "this is a test"
20 69 73 20
61 20 74 65
73 74

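The linker reorganized the segments into a "good order" for the machine: all the code is together, so execution can start at the beginning, and all the data is together after the code. In a real binary the rr placeholders would also be replaced by the final addresses of values+0, values+1, and values+2.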

Again, this is just an example for illustrative purposes. 


It's also nice to know how you define the segments used by the .SEGMENT statement.

This is done in config files. There is also a manual on writing these config files: https://www.cc65.org/doc/ld65-5.html

The default config file for X16 assembly programming is cx16-asm.cfg. As the extract below shows, the default segment for program code ("CODE") is loaded into the memory area "MAIN", which the MEMORY section defines to start at address %S. %S is shorthand for the start address set in the FEATURES section at the beginning of the file ($0801 by default). At first, fiddling with all these settings and config files may seem too complicated, but it really makes programs more manageable, and it makes it especially easy to change the memory layout of a program if need be.

# This is an extract from cx16-asm.cfg; some non-memory-management settings are excluded.

FEATURES {
    STARTADDRESS: default = $0801;
}
SYMBOLS {
    __LOADADDR__: type = import;
    __HIMEM__:    type = weak, value = $9F00;
}
MEMORY {
    ZP:       file = "", start = $0022,  size = $0080 - $0022, define = yes;
    LOADADDR: file = %O, start = %S - 2, size = $0002;
    MAIN:     file = %O, start = %S,     size = __HIMEM__ - %S;
}
SEGMENTS {
    ZEROPAGE: load = ZP,       type = zp;
    EXTZP:    load = ZP,       type = zp, optional = yes;
    LOADADDR: load = LOADADDR, type = ro;
    EXEHDR:   load = MAIN,     type = ro, optional = yes;
    LOWCODE:  load = MAIN,     type = ro, optional = yes;
    CODE:     load = MAIN,     type = ro;
    RODATA:   load = MAIN,     type = ro;
    DATA:     load = MAIN,     type = rw;
    BSS:      load = MAIN,     type = bss, define = yes;
}
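
The names in the SEGMENTS section are the ones you can pass to .SEGMENT. As a sketch of how a segment of your own could be added (TABLES is a made-up name, and this assumes you link with your own modified copy of the config, which ld65 accepts via -C):

```asm
; Assumes a copy of cx16-asm.cfg with this line added to SEGMENTS
; (TABLES is a hypothetical segment name used only for illustration):
;
;     TABLES: load = MAIN, type = ro, optional = yes;

.segment "TABLES"
squares: .byte 0, 1, 4, 9, 16   ; lookup table, placed in MAIN with the other segments

.code
        ldx #3
        lda squares,x           ; A = 9
        rts
```

Segments that share a memory area are laid out in the order they appear in the SEGMENTS section, so moving the TABLES line changes where the table ends up without touching the source.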
