
Advanced PRG File Format


TomXP411

This is my proposal for an advanced PRG file format. I'd like to place it here for discussion. 

The format consists of a 64-byte file header and an 8-byte header for each block.

File header is

16 FF - the magic value. A regular PRG starts with its two-byte load address, and these two bytes form an illegal address, so the loader knows to handle the file as an Advanced PRG file.

01 00 00 Format version

01 00 00 File version (you can increment this as you save newer versions of your program)

56 bytes of free space. You can use this for internal revision control, but the primary intent is to store a text description of the file/program in here. 

Block header:

00: Block type. 01 - Main Memory, 02 - VERA

01-02: Block length. Two bytes, little-endian

03-04: Unused

05: Bank number (VERA or high memory bank)

06-07: Load address

08 to length+7: Data
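
To make the layout concrete, here is a minimal Python sketch of decoding the two headers. It assumes the block header offsets are zero-based (type at 0, length at 1-2, two unused bytes at 3-4, bank at 5, load address at 6-7), that each version field is three independent bytes, and that the 56-byte free area holds zero-padded ASCII text. The function names are hypothetical and not part of the proposal.

```python
import struct

MAGIC = b"\x16\xff"          # read as a little-endian load address this is $FF16
FILE_HEADER_SIZE = 64
# type (1), length (2), unused (2), bank (1), load address (2) = 8 bytes, little-endian
BLOCK_HEADER = struct.Struct("<BHHBH")


def parse_file_header(data: bytes) -> dict:
    """Decode the 64-byte file header (assumed layout, see the text above)."""
    if len(data) < FILE_HEADER_SIZE or data[:2] != MAGIC:
        raise ValueError("not an Advanced PRG file")
    return {
        "format_version": tuple(data[2:5]),
        "file_version": tuple(data[5:8]),
        "description": data[8:64].rstrip(b"\x00").decode("ascii", errors="replace"),
    }


def parse_block_header(data: bytes, offset: int = FILE_HEADER_SIZE) -> dict:
    """Decode one 8-byte block header starting at `offset`."""
    block_type, length, _unused, bank, address = BLOCK_HEADER.unpack_from(data, offset)
    return {
        "type": block_type,
        "length": length,
        "bank": bank,
        "address": address,
        "data_offset": offset + BLOCK_HEADER.size,  # payload follows the header
    }
```

The next block header, if any, would start at `data_offset + length`.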

A more thorough specification is here: https://docs.google.com/document/d/1q4IhDcdZ12B9EqySmHs5tHblFYIzGlzmPuHl3QkiUL0/edit?usp=sharing

 

I wanted to discuss the format and possibly extend the CX16 ROM and write a C packer for Advanced PRG files. 

 

 


Some questions have come up in another thread:

Quote

Why not use Executable And Linkable Format?

ELF is both larger and more complex than we need for the Commander. APRG is as small as we can make the container format while still carrying the necessary information, such as load address and block length. Each block header uses only 6 bytes: type, length, bank, and load address. I padded it with 2 bytes just to make the header 8 bytes long.

Quote

 Is it meant that the loader runs after the program has started?

If we get this into the KERNAL, the loader would be executed by the LOAD command. It would analyze the first two bytes of the file and decide from there whether to treat this as a single block or switch to APRG mode. 
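
As a rough illustration of that decision, the check on the first two bytes could look like the following Python sketch. The function name and the returned tuples are made up for this example; the real check would live in the KERNAL's LOAD routine.

```python
APRG_MAGIC = b"\x16\xff"  # the magic value from the file header


def classify(first_two: bytes):
    """Decide how LOAD would treat a file, based only on its first two bytes."""
    if len(first_two) != 2:
        raise ValueError("need exactly the first two bytes of the file")
    if first_two == APRG_MAGIC:
        return ("aprg", None)  # switch to block-by-block loading
    # Otherwise the bytes are an ordinary little-endian PRG load address.
    return ("prg", int.from_bytes(first_two, "little"))
```

For example, `classify(b"\x01\x08")` reports a regular PRG with load address $0801.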

Quote

Block type $10 seems to be something where the player and the loader work together.

Blocks other than 1, 2, and 3 are not handled directly by the loader. Instead, your program would re-open the data file and read blocks for graphics, sound, game levels, or other data. Think of this as something like a ZIP file, where all of your program's data is combined into a single file. You can use the Directory block (04) to locate your individual chains in the file. 

Again, in an ideal world, the services to read the directory and seek to individual blocks would be handled by the KERNAL. However, I suspect that's too much to ask, so I'm hoping our reference implementation will include pre-made subroutines that programmers can tack on to their program in either assembly or C. 

Quote

As for the PC side (compiler) instead of writing in many languages you could look into golang, you can write one implementation that can run on all platforms. As for the reference loader I guess the ultimate goal is to get it into the ROM - so assembly is perhaps the only way to go?

I intend to write the packer in C. This will cover every platform used for writing for the Commander - including the Commander itself. 

 


I'm not saying these aren't valid points, but this is unnecessary. Does it have value? For a multitasking operating system, sure. But for a system where each program doesn't have to concern itself with other programs, its benefits are minimal. Why create special file types when your program can handle its own data just fine and can simply put all its relevant files in the same directory? That accomplishes the same goal with much simpler code.

 

Also, technically, parts of GEOS are in ROM, and GEOS can do much of what you are describing. We have done some tests and, while more work needs to be done, in simplified terms we can run GEOS software.

 

 


I think a load format for banked data is very useful and it should be in the ROM.

That way we can just load a file and have it placed in memory without having to reinvent the wheel by providing custom loaders. Does it prevent anyone from rolling their own? Not at all: you are free to load a plain binary and load data and program chunks from the file system as you see fit. For my part, I prefer to have a binary file created by a tool and not have to worry about how it gets loaded into memory, even when I need to use the bank system.

I like the ELF format myself, as it is rather simple to use when the requirements are simple, yet it provides a lot of flexibility and can be quite expressive when needed. Most of the complexity comes from adding debug sections or linkable sections, but that is not what we are doing here. I will describe it a bit below to give an idea of what it is and to show that it actually has a lot in common with the proposed format.

An ELF file has an ELF header that starts with a magic number (the byte $7F followed by the string ELF) that identifies it as ELF. You only need to sanity check that it is 32-bit ELF, inspect the processor type, and read the fields that describe the program headers: where they are in the file, how large each one is, and how many there are.

Each program header describes a chunk of memory to be loaded. The headers appear after each other. A header can describe an area to be cleared, or it can point to actual raw data to be loaded and it can specify a start address.

Thus, in a file you have:

----------------
ELF Header
----------------
Program header 1
----------------
Program header 2
----------------
...
----------------
Program header n
----------------
Load data
----------------

The file can also contain sections, mainly for debug or vendor specific stuff, but you totally ignore them. Even if sections are in the file, you will automatically ignore them when you use the file offsets to locate the program headers and data.

For a banked system you could use a program header for each bank. The address is 32-bit so you can easily combine a bank number with the lower address.

If you compare this to the proposed format you will see that the structure is very similar. The ELF header is 52 bytes (in 32-bit ELF) compared to 64 in the proposed format. Each program header is 32 bytes in ELF and 8 in the proposed format.

The main difference is that in ELF the program headers are stored one after another, and each contains a field that gives the file offset of its actual data.

A loader can interpret relevant fields and do some simple sanity checking, it does not need to be any more complicated. Here is a rough load algorithm:

  1. Open the file, read the ELF header, and check the magic number. Sanity check the header fields to confirm that it is 32-bit ELF for the 6502 and that there is at least one program header. Store the number of program headers and their size, and keep a pointer to the current one, initialized with the offset of the first program header as given in the ELF header.
  2. Seek to the next program header, load it, and inspect its type. Read the start address (if it is a program type), the size, and the file offset of the actual data.
  3. Load the raw data from the file. If the specified area is larger than what is in the file, fill the rest with zeros.
  4. Step to the next program header and decrement the header counter. If more program headers remain, go to 2.
  5. Jump to the start address of the program.
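
The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: it works on a file image that is already fully in memory (so "seeking" is just indexing), it skips the processor-type check, and it returns the segments instead of writing them to memory or jumping. How a bank number would be encoded in the 32-bit address is left open. The function name `load_elf32` is hypothetical.

```python
import struct

# ELF32 little-endian layouts: the file header is 52 bytes, a program header 32 bytes.
EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
PHDR = struct.Struct("<IIIIIIII")
PT_LOAD = 1  # program header type for a loadable segment


def load_elf32(image: bytes):
    """Return (entry_address, [(load_address, data), ...]) for an in-memory ELF image."""
    # Step 1: read and sanity check the ELF header.
    (ident, _type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx) = EHDR.unpack_from(image, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("bad magic number")
    if ident[4] != 1:  # EI_CLASS: 1 means 32-bit
        raise ValueError("not 32-bit ELF")
    # A real loader would also check the processor type (_machine); omitted here.
    if phnum == 0 or phentsize < PHDR.size:
        raise ValueError("no usable program headers")

    segments = []
    for index in range(phnum):  # steps 2 and 4: walk the program headers in order
        (p_type, p_offset, p_vaddr, _paddr, p_filesz, p_memsz,
         _pflags, _align) = PHDR.unpack_from(image, phoff + index * phentsize)
        if p_type != PT_LOAD:
            continue
        # Step 3: copy the raw data, zero-filling up to the in-memory size.
        data = image[p_offset:p_offset + p_filesz]
        data += bytes(max(0, p_memsz - len(data)))
        segments.append((p_vaddr, data))
    # Step 5 (the jump) is left to the caller, who receives the entry address.
    return entry, segments
```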

The main advantage of using an existing format like ELF is that it makes it possible to use other tools for inspecting and altering the executable file. It also avoids re-inventing the wheel again as someone already gave this some thought. We only need to specify how we use the format for the CX16.


I'm undecided as to how much we actually need this (meaning: I am totally unable to judge whether it is a good use of possibly scarce ROM space), but it seems like a good thing to have.
hth313's argument for ELF seems reasonable to me. OTOH, reinventing wheels is undeniably part of the fun in a project like this.

I'm curious to hear more opinions about this.


10 hours ago, hth313 said:

Seek to the next program header

How does one do this though? As far as I am aware there is no concept of seeking in the Commander X16 DOS routines. 

So it will mean you'll have to read all required headers into memory and then manipulate them from there as you cannot seek back in the file to read more data from the beginning.

2 hours ago, Lorin Millsap said:

You can already load banked data.

What do you mean exactly? There are no ROM routines that do this automatically for you, so you have to roll your own code.


I agree that ROM space could be scarce and therefore an advanced loader might not make it. But I find it still interesting, because if it is made available as a third-party tool it might still be useful (but less so than if it is already in ROM, admittedly)

13 minutes ago, desertfish said:

So it will mean you'll have to read all required headers into memory and then manipulate them from there as you cannot seek back in the file to read more data from the beginning.

That's what I am worried about, too. If I understand correctly, with ELF you would have to store information from an unknown number of program headers, whereas with TomXP's suggestion you only need to know about the block you are currently reading. I also like the idea of using an illegal address as the format identifier. To me, his suggestion seems well suited for the X16.
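
To illustrate the point about forward-only reading, here is a small Python sketch that walks an APRG stream using nothing but sequential `read()` calls and keeps only one block header in memory at a time. It assumes the block header layout type (1 byte), length (2), unused (2), bank (1), load address (2), all little-endian; `read_exact` and `iter_blocks` are hypothetical helper names.

```python
import io
import struct

BLOCK_HEADER = struct.Struct("<BHHBH")  # type, length, unused, bank, load address


def read_exact(stream, count: int) -> bytes:
    """Read exactly `count` bytes or fail; never seeks."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("unexpected end of file")
    return data


def iter_blocks(stream):
    """Yield (type, bank, address, payload) for each block, reading strictly forward."""
    file_header = read_exact(stream, 64)
    if file_header[:2] != b"\x16\xff":
        raise ValueError("not an Advanced PRG file")
    while True:
        raw = stream.read(BLOCK_HEADER.size)
        if not raw:          # clean end of file: no more blocks
            return
        if len(raw) != BLOCK_HEADER.size:
            raise EOFError("truncated block header")
        block_type, length, _unused, bank, address = BLOCK_HEADER.unpack(raw)
        yield block_type, bank, address, read_exact(stream, length)


# A tiny in-memory file: empty 64-byte header plus one 2-byte block for $0801.
demo = io.BytesIO(b"\x16\xff" + bytes(62)
                  + BLOCK_HEADER.pack(1, 2, 0, 0, 0x0801) + b"\xea\x60")
blocks = list(iter_blocks(demo))
```

Because each block carries its own length and destination, the loader never needs to revisit earlier parts of the file.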

I think, if the goal is to get the feature into ROM, we better hurry up 😉

Edited by kliepatsch

It does seem useful to me, but it's more important to (hopefully) get some small bugs ironed out in the ROM code that exists now (there are a few pull requests on GitHub already that I think are pretty easy and important to include...)


Quote

It does seem useful to me, but it's more important to (hopefully) get some small bugs ironed out in the ROM code that exists now (there are a few pull requests on GitHub already that I think are pretty easy and important to include...)

So use the GEOS headers.



If seeking is not available then ELF is out. I do not think it is practical to read several program header structures without knowing how many there are.

I still think it is a good idea to be able to load a file into memory and have banks populated automatically. Can someone explain how GEOS helps to load a banked program at power on when I get the READY prompt?


I think a loader like this is a good thing to have. I know I can work around it by splitting each bank into a separate file, but then I need to distribute several files to accomplish the same goal. One file is much cleaner and safer.

The ROM portion of this will not be large, as the ROM loader will only handle the block types it knows how to load. The limit we have today with PRG files is that the whole file has to fit into memory.

The ability to have user code also work with the file is a plus (for loading additional resources). For me it is not that important, but since there does not seem to be any extra cost (complexity or code) for the ROM-based loader, I think this is a good addition. It can make things easier for some people.

I also think that a relocatable file creates too much complexity/overhead and is not needed (at least for most use cases).

My vote is on this project.


I just had an idea: it might be enough to have a very simple file format that does not require us to load the whole file at once. This way you could implement all the loader code in your program and keep the additional ROM usage to a minimum.
Something like: 2 bytes format identifier (illegal address), 1 byte flags (see below), 2 bytes target address, 2 bytes number of bytes to read (size of loader), then the loader code, followed by the other data.
One bit in the flags would determine whether the KERNAL should jump to the start address when it finishes loading (so you wouldn't need to type RUN and could omit the BASIC SYS header).
The loader code in the file could then reopen the file and load the rest of the data as needed. Lacking the ability to seek, it would have to read and discard the loader code before it can get to the rest of the data. I don't know how fast that would be, but I think it might not be that much of a problem. (Or an incentive to keep the loader code compact. ;-)) Another thought: is it possible for the KERNAL to leave the file open, so the loader could just continue reading from that point? This could then be controlled by another bit in the flags.
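
A rough Python sketch of that header, to show how little the ROM would need to parse. Several details are assumptions made for the example: the identifier reuses the 16 FF magic, bit 0 of the flags means "jump to the start address", and bit 1 means "leave the file open". None of these bit assignments come from the post.

```python
import struct

# identifier (2), flags (1), target address (2), loader size (2) = 7 bytes
STUB_HEADER = struct.Struct("<2sBHH")
FLAG_AUTORUN = 0x01    # assumed bit: jump to the target address after loading
FLAG_KEEP_OPEN = 0x02  # assumed bit: leave the file open for the loader


def parse_stub(data: bytes) -> dict:
    """Split a file into its header fields, the loader code, and the rest."""
    magic, flags, address, size = STUB_HEADER.unpack_from(data, 0)
    if magic != b"\x16\xff":
        raise ValueError("unknown format identifier")
    start = STUB_HEADER.size
    loader = data[start:start + size]
    if len(loader) != size:
        raise ValueError("file is shorter than the declared loader size")
    return {
        "address": address,
        "autorun": bool(flags & FLAG_AUTORUN),
        "keep_open": bool(flags & FLAG_KEEP_OPEN),
        "loader": loader,
        "rest_offset": start + size,  # where the program's own data begins
    }
```

Everything from `rest_offset` onward is opaque to the ROM and is read by the loader code itself.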

What do you think?


On 1/1/2021 at 1:43 AM, TomXP411 said:

Block header:

00: Block type. 01 - Main Memory, 02 - VERA

01-02: Block length. Two bytes, little-endian

03-04: Unused

05: Bank number (VERA or high memory bank)

06-07: Load address

I guess the address is little-endian, right?
Therefore, I would rather go for:
05-06: Load Address
07: Bank Number
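
The benefit of that ordering is that the three bytes then read as a single little-endian 24-bit value, with the bank as the high byte. A minimal Python sketch, assuming zero-based offsets 5-6 for the address and 7 for the bank (the function name is hypothetical):

```python
def bank_and_address(header: bytes):
    """Read bytes 5-7 of a block header as one little-endian 24-bit value."""
    combined = int.from_bytes(header[5:8], "little")
    return combined >> 16, combined & 0xFFFF  # (bank, 16-bit address)
```

With bytes `00 A0 03` at offsets 5-7, this yields bank 3 and address $A000.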


On 1/1/2021 at 5:14 AM, Lorin Millsap said:

...  when your program can handle its own data just fine and can just put all its relevant stuff in the same directory....

I think that's precisely what we're trying to avoid here ;oD
The goal is to be able to focus on our code and the value it adds, and not to reinvent the wheel each time just to load some data.

 


On 1/1/2021 at 5:14 AM, Lorin Millsap said:

Also, technically, parts of GEOS are in ROM, and GEOS can do much of what you are describing. We have done some tests and, while more work needs to be done, in simplified terms we can run GEOS software.

If we can legally use GEOS, that's one way.
Also, to be more precise, I'm thinking about using some GEOS functions here without using all of it.
Let's say I'm writing a game: I want my own game system and to be able to use some system functions, such as loading data. Nothing more.


On 1/1/2021 at 1:43 AM, TomXP411 said:

$80 Custom loaders will be implemented via a jump table. The jump table will consist of 3 bytes per entry, as follows:
$80 21 42 If block format is $80, jump to address $4221
$93 96 42
$00 Null terminates the jump table.

Why a 16-bit address and not 24 bits (bank + address)?
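
For reference, the quoted jump table (3 bytes per entry, terminated by $00) could be decoded as in this Python sketch. It takes the entry layout exactly as quoted, block type followed by a little-endian 16-bit address; the function name is hypothetical.

```python
def parse_jump_table(table: bytes) -> dict:
    """Map each block type to its handler address; stops at the $00 terminator."""
    entries = {}
    pos = 0
    while pos < len(table) and table[pos] != 0x00:
        if pos + 3 > len(table):
            raise ValueError("truncated jump table entry")
        # One entry: block type, then address low byte, then address high byte.
        entries[table[pos]] = int.from_bytes(table[pos + 1:pos + 3], "little")
        pos += 3
    if pos >= len(table):
        raise ValueError("jump table is missing its $00 terminator")
    return entries
```

Applied to the bytes `80 21 42 93 96 42 00`, it maps block type $80 to $4221 and $93 to $4296.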

7 hours ago, Lorin Millsap said:

And BLOAD doesn’t work because?



BLOAD doesn't allow you to load to non-contiguous memory. For example, load a 20K core and 32K of data into banked RAM. Right now, you'd have to save your data as a separate file and load each segment from your program. The APRG format allows for different parts of the program to be loaded to different places in memory, all in one file. This makes software distribution super clean and neat: you give the customer a single APRG file, with the code and data all in a single file. 
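
As a sketch of what a packer might do for exactly that case (written in Python here for brevity, although the plan is a C tool), the following builds one file from a 20K core and 32K of banked data. It assumes the zero-based block header layout (type, length, two unused bytes, bank, load address), that banked RAM blocks use type 01 with the bank number filled in, and that each 8 KB bank is loaded through the $A000 window. The function names and the sample load address $0801 are illustrative only.

```python
import struct

BANK_WINDOW = 0xA000   # X16 banked RAM is visible at $A000-$BFFF
BANK_SIZE = 0x2000     # 8 KB per RAM bank
MAIN_MEMORY = 0x01     # block type from the proposal


def split_into_banks(data: bytes, first_bank: int):
    """Cut `data` into 8 KB pieces, one block per consecutive RAM bank."""
    return [(MAIN_MEMORY, first_bank + i, BANK_WINDOW, data[off:off + BANK_SIZE])
            for i, off in enumerate(range(0, len(data), BANK_SIZE))]


def pack_aprg(blocks, description: str = "") -> bytes:
    """Write the 64-byte file header followed by each block header and payload."""
    text = description.encode("ascii")[:56].ljust(56, b"\x00")
    out = bytearray(b"\x16\xff" + bytes([1, 0, 0]) + bytes([1, 0, 0]) + text)
    for block_type, bank, address, payload in blocks:
        if len(payload) > 0xFFFF:
            raise ValueError("block length must fit in two bytes")
        out += struct.pack("<BHHBH", block_type, len(payload), 0, bank, address)
        out += payload
    return bytes(out)


core = bytes(20 * 1024)     # placeholder for 20K of program code
assets = bytes(32 * 1024)   # placeholder for 32K of data for banked RAM
blocks = [(MAIN_MEMORY, 0, 0x0801, core)] + split_into_banks(assets, first_bank=1)
image = pack_aprg(blocks, "demo game")
```

The result is a single file with five blocks: one for low memory and one for each of banks 1 through 4.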

This doesn't seem like a big thing, but looking at all the different ways Commodore 64 software is bundled, I thought having a single container format from the beginning would make more sense and allow for better compatibility across different programs. 

I also was hoping this would be a way to coalesce a common set of standards, like SID and Koala were for the Commodore 64, or GIF and JPG are on the PC. By using a common image format with standardized metadata, we can create a common sprite, tile, and bitmap format that everyone can use in their games. Other things also benefit: music sequences (aka "SID" files) are another prime example. If everyone writes to a common audio format, then game makers can just use the pre-made loader, rather than having to roll their own.

This frees up developers to work on their games, rather than spending all their time building tooling. 

So the whole point is really sharing code: we write code once, and then everyone coding games and applications gets to take advantage of that.

 

 

Edited by TomXP411
7 hours ago, kktos said:

Why a 16-bit address and not 24 bits (bank + address)?

Code in one bank can't directly access data in another bank. 

If the code is in the same bank as the data, then you don't need to worry about the bank register, because it's already set by the loader. And if the code is in low memory, which is unbanked, you don't need a bank number. 

It doesn't matter, though - with the addition of the Directory block, custom loaders are moot, so I took that out. 

Now, if you need to load data at runtime, such as overlays, maps, or assets, you would consult the directory and load the target block on-demand. It's a cleaner approach from the application perspective, even if it adds a little complexity to the loader itself. 

Edited by TomXP411