desertfish, posted February 26, 2021:
File based assembler
File-based assembler. Requires r39 or newer.
Source code and a list of features are here: https://github.com/irmen/cx16assem
The instructions are fairly self-explanatory; a simple manual will come later.
[Video: assem-2021-12-14_01.26.48.mp4]
Submitter: desertfish. Submitted: 02/26/21. Category: Productivity Apps.
Stefan, posted February 27, 2021:
Nice progress @desertfish! It mostly worked as expected when I made my own simple "hello world". I also put the string at the end of the code, as in your own hello world test.
I tried to load each character in the loop with lda message,x and lda message,y, but that failed. Apparently the assembler did not recognize a label defined later in the code. It worked fine when I changed "message" to a fixed hexadecimal address.
If symbols with 32 characters would become too heavy, you could always do what's been done in other languages, for instance Forth:
- Store only the first few characters of a symbol in the symbol table.
- Also store some other metadata in the symbol table, for example the length and/or a checksum, thereby minimizing false duplicates.
- This could save space and speed up assembly.
desertfish, posted February 27, 2021:
Hi, thanks for trying it out. Can you post the failing program? It should be able to deal with symbols that are not yet defined at the point where they are used.
Stefan, posted February 27, 2021:
* = $8000
CHROUT = $FFD2
      LDY #0
LOOP: LDA MSG,Y
      BEQ EXIT
      JSR CHROUT
      INY
      BRA LOOP
EXIT: RTS
MSG:  .STR "HELLO, WORLD"
      .BYTE 0
desertfish, posted February 28, 2021:
I've fixed the issues in a new upload. It turned out the assembler wasn't correctly handling absolute-indexed addressing with symbols.
About the symbol table: how does Forth deal with collisions? Prefix + length is way too simple ("name1" and "name2" will be the same entry), and adding a "checksum" will only work to a certain extent, if you mean "hash", I suppose. There is still no guarantee that we don't have collisions.
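For readers wondering how a label such as MSG can be used before it is defined, one common approach is two passes: the first pass only measures instruction sizes and records label addresses, the second resolves the references. The Python sketch below is a simplified illustration of that idea applied to the hello world listing above, not a description of how cx16assem is implemented. The tuple layout, the function names, and the hand-entered instruction sizes are all assumptions made for the example.

```python
# Each item: (label or None, size in bytes, referenced symbol or None).
# Sizes are entered by hand here; a real assembler derives them from the
# mnemonic and addressing mode. The size of the MSG data is an assumption.
PROGRAM = [
    (None,   2,  None),      # LDY #0
    ("LOOP", 3,  "MSG"),     # LDA MSG,Y  (absolute,Y)
    (None,   2,  "EXIT"),    # BEQ EXIT
    (None,   3,  "CHROUT"),  # JSR CHROUT
    (None,   1,  None),      # INY
    (None,   2,  "LOOP"),    # BRA LOOP
    ("EXIT", 1,  None),      # RTS
    ("MSG",  13, None),      # string data plus terminating zero
]


def collect_labels(program, origin, predefined):
    """Pass 1: walk the program counter forward and record label addresses."""
    symbols = dict(predefined)
    pc = origin
    for label, size, _ in program:
        if label is not None:
            symbols[label] = pc
        pc += size
    return symbols


def resolve_references(program, symbols):
    """Pass 2: every reference can now be looked up, forward or backward."""
    return [(ref, symbols[ref]) for _, _, ref in program if ref is not None]


symbols = collect_labels(PROGRAM, 0x8000, {"CHROUT": 0xFFD2})
for name, address in resolve_references(PROGRAM, symbols):
    print(f"{name} = ${address:04X}")
```

One known wrinkle with this scheme on the 6502 is that pass 1 has to pick an instruction size before it knows whether a forward-referenced symbol lies in zero page, so the size chosen in pass 1 must stay consistent in pass 2.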
Stefan, posted February 28, 2021:
According to the book Starting Forth, the Forth-79 standard allowed symbol names of up to 31 characters. But some variants of Forth only stored three characters + the length of the symbol. "name1" and "name2" would then be the same, which is not ideal.
I don't suggest that you copy that approach as is. It's more an inspiration. There's always the risk of collisions. Checksums/hashes are probably better than symbol length. Only storing the first three characters is probably too little.
desertfish, posted February 28, 2021:
Well, a correct hash table implementation will deal with collisions (collision lists or another solution). Not dealing with possible collisions will result in extremely hard to track down bugs in your resulting machine code. Without any warning, certain symbols will suddenly pick up the value of others... I can't imagine that's acceptable in Forth either. I assume it uses a trick to deal with this as well.
BruceMcF, posted February 28, 2021:
6 hours ago, Stefan said:
> According to the book Starting Forth, the Forth-79 standard allowed symbol names of up to 31 characters. But some variants of Forth only stored three characters + the length of the symbol. "name1" and "name2" would then be the same, which is not ideal. I don't suggest that you copy that approach as is. It's more an inspiration. There's always the risk for collisions. Checksums/hashes are probably better than symbol length. Only storing the first three characters is probably too little.
In the Forths that stored 3 characters (e.g., the original FIG Forths), collisions were dealt with by letting the first definition be the one stored in the dictionary: "good grief, keep track of what you are doing, you idjit". That is similar to CBM BASIC variable names, except that names of different lengths with the same first three letters were also distinct. In ANS Forth and its successors, some implementations use hashing to speed up dictionary searches, but the entire name is stored.
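To make the three-characters-plus-length scheme concrete, here is a minimal Python sketch. The function name fig_key and the default of three stored characters are illustrative assumptions and are not taken from any actual Forth source.

```python
def fig_key(name, stored_chars=3):
    """Return the only data kept per name: its length and first few characters."""
    return (len(name), name[:stored_chars])


# Same length and same first three characters: the two names are indistinguishable.
print(fig_key("name1") == fig_key("name2"))

# Same first three characters but a different length: still told apart.
print(fig_key("name") == fig_key("name1"))
```

The first comparison is true and the second is false, which matches the behaviour described in the posts above: "name1" and "name2" collapse into one entry, while "name" and "name1" stay distinct.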
Stefan, posted February 28, 2021:
5 hours ago, desertfish said:
> Well, a correct hashtable implementation will deal with collisions (colision lists or other solution) Not dealing with possible collisions will result in extremely hard to track down bugs in your resulting machine code. Without any warning certain symbols suddenly will pick up the value of others... I can't imagine that's acceptable in forth either. I assume it uses a trick to deal with this as well
Unless you plan to allow redefined symbols, the assembler could throw an error when it encounters a duplicate definition, whether an actual duplicate or a false match.
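A rough Python sketch of that idea follows: the table keeps only a compact key per symbol and refuses any definition whose key is already present. The six-character prefix, the 8-bit additive checksum, and all names (compact_key, SymbolTable, DuplicateSymbol) are assumptions chosen for the example, not anything from cx16assem.

```python
class DuplicateSymbol(Exception):
    """Raised when a definition matches an existing entry, truly or falsely."""


def compact_key(name, prefix_len=6):
    """Reduce a name to (prefix, length, 8-bit additive checksum)."""
    checksum = 0
    for byte in name.encode("ascii"):
        checksum = (checksum + byte) & 0xFF
    return (name[:prefix_len], len(name), checksum)


class SymbolTable:
    def __init__(self):
        self._values = {}

    def define(self, name, value):
        key = compact_key(name)
        if key in self._values:
            # The full name is not stored, so a real duplicate and a false
            # match cannot be told apart; both are rejected.
            raise DuplicateSymbol(name)
        self._values[key] = value

    def lookup(self, name):
        return self._values[compact_key(name)]


table = SymbolTable()
table.define("buffer_ab", 0x1000)
try:
    # Same prefix, same length, same byte sum: a false match.
    table.define("buffer_ba", 0x2000)
except DuplicateSymbol as err:
    print("rejected:", err)
```

Note that this only guards definitions. A reference to an undefined name that happens to share a key with a defined one would still resolve silently to the wrong value, so the scheme narrows the problem rather than removing it.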
desertfish, posted February 28, 2021:
That is an interesting idea. Sometimes "good enough" is good enough and we can think of a different symbol name to satisfy the assembler.
Stefan, posted February 28, 2021:
Sure is. Any idea about a suitable hash function?
desertfish, posted February 28, 2021:
No, not really... All good string hash functions use multiplications, and we can't do that on the 6502 and still keep it fast...
Stefan, posted March 1, 2021:
Maybe it's worth looking at CRC-16 and CRC-32. Probably not ideal, but they seem to be easily calculated. Is that good enough for your purpose?
One implementation is found here (without lookup tables): http://www.6502.org/source/integers/crc-more.html
And another here (with tables): http://www.6502.org/source/integers/crc.htm
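For reference, a table-free CRC-16 needs only shifts and XORs, which is why it maps well onto the 6502. The Python sketch below uses the CRC-16/CCITT-FALSE parameters (polynomial $1021, initial value $FFFF); that choice of variant is an assumption for illustration and may differ from the routines on the linked pages.

```python
def crc16_ccitt(data, crc=0xFFFF):
    """Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8          # fold the next byte into the high half
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# The standard check string for this CRC variant.
print(hex(crc16_ccitt(b"123456789")))   # 0x29b1
```

A 16-bit CRC still maps many possible names onto 65536 values, so it reduces false matches without ruling them out.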
BruceMcF, posted March 1, 2021:
22 hours ago, desertfish said:
> No, not really... All good string hash functions use multiplications and we can't do that on the 6502 and also still keep it fast...
The question is what you are aiming for with the hash. One approach is a hash that AIMS to avoid collisions, so that collisions are a special case. Another is simply to accelerate things compared to a single sorted linked list or binary tree by having more, but substantially smaller, linked lists or trees, so that they do not get bogged down to the same extent as the wordspace grows. Then something as simple as the bottom four bits of the XOR of the bytes in the name may serve well.
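The second approach can be sketched in a few lines of Python: sixteen small lists, selected by the bottom four bits of the XOR of the name's bytes, each storing full names so that collisions are harmless. The class and function names are made up for this example.

```python
def bucket_of(name):
    """Bottom four bits of the XOR of all bytes in the name (0..15)."""
    acc = 0
    for byte in name.encode("ascii"):
        acc ^= byte
    return acc & 0x0F


class BucketedTable:
    def __init__(self):
        # 16 short chains instead of one long list.
        self.buckets = [[] for _ in range(16)]

    def define(self, name, value):
        chain = self.buckets[bucket_of(name)]
        for existing, _ in chain:
            if existing == name:
                raise ValueError("duplicate symbol: " + name)
        chain.append((name, value))

    def lookup(self, name):
        # Only one chain is searched; full names are compared, so two
        # names sharing a bucket never shadow each other.
        for existing, value in self.buckets[bucket_of(name)]:
            if existing == name:
                return value
        raise KeyError(name)


table = BucketedTable()
table.define("AB", 1)
table.define("BA", 2)   # same bucket as "AB", since XOR ignores byte order
print(bucket_of("AB"), bucket_of("BA"), table.lookup("AB"), table.lookup("BA"))
```

Because XOR ignores byte order, anagrams always share a bucket. That costs a slightly longer chain but never a wrong answer, since the complete name is stored and compared.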
desertfish, posted March 1, 2021:
The current symbol table implementation is simplistic but should be easy to replace with a smarter one, because it has a very basic interface to the assembler logic itself. I don't think I have the time to build a smarter symbol table myself, so hopefully someone else can jump in, who knows.
Terrel Shumway, posted March 2, 2021 (edited March 2, 2021):
On 2/28/2021 at 7:07 AM, Stefan said:
> Sure is. Any idea about a suitable hash function?
https://en.wikipedia.org/wiki/Linear-feedback_shift_register#Galois_LFSRs
https://github.com/eternal-skywalker/cx16-lib/blob/main/lfsr.s
David Murray mentioned using an LFSR to generate random maps in a game instead of manually creating and storing them. A hash function needs more than this, but it is something to start with.
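As a point of reference, one step of a 16-bit Galois LFSR is a shift plus a conditional XOR. The Python sketch below uses the tap mask $B400 and the seed $ACE1 from the example on the linked Wikipedia page; whether lfsr.s in the linked repository uses the same taps is not something this sketch assumes. It shows only the register itself, not a complete hash function.

```python
def lfsr_step(state):
    """Advance a 16-bit Galois LFSR by one step (tap mask 0xB400)."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xB400
    return state


def lfsr_period(seed=0xACE1):
    """Count the steps until the register returns to its seed."""
    state = lfsr_step(seed)
    count = 1
    while state != seed:
        state = lfsr_step(state)
        count += 1
    return count


print(lfsr_period())   # 65535: every nonzero 16-bit state is visited once
```

To turn this into a hash, each byte of the name would have to be mixed into the register between steps, which is essentially what a CRC does.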
desertfish, posted April 6, 2021:
Updated the assembler: added the ability to save the assembled program to disk. (Note that assembling is still done into system memory first as an intermediary step. This will be changed in a future version to allow assembling larger programs.)
desertfish, posted April 18, 2021:
Updated again to cache the source file in memory, resulting in a large speedup because the file doesn't have to be read twice anymore. For now, source file size is limited to 62 KB because of this change.
desertfish, posted December 5, 2021 (edited December 6, 2021):
Updated again with new file load routines. Note that a patched V39 ROM is required to run this correctly, because it depends on the kernal's LOAD routine working correctly across RAM banks.
The framework for loading multiple files is now in place and we have ample RAM to store them in: we're now using hiram banks, so we can store hundreds of KB of source files. So the next thing to do in the next version is to implement some sort of .INCLUDE "file.asm" directive to be able to read from multiple source files.
Stefan, posted December 5, 2021:
Hi! I tried your new version. After some fiddling, not properly reading the instructions, and so on, I got it to work.
I wrote a hello world test program similar to yours, and it worked great. I could also restart the assembler without reloading it from disk after compiling, loading and testing the assembly program.
Very nice job so far!
desertfish, posted December 6, 2021:
Thank you Stefan. As long as you don't load the resulting output program, or you load it into an unoccupied piece of RAM so as not to overwrite the assembler itself (so outside $0801-$5000 or so; look at the load addresses of the assembler program), you can indeed simply restart the assembler to continue editing or assembling code. (prog8 programs generally are restartable after exit.)
ZeroByte, posted December 6, 2021:
On 12/6/2021 at 8:28 AM, desertfish said:
> (prog8 programs generally are restartable after exit).
I noticed that about cc65: the programs generally won't run a second time, and that definitely should not be the case. Any idea what cc65's binaries are doing to bork things up? In the grand scheme of things, on a system like this it's not such a big deal for games to have this behavior, as typically there was no "quit to BASIC" option in the game; you just flipped the power switch to get back to BASIC when you were done. But for applications that work on a file, it's definitely broken for a program to exit and then not be runnable a second time.
desertfish, posted December 6, 2021:
Yeah, I think they either use some self-modifying code that can't run multiple times or forget to reinitialize variables. Prog8 doesn't have uninitialized variables, and every variable is reset to its initialization value when the program is restarted.
Stefan, posted December 7, 2021:
@desertfish Please let me know if you have thought about any changes to X16 Edit in order to make it work better with your assembler.
desertfish, posted December 7, 2021 (edited December 7, 2021):
I haven't really thought about that, to be honest. The only thing that occurred to me is that you could make a version that uses ZeroByte's fixed v39 kernal ROM, so that it can use LOAD instead of a CHRIN-based file read loop to load large files much faster, like I did in the assembler. Saving will still be slow because SAVE doesn't yet work with banked RAM. Also, that version, like my assembler now, would only work with the patched ROM... If the patch doesn't get merged, we'll be stuck with non-working software.