A Tokenizer in RAM Bank 1


rje

This spring I wrote a tokenizer, plus pieces of a compiler and pieces of an interpreter, in C on the X16.  I was getting to the hard parts, and the binary was 25K with no sign of letting up, when I put it on the shelf for a while.

The source and token stream are living on Bank 1, but maybe I've got it backwards... Now I'm wondering if perhaps the tokenizer can live in Bank 1.

Anything to free up main RAM...

 

Edited by rje

An advantage of a tokenizer in Bank 1 is that if it outgrows the bank, you can either take "longwinded" routines and put them in Bank 2 or modularize and split modules between Bank 1 and Bank 2 ...

Cross module call, dispatch vectors in $BFxx, CALLXB in $BFxx, Y free:

... : LDX #operation : JSR CALLXB ...

; Bank 1 version
CALLXB: LDA #2 : STA $0000 : JSR + : LDA #2 : STA $0000 : RTS : + JMP ($BF00,X)

; Bank 2 version
CALLXB: LDA #1 : STA $0000 : JSR + : LDA #1 : STA $0000 : RTS : + JMP ($BF00,X)

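The magic is, of course, that at "STA $0000", you bounce to the other bank's version. That only works because CALLXB sits at the same address in both banks, so execution carries on at the matching instruction of the other copy.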

That is a lot more overhead than a plain subroutine call, but less than a general purpose cross bank call. So a program unit can be either 8KB (one bank) or 16KB (a pair of banks linked this way), and only calls between distinct units pay the additional overhead of a more general purpose cross bank call.
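For anyone who reads C more easily than assembler, here is a rough host-side model of the paired CALLXB idea. It is only a sketch: `ram_bank` stands in for the bank register at $0000, `vectors` stands in for the per-bank dispatch table at $BF00, and `bank1_op`, `bank2_op`, `callxb_in_bank1` and `callxb_in_bank2` are made-up names for illustration. It also uses a plain index where the assembler version uses X as a byte offset into the vector table.

```c
#include <stdint.h>

typedef uint8_t (*op_fn)(uint8_t arg);

/* Host stand-in for the RAM bank register at $0000. */
static uint8_t ram_bank = 1;

/* Hypothetical routines, one per bank; each folds in the active bank
   number so the result shows which bank was selected when it ran. */
static uint8_t bank1_op(uint8_t arg) { return (uint8_t)(arg + ram_bank); }
static uint8_t bank2_op(uint8_t arg) { return (uint8_t)(arg * 2 + ram_bank); }

/* Stand-in for the dispatch vectors of each bank (row 0 is unused). */
static const op_fn vectors[3][1] = { { 0 }, { bank1_op }, { bank2_op } };

/* Bank 1's copy: the partner bank (2) and the return bank (1) are constants,
   so nothing has to be saved or looked up. */
static uint8_t callxb_in_bank1(uint8_t op, uint8_t arg) {
    uint8_t result;
    ram_bank = 2;                          /* LDA #2 : STA $0000          */
    result = vectors[ram_bank][op](arg);   /* JSR + ... JMP ($BF00,X)     */
    ram_bank = 1;                          /* LDA #1 : STA $0000 on return */
    return result;
}

/* Bank 2's copy: same shape, constants swapped. */
static uint8_t callxb_in_bank2(uint8_t op, uint8_t arg) {
    uint8_t result;
    ram_bank = 1;
    result = vectors[ram_bank][op](arg);
    ram_bank = 2;
    return result;
}
```

The C model cannot show the real trick (the program counter continuing in the other bank's copy of the code), only the effect: switch to a fixed partner, dispatch through that bank's table, switch back to a fixed home bank.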

Edited by BruceMcF

Well I can have a 1K orchestration program in main RAM that knows how to toggle between banked routines, when it comes to that.
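A minimal sketch of what such an orchestration routine might look like in C, assuming it saves and restores whatever bank was active so that banked routines can call each other through it. This is a host-side model: `current_bank` is a plain variable standing in for the bank register, and `call_banked`, `note`, `tokenize_step` and `compile_step` are hypothetical names, not anything from 8-Shell.

```c
#include <stdint.h>

typedef void (*banked_fn)(void);

/* Host stand-in for the RAM bank register; main RAM code starts in bank 0. */
static uint8_t current_bank = 0;

/* Small trace of which bank was active at each step. */
static uint8_t trace[4];
static uint8_t trace_len = 0;

static void note(void) { trace[trace_len++] = current_bank; }

/* General purpose cross bank call: remember the caller's bank, switch,
   call, and switch back. This lives in main RAM so it is always visible. */
static void call_banked(uint8_t bank, banked_fn fn) {
    uint8_t saved = current_bank;
    current_bank = bank;
    fn();
    current_bank = saved;
}

/* Pretend this routine lives in bank 2. */
static void compile_step(void) { note(); }

/* Pretend this routine lives in bank 1 and needs something from bank 2. */
static void tokenize_step(void) {
    note();
    call_banked(2, compile_step);
    note();   /* back in bank 1 here */
}
```

Calling `call_banked(1, tokenize_step)` from main RAM runs the nested call and leaves the bank register where it started. The saved bank byte on the stack is the extra cost compared with a trampoline that already knows both banks.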

 

The tokenizer is 5.5 k.  Plenty of room.

Tokens are 5-byte structures (the pointer is 2 bytes on the 6502):

typedef struct {
   uint8_t   type;
   uint8_t   length;
   char     *start_position;
   uint8_t   line;
} Token;

Scripts are limited to 256 lines.  The parsed source is diced up into strings and used in situ.
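To make the "diced up and used in situ" idea concrete, here is a small sketch of a tokenizer that fills Token records that way. It is an illustration under assumptions, not the 8-Shell code: the `tokenize` function, the whitespace-only splitting, and the `TOK_WORD` / `TOK_NUMBER` type codes are all made up for the example. The struct is the one above; it is 5 bytes only where pointers are 2 bytes, and will be larger on a host compiler.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint8_t   type;
    uint8_t   length;
    char     *start_position;
    uint8_t   line;
} Token;

/* Hypothetical type codes, for illustration only. */
enum { TOK_WORD = 1, TOK_NUMBER = 2 };

/* Splits src in place on spaces and newlines. Each delimiter is overwritten
   with '\0', so every token becomes a C string inside the source buffer.
   Returns the number of tokens written to out. Tokens longer than 255
   characters are not handled in this sketch. */
static size_t tokenize(char *src, Token *out, size_t max_tokens) {
    size_t n = 0;
    uint8_t line = 0;          /* uint8_t: hence the 256-line limit */
    char *p = src;

    while (*p != '\0') {
        char *start;
        uint8_t all_digits;

        if (*p == '\n') { ++line; *p++ = '\0'; continue; }
        if (*p == ' ')  { *p++ = '\0'; continue; }
        if (n == max_tokens) break;

        start = p;
        all_digits = 1;
        while (*p != '\0' && *p != ' ' && *p != '\n') {
            if (*p < '0' || *p > '9') all_digits = 0;
            ++p;
        }

        out[n].type = all_digits ? TOK_NUMBER : TOK_WORD;
        out[n].length = (uint8_t)(p - start);
        out[n].start_position = start;
        out[n].line = line;
        ++n;
    }
    return n;
}
```

Because the tokens point back into the source buffer, the source and the token stream have to stay in the same bank (or both be visible) whenever a token's text is read.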

 

8-Shell, my current 25K attempt at the tokenizer + compiler + interpreter, is a mess, but the intro screen is pretty:

[Screenshot: 8-Shell intro screen]

and it can evaluate SOME expressions:

[Screenshot: 8-Shell evaluating expressions]

and the logout screen reads a random entry from a FORTUNE file (8K, stored in yet another bank):

[Screenshot: 8-Shell logout screen with a FORTUNE entry]

Edited by rje

7 hours ago, rje said:

Well I can have a 1K orchestration program in main RAM that knows how to toggle between banked routines, when it comes to that.

The tokenizer is 5.5 k.  Plenty of room.  ...

The point there is not general purpose toggling between banked routines, but minimizing the overhead of the switch in the cases where you know the call always goes from one specific bank to one specific other bank.

So in this case ... something that fits within 8KB would not use it, of course!

But you can still hold onto that 2.5KB of spare space in case one of the other units (interpreter or compiler) runs up against the 8KB limit of a single bank. Splitting out one or two longer subroutines, or a distinct submodule that is not in a tight inner loop (eg, an initialization module), to a side bank could then help it fit comfortably.

