rje

New community dev tool uploaded: 8sh


8sh

https://github.com/bobbyjim/x16-8sh

This is the early early stage of an attempted "shell" for the X16.

By definition, a command shell is an interpreter that exposes the system.  In practice, the shell exposes the system through an immediate-mode scripting language, which can also be executed from a file.  The Commodore computers' boot mode is a kind of "BASIC shell".

This shell currently does almost nothing.  I am slowly working out a set of mid-level operations for it, and plan to add the shell's command inventory on top of that. Then, I'll add the ability to run full scripts from file.  Then, I'll add the ability to "pipe" the output of one script into another.  In this manner, I hope to build up a small inventory of useful utilities.

 

 

 

8SH.PRG


 


Well I can't run this thing from the "Run This Now!" button 😞 and I don't know why.

 

1 hour ago, rje said:

Well I can't run this thing from the "Run This Now!" button 😞 and I don't know why.

 

Try renaming the file to have a .PRG extension.


Thanks Matt!  That didn't work either, but I'm sure it's just something silly.

 


So I've poured some time into 8sh, and since I'm doing it carefully, of course that means it doesn't "do" anything yet.

It has three main pieces, in various states of completion:

(1) A working tokenizer which recognizes about half of the tokens I want.

(2) A compiler that doesn't work yet.  It will orchestrate the tokens and emit bytecodes.

(3) A bytecode interpreter ("VM") that currently only knows nine opcodes (out of 40-ish).

 

At some point, the compiler will *just* work, and then the whole chain will show signs of life.
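
For readers unfamiliar with the last piece, here is a minimal sketch of a stack-based bytecode interpreter in portable C. The opcode names, the 16-bit little-endian operand format, and the stack depth are all assumptions made for illustration; they are not 8sh's actual opcodes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode set: illustrative names, not 8sh's own. */
enum { OP_HALT, OP_PUSH, OP_ADD, OP_SUB, OP_MUL };

#define VM_STACK_MAX 32

/* Global stack and stack pointer, which small C compilers handle well. */
static int16_t vm_stack[VM_STACK_MAX];
static uint8_t vm_sp;

static void vm_push(int16_t v)
{
    assert(vm_sp < VM_STACK_MAX);
    vm_stack[vm_sp++] = v;
}

static int16_t vm_pop(void)
{
    assert(vm_sp > 0);
    return vm_stack[--vm_sp];
}

/* Run bytecode until OP_HALT and return the value on top of the stack. */
int16_t vm_run(const uint8_t *code)
{
    int16_t a, b;
    vm_sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH:  /* operand: 16-bit constant, low byte first */
            vm_push((int16_t)(uint16_t)(code[0] | (code[1] << 8)));
            code += 2;
            break;
        case OP_ADD: b = vm_pop(); a = vm_pop(); vm_push((int16_t)(a + b)); break;
        case OP_SUB: b = vm_pop(); a = vm_pop(); vm_push((int16_t)(a - b)); break;
        case OP_MUL: b = vm_pop(); a = vm_pop(); vm_push((int16_t)(a * b)); break;
        case OP_HALT:
            return vm_pop();
        default:       /* unknown opcode: stop and report 0 */
            return 0;
        }
    }
}
```

Feeding it the bytes for "push 2, push 3, push 4, multiply, add, halt" computes 2 + 3 * 4; the compiler's job is to produce that byte sequence from the token stream.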


Wow, writing C on an 8 bit machine is ... well it's tricky I think.

I started out with a smallish but typical interpreter project, in C (https://craftinginterpreters.com/).  And I hit walls as memory got corrupted -- and I hit the limitations of C on 8 bitters, such as the small max size of structures, and the tiny stack (only return small things like pointers... or better yet, use global variables as much as possible), and how apparently easy it is for even global data to get lost and/or corrupted when a program gets too clever...

*** GOAL ***

Each time I hit a wall, I had to ask myself what my END GOAL was.  And I think it's not changed:

A "shell interpreter"...

with...

  • dynamic variables (numbers and strings... perhaps as C-shell-style arrays),
  • a pile of operators,
  • flow control (if, foreach, and while)
  • script "chaining" / output redirection (pipes, sort of)
  • 5 byte floats (not a priority)
  • maybe functions (not a priority)
  • tokenizer => compiler => bytecode interpreter

I use Banks for loaded scripts, tokenizations, and as a command line buffer.

*** PROGRESS ***

I'm busy paring down the tokenizer.  Each time I think I've got it down, something else happens.  So now I'm shoving input into a Bank and tokenizing from there.  I'm about ready to scrap the Token struct and instead shove token information into Banked RAM.

 

 

 

Edited by rje


So Bank-based scanning is a success: input is stored in Bank 1, and the tokenizer works on that.  Six bytes of metadata per token are generated (type, length, start position, and line number) and stored in Bank 2.  Bank 1 is left untouched.  Thus I can theoretically jump around the tokenized list by multiplying the token number by 6; when I need to fetch actual data, the token metadata points me to the substring in the source bank.  A handy little utility lets me get and put strings, ints, and bytes in and out of banks.
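
A rough sketch of that six-byte record, written in portable C so it runs anywhere. The field widths (1 + 1 + 2 + 2 bytes), the function names, and the plain array standing in for the banked RAM window are all assumptions; the post only says that type, length, start position, and line number add up to six bytes.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_SIZE  8192u  /* one X16 RAM bank is 8K */
#define TOKEN_SIZE 6u     /* six bytes of metadata per token */

/* Stand-in for the token bank so the sketch runs on any host. */
static uint8_t token_bank[BANK_SIZE];

/* Assumed field widths: 1 + 1 + 2 + 2 = 6 bytes. */
typedef struct {
    uint8_t  type;
    uint8_t  length;
    uint16_t start;  /* offset of the token text in the source bank */
    uint16_t line;
} TokenInfo;

static void bank_put_int(uint16_t at, uint16_t v)
{
    token_bank[at]     = (uint8_t)(v & 0xFF);  /* low byte first, 6502 style */
    token_bank[at + 1] = (uint8_t)(v >> 8);
}

static uint16_t bank_get_int(uint16_t at)
{
    return (uint16_t)(token_bank[at] | (token_bank[at + 1] << 8));
}

/* Store token number n: its record lives at offset n * 6. */
void token_put(uint16_t n, const TokenInfo *t)
{
    uint16_t at;
    assert(n < BANK_SIZE / TOKEN_SIZE);
    at = (uint16_t)(n * TOKEN_SIZE);
    token_bank[at]     = t->type;
    token_bank[at + 1] = t->length;
    bank_put_int((uint16_t)(at + 2), t->start);
    bank_put_int((uint16_t)(at + 4), t->line);
}

/* Fetch token number n back out of the bank. */
void token_get(uint16_t n, TokenInfo *t)
{
    uint16_t at;
    assert(n < BANK_SIZE / TOKEN_SIZE);
    at = (uint16_t)(n * TOKEN_SIZE);
    t->type   = token_bank[at];
    t->length = token_bank[at + 1];
    t->start  = bank_get_int((uint16_t)(at + 2));
    t->line   = bank_get_int((uint16_t)(at + 4));
}
```

With this layout one 8K bank holds 1365 token records, and no Token struct has to stay resident in main memory between calls.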

Getting tokenization out of structs and into banked RAM has really paid off. 

Next step is getting the compiler to read from the banked token stream.  I think decoupling tokenization from compilation will pay off here as well, since the compiler already does a lot of work.  I am also now thinking that perhaps the compiler ought to write bytecode to a third bank.

Banked RAM is turning out to be more than just a luxury: it's becoming a "relatively cheap heap" for my program.  It's like the RAM banks are already allocated hunks of dedicated memory.  It's an extreme luxury.  It's also giving me a lot of cheap workspace, something I wouldn't really be able to get normally on machines like the Commodore 64.

And yeah, it's gonna all be glacially slow compared to assembly language.  If I get what I want out of this thing, I won't care.

 

 

Edited by rje


For the moment, I will pretend that the X16 is nimble enough to manage an ever-growing bytecode list.  It may be that this memory will also get messed up; if so, then I will change this system to write to a third RAM bank.  This will also probably simplify the program more.  For now though I will use what I have already written: if it ain't broke...

 

Edited by rje


PROGRESS

Arithmetic expressions are compiled and interpreted, but only the first full expression.

CURRENT WORK

I'm extending the engine to recognize more than just integers.

 


PROGRESS

Added strings and logical operators.

CURRENT WORK

Adding a hashtable so I can store variables.
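
One possible shape for such a table, as a minimal sketch: a fixed-size, open-addressing hashtable with linear probing. The slot count, the 8-character name limit, the integer-only values, and every identifier here are assumptions for illustration, not the table 8sh actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VAR_SLOTS    64  /* power of two, so the hash can be masked */
#define VAR_NAME_MAX 8   /* assumed limit on variable name length */

typedef struct {
    char    name[VAR_NAME_MAX + 1];  /* empty string marks a free slot */
    int16_t value;
} VarSlot;

static VarSlot var_table[VAR_SLOTS];

static uint8_t var_hash(const char *name)
{
    uint8_t h = 0;
    while (*name) h = (uint8_t)(h * 31 + (uint8_t)*name++);
    return (uint8_t)(h & (VAR_SLOTS - 1));
}

/* Find the slot holding name, or the free slot where it would go.
   Returns NULL if the table is full and name is not in it. */
static VarSlot *var_find(const char *name)
{
    uint8_t i = var_hash(name);
    uint8_t tries;
    for (tries = 0; tries < VAR_SLOTS; ++tries) {
        VarSlot *s = &var_table[i];
        if (s->name[0] == '\0' || strcmp(s->name, name) == 0) return s;
        i = (uint8_t)((i + 1) & (VAR_SLOTS - 1));  /* linear probing */
    }
    return NULL;
}

/* Create or update a variable. Returns 1 on success, 0 on failure. */
int var_set(const char *name, int16_t value)
{
    VarSlot *s;
    assert(name != NULL);
    if (name[0] == '\0' || strlen(name) > VAR_NAME_MAX) return 0;
    s = var_find(name);
    if (s == NULL) return 0;
    strcpy(s->name, name);
    s->value = value;
    return 1;
}

/* Look up a variable. Returns 1 and fills *out if it exists. */
int var_get(const char *name, int16_t *out)
{
    VarSlot *s = var_find(name);
    if (s == NULL || s->name[0] == '\0') return 0;
    *out = s->value;
    return 1;
}
```

There is no delete operation in this sketch; removing entries from a linear-probing table needs tombstones or re-insertion, which is left out to keep it short.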

 


PAIN and SUFFERING

This is what I get for trying to port a standard C project built for modern computers to an 8-bit platform.

Pain!  

So, the scanner is great.  Tokenization is easy, well understood, and so on.

I suspect the interpreter is probably okay, as far as I've gotten with it anyhow.

But the compiler is a mess.  Back to the books to understand what's going on.


I'm letting this sit on the backburner, thinking about it, before I rewrite it.

I'm starting to become interested in looking at it again, but I have to ask myself what I WANT out of 8shell.

The current work was implementing an interpreter using an existing C project designed for modern systems.  As a result there are two problems:

(1) The codebase, though not extravagant, is really too "big" / requires too much, for the X16.
(2) Partly to head off problems with (1), I made decisions that are a bit incompatible with the original.   And of course, one change often cascades into many changes.  And so I ran out of energy.

Now I'm coming back to it and asking what I WANT, and then planning how to get there.

 

SOFTWARE CONSIDERATIONS

  • It's a Shell

    This means it's a prompt that you interact with.  Call it a REPL if you prefer.

    This means it IS a command interpreter, even if it's a wimpy one.  This means running scripts... things that let it access the file system and manipulate data sources.  AWK and sed and Perl are my guides here.  The result won't be as powerful as them, but if I can capture some of that power in a bottle I think we'll have something.
     
  • It has to be Useful

    Being able to sed or awk a file and pipe the output to a new file, or concatenate it onto an existing file, would be nice and fun, and might even be useful.

    Throwing other little tools in the mix would enrich this capability.  One thing at a time, though, I suppose.

 

HARDWARE CONSIDERATIONS

  • 8SH is NOT made for a ROM Bank.

    8SH is greater than 16K, and thrives on calls to the KERNAL.  THEREFORE it is by definition something that lives in RAM.

  • 8SH SHOULD leverage RAM banks.

    Tokenizing commands means storing small tokenized scripts, probably in one RAM bank.  In other words, the tokenizer will happily churn along until you reach that 8K limit, at which point the interpreter will fail with a "Bank Overflow" error and force you to cut your scripts down in size and chain them or something.

    Why only one bank?  Because that simplifies things.  I can make it more complicated later.

    Similarly, hashtables will be shoved into other RAM banks... and might not even be hashtables.
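
The single-bank limit could be enforced with a check like the one below. This is only a sketch: the array stands in for one 8K RAM bank, and the function name and return codes are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCRIPT_BANK_SIZE 8192u  /* one 8K RAM bank */

enum { BANK_OK = 0, BANK_OVERFLOW = 1 };  /* hypothetical result codes */

static uint8_t  script_bank[SCRIPT_BANK_SIZE];  /* stand-in for banked RAM */
static uint16_t script_used;                    /* bytes written so far */

/* Append len bytes to the bank, refusing anything that would not fit. */
int script_append(const void *data, uint16_t len)
{
    assert(script_used <= SCRIPT_BANK_SIZE);
    if (len > SCRIPT_BANK_SIZE - script_used) return BANK_OVERFLOW;
    memcpy(&script_bank[script_used], data, len);
    script_used = (uint16_t)(script_used + len);
    return BANK_OK;
}
```

Checking before the copy means a failed append leaves the bank exactly as it was, so the shell can report the error and keep the script loaded so far.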

 

 


oof, this sounds like a complex beast.... it looks useful for some complicated tasks, but at the same time, it feels like overkill for a simple 8 bit system. Do you really want to write shell scripts (this is my interpretation of what you're proposing) for the system?


Well of course I do!  

But: you're right, a lot of things shells do, don't apply here.

Actually, much of it is not complex.  Tokenizing into Bank 1 is easy.  Symbol tables / hashtables are easy. Even a stack-based VM seems easy.

The bit *I* find challenging is compiling to bytecode.  I think it's a recursive descent parser, with an expression engine, and a bytecode emitter with forward referencing for handling blocks.  Or maybe I can use a stack?
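
For what it's worth, forward referencing is usually handled by "backpatching": emit the jump with a placeholder target, remember where the placeholder is, and fill it in once the end of the block is known. A minimal sketch follows, with made-up opcode names and constants standing in for parsed expressions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only. */
enum { BC_HALT, BC_PUSH, BC_JUMP_IF_ZERO, BC_PRINT };

#define CODE_MAX 256
static uint8_t  bc_code[CODE_MAX];
static uint16_t bc_len;

static void bc_emit(uint8_t b)
{
    assert(bc_len < CODE_MAX);
    bc_code[bc_len++] = b;
}

/* Emit a jump whose target is not known yet.
   Returns the offset of the two placeholder bytes. */
static uint16_t bc_emit_jump(uint8_t op)
{
    bc_emit(op);
    bc_emit(0xFF);
    bc_emit(0xFF);
    return (uint16_t)(bc_len - 2);
}

/* Fill in a placeholder so the jump lands at the current end of code. */
static void bc_patch_jump(uint16_t at)
{
    bc_code[at]     = (uint8_t)(bc_len & 0xFF);
    bc_code[at + 1] = (uint8_t)(bc_len >> 8);
}

/* Compile the equivalent of: if (cond) { print value } */
void bc_compile_if(uint8_t cond, uint8_t value)
{
    uint16_t skip;
    bc_len = 0;
    bc_emit(BC_PUSH); bc_emit(cond);        /* condition */
    skip = bc_emit_jump(BC_JUMP_IF_ZERO);   /* forward reference */
    bc_emit(BC_PUSH); bc_emit(value);       /* block body */
    bc_emit(BC_PRINT);
    bc_patch_jump(skip);                    /* the block ends here */
    bc_emit(BC_HALT);
}
```

In a recursive descent compiler the local variable `skip` lives on the C call stack, so nested blocks work without extra bookkeeping. With a non-recursive design, an explicit stack of pending placeholder offsets does the same job.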

Building it in pieces is the way to get it done.  ... And perhaps also the way to find out what is useful and what isn't.

 

Edited by rje


I'm sorry, I didn't mean to talk down the idea.  By all means it would be a great tool, for some tasks, just personally I don't (yet) see them 🙂   That shouldn't stop you on this project, of course!


No no, I get it.  I mean, C is not the 6502's native language, and a bytecode interpreter is heavy by definition.

But it will be a personal achievement, and if it works it will be fun, and JUST MAYBE useful.

 

I was talking with a C64/C128 guy who was writing a very low-level symbol table, with hopes of writing something like AWK from it.  And that sparked my imagination.

 

20 minutes ago, desertfish said:

it would be a great tool, for some tasks, just personally I don't (yet) see them 🙂  

Quoted for truth. 

On my UNIX machine, scripts are used for creating, processing, and reporting on data, usually in files.

It's totally overkill on the X16.

But, I want to see it; some ideas may present themselves.

Edited by rje

I'd be happy if it just worked like DOS - dir, delete, rename, move, cd, etc... and execute a file by just typing its name.
This should be able to replace the BASIC environment if you so choose - i.e. burn a ROM with this shell in one of the unused banks, and make it executable by a SYS call or something. You could pick and choose which Kernal calls to use to make your life easier in building the thing. Last, just make the built-in command that returns to BASIC be 'basic', and have it simply do an RTS. 🙂
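
A DOS-style command set like that is mostly a lookup table. Here is a small sketch in portable C: the handlers are stubs that only record what was called, and the names, the table, and the return codes are all hypothetical. A real version would call the disk routines and, for "basic", return to the caller.

```c
#include <assert.h>
#include <string.h>

#define CMD_NOT_FOUND (-1)

typedef int (*CommandFn)(const char *args);
typedef struct { const char *name; CommandFn fn; } Command;

/* Stub state so the sketch is observable without real disk access. */
static const char *shell_last_cmd  = "";
static const char *shell_last_args = "";

static int note(const char *cmd, const char *args)
{
    shell_last_cmd  = cmd;
    shell_last_args = args;
    return 0;
}

static int cmd_dir(const char *args)    { return note("dir", args); }
static int cmd_delete(const char *args) { return note("delete", args); }
static int cmd_rename(const char *args) { return note("rename", args); }
static int cmd_basic(const char *args)  { return note("basic", args); }

static const Command commands[] = {
    { "dir",    cmd_dir    },
    { "delete", cmd_delete },
    { "rename", cmd_rename },
    { "basic",  cmd_basic  },
};

/* Split off the first word of the line and run the matching handler. */
int shell_dispatch(const char *line)
{
    size_t n, i;
    const char *args;
    assert(line != NULL);
    n = strcspn(line, " ");            /* length of the command word */
    args = line + n;
    while (*args == ' ') ++args;       /* skip blanks before the arguments */
    for (i = 0; i < sizeof commands / sizeof commands[0]; ++i) {
        if (strlen(commands[i].name) == n &&
            strncmp(line, commands[i].name, n) == 0)
            return commands[i].fn(args);
    }
    /* A fuller shell would try to load a program with this name here. */
    return CMD_NOT_FOUND;
}
```

The "execute a file by just typing its name" behaviour would hang off the not-found path: if no built-in matches, treat the word as a filename.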


You know, Zero, we've got disk commands in the current MIST-modified KERNAL...

 

Edited by rje

On 4/9/2021 at 1:31 AM, rje said:

The bit *I* find challenging is compiling to bytecode.  I think it's a recursive descent parser, with an expression engine, and a bytecode emitter with forward referencing for handling blocks.  Or maybe I can use a stack?

One way to simplify the compilation is to simplify the grammar.

For example, suppose that ALL infix operations are parsed left to right with no precedence, and without parentheses all right hand side operands are evaluated first. So 3+5*8-4/3 means 3+(5*(8-(4/3))). And suppose () are supported. Then you simply separate the line into operations and operands, push operands on an operand stack, push operations on an operations stack, and when you get to the end of the line you execute the operations stack. ")" goes ahead and executes the current operations stack, and "(" on the operations stack stops executing the operations stack and goes back to executing the bytecode.

So in practice, people get used to using () a lot.

Now function or procedure calls are just prefix operations: a "proc(" pushes the ( onto the operation stack, then the proc. In proc(x,y,z) the comma is just a separator: (x,y,z) gets executed as ( ... x y z ) on the operand stack, and "proc" uses the top three entries on the operand stack. ")" doesn't care whether it is closing an expression or a procedure call, it just executes the current operation stack.
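
The two-stack scheme for plain infix expressions can be sketched in a few dozen lines of C. This version evaluates directly instead of emitting bytecode, uses integer arithmetic, leaves out procedure calls, and does very little input validation; all names and limits are assumptions for illustration.

```c
#include <assert.h>
#include <ctype.h>

#define EVAL_STACK_MAX 16

static long eval_operands[EVAL_STACK_MAX];
static char eval_ops[EVAL_STACK_MAX];
static int  eval_nopnd, eval_nops;

/* Pop and apply operations until a '(' marker or an empty stack.
   Returns 0 on a missing operand or division by zero. */
static int eval_unwind(void)
{
    while (eval_nops > 0 && eval_ops[eval_nops - 1] != '(') {
        char op = eval_ops[--eval_nops];
        long a, b;
        if (eval_nopnd < 2) return 0;
        b = eval_operands[--eval_nopnd];  /* right-hand side */
        a = eval_operands[--eval_nopnd];  /* left-hand side */
        switch (op) {
        case '+': a += b; break;
        case '-': a -= b; break;
        case '*': a *= b; break;
        default:  /* '/' */
            if (b == 0) return 0;
            a /= b;
            break;
        }
        eval_operands[eval_nopnd++] = a;
    }
    return 1;
}

/* Evaluate one line. Returns 1 and stores the value, or 0 on error. */
int eval_line(const char *s, long *result)
{
    eval_nopnd = eval_nops = 0;
    for (; *s; ++s) {
        if (*s == ' ') continue;
        if (isdigit((unsigned char)*s)) {
            long v = 0;
            while (isdigit((unsigned char)*s)) v = v * 10 + (*s++ - '0');
            --s;  /* the for loop steps past the last digit */
            if (eval_nopnd == EVAL_STACK_MAX) return 0;
            eval_operands[eval_nopnd++] = v;
        } else if (*s == ')') {
            if (!eval_unwind()) return 0;
            if (eval_nops == 0) return 0;  /* no matching '(' */
            --eval_nops;                   /* discard the '(' marker */
        } else if (*s == '(' || *s == '+' || *s == '-' ||
                   *s == '*' || *s == '/') {
            if (eval_nops == EVAL_STACK_MAX) return 0;
            eval_ops[eval_nops++] = *s;
        } else {
            return 0;                      /* unknown character */
        }
    }
    if (!eval_unwind() || eval_nops != 0 || eval_nopnd != 1) return 0;
    *result = eval_operands[0];
    return 1;
}
```

With integer division, 3+5*8-4/3 comes out as 38, which is 3+(5*(8-(4/3))), and (3+5)*8 comes out as 64. Note that 2*(3+4)-1 gives 12 rather than 13, because the whole right-hand side (3+4)-1 is evaluated first; that is the "use () a lot" habit in action.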

