Jump to content
  • 0
rje

Hashtables with cc65

Question

Anyone have a hashtable implementation that works on 8-bit systems?

* * *

So my interpreter now has a hash table.  But, data seems to get scrambled when I insert it.

It's probably a simple, silly bug.

So I started a side project, porting Perl 2's hash table implementation to cc65.  If that works, I should be able to get the other one working.  They're both about the same source code size (~5k).

And THAT seems to not work.  I am either replicating a bug across dissimilar code bases, or something is just too complex, causing the X16 to scramble structures.

The latter could be true: these hash tables hold value structures that include a pointer and some bookkeeping.  It could be just too large for the 6502 to digest.  I can test that by making values simple char pointers and seeing what happens.

 

If you've got ideas I'm open.  Meanwhile I'll keep debugging.  I've updated the project-so-far: https://github.com/bobbyjim/x16-8sh

Edited by rje

Share this post


Link to post
Share on other sites

20 answers to this question

Recommended Posts

  • 0

Scrambled.... hum...

first glance: in hstore, you're simply assigning the string pointer rather than having a buffer for the string and copying the chars into.
That could means that mem used for the string is reused for something else.
Or you are mallocing every "value" string ?
let's add some printf* in the code to follow what's happening
*)the poor man debugger tool 😉
/kktos

 

  • Thanks 1

Share this post


Link to post
Share on other sites
  • 0

Yup good advices.  I've been printfing my fingers off -- so I could tell that the value assignment is called... and then disappears.

So yeah, the first thing that pops into my mind is that the malloc is being snubbed, and/or the memcpy is Doing The Wrong Thing.  The pointer is just floating away... now points to garbage... TOTALLY possible there.

 

Share this post


Link to post
Share on other sites
  • 0
3 hours ago, SlithyMatt said:

I don't think malloc really works with cc65/X16. You'll need to come up with a statically-allocated solution.

Perhaps that's the problem.... however it does appear to work ("at least some of the time"??).  And until this point, alloc'ed stuff appears to stick around okay -- VM bytecodes, stored constants and the like.   And the binary is above 20K now.  If malloc fails, it probably should have failed by now?  (BUT YOU NEVER KNOW).

I got Perl 2's hashtable code working - I had to convert the prototypes from K&R to current decls, et voila'.  (BTW, Perl2 was written back in 1988).  So the pointer stuff APPEARS to work... but I could be wrong.

* * *

I have also thought about a prealloc solution; namely, using one of the RAM banks as a binary-buddy hash-heap thing.  8K is probably enough (and hey if I had to, I could span multiple banks.... eeeeek).  But I won't go that route until I have to.

For your amusement, here are the Perl 2 files.  The Makefile works with cc65, easy peasy.

hash.c hash.h Makefile str.c str.h util.c util.h

main.c

PHASH8

Edited by rje

Share this post


Link to post
Share on other sites
  • 0

One thing I haven't done and should do, is some load testing on hash tables.  If I store 200 key-value pairs, do any of them get lost?  Etc.

 

Share this post


Link to post
Share on other sites
  • 0

OK, load testing done.  With large key-value strings (~100 bytes for overhead and data per pair), it's fine through about 224 entries.  With short key-value strings (maybe 32 bytes per pair), it's fine for more than 550+ key-value pairs.  I suspect both of these are exhausting the heap in low RAM, i.e. somewhere around 20K. 

We'll see if that's ok.  Sounds like memory will be tight (duh!), and I may have to put this into a RAM bank or two.  But first, I need to get the code working!

 

 

Edited by rje

Share this post


Link to post
Share on other sites
  • 0
13 minutes ago, rje said:

It begins dropping data somewhere between 550 and 600 entries

That sounds like it doesn't deal properly with bucket overflow?  Which means it will drop collisions. That can be scary?   I haven't looked at the code so it could very well be something else.

  • Thanks 1

Share this post


Link to post
Share on other sites
  • 0
1 hour ago, desertfish said:

That sounds like it doesn't deal properly with bucket overflow?  Which means it will drop collisions. That can be scary?   I haven't looked at the code so it could very well be something else.

Nah, collisions are fine -- this is Perl source after all (https://github.com/bobbyjim/perl2). 

I'm pretty sure it's blowing the heap.  Program code + heap < 40K?  The program is 7K; 224 entries at 100 bytes is 22K, call it 30K total but if there's 10K of slop in there, then *boom*.  Not unreasonable.

Also likely because different combinations of key/value sizes results in a scaled number of max entries.  So it's not collision, otherwise there would probably (?) not be a correlation.  Instead, it looks like this:

 

K+V bytes  Load (# entries)  Total bytes (appx)
=========  ================  ==================
  100          224                22k
   64          320                20k
   54          390                21k
   24          550                13k ?
   16          650                10k ?

Note that the last line there, where I manage to jam in 650 entries, is of minimal value, since each key and value is 4 bytes long.  The "K+V" value is probably low for all entries.

Note that values are approximate, and the "total bytes" values have a large error margin.

I also note that keys can't be more than 32 characters long, because that's when the hash function appears to break (on the X16).  Not worried about that though.

 

Edited by rje

Share this post


Link to post
Share on other sites
  • 0

By the way, both Perl -and- the ACTUAL codebase I'm following only use the hashcode as a starting index into the entry list... in other words, if that entry is taken already, then both systems just increment the index pointer and look at the next spot.  In other words, it searches kind of like a one-hop skip list, in a way, if that makes sense.  And they both grow when capacity reaches 70% or so.

So there are no buckets of sublists, just one big list.

HUH.  I just had a thought.  I should look at that growth method.  Does it double the list size?  If so, then that's what's REALLY failing.

 

...Yup.  Perl doubles the list size with a grow method called "hsplit()".  


	int oldsize = tb->tbl_max + 1;

	int newsize = oldsize * 2;      // THAT'S A PROBLEM ON THE X16


After a certain point, there's just not enough heap to allocate.  Boom.

Can't grow like that on the X16.  

 

Edited by rje

Share this post


Link to post
Share on other sites
  • 0

Now back to the original source I'm using (https://github.com/munificent/craftinginterpreters/tree/master/c).

The hash table's values are tightly woven into the VM itself, and as a result, in order to just test the hashtable code, I have to drag in the entire system.  

But on the other hand, I can clearly see that the value's memory is being dropped.  I can't clearly see WHERE, yet, but it's definitely falling into a black hole.

 

Share this post


Link to post
Share on other sites
  • 0

I've created another subfolder with a subset of the source files, just to test out the "actual" hashtable code.

And, yeah, it looks like the value-end is not being allocated.  Or rather, it's probably going into a local variable which is getting reclaimed when it goes out of scope.  A rookie mistake (in my defense, the hashtable source is intertwined with the VM, common memory management code, object code, string constants code, and so on).

 

Share this post


Link to post
Share on other sites
  • 0

Another very slow crawl, time permitting.  

This time through, I debugged the standalone hashtable code -- the main problem is that, as mentioned before, the code is tightly intertwined with the rest of the code; as a result, references were getting lost.  So it's working now, at least.

But, it's definitely inferior in load testing.  More on that in a moment.

But first, there is another bug in the reallocation code when the table grows from 64 to 128 entries: the first 48 entries lose their values for some reason.  It's a bit silly because that's like the fourth reallocation with data (0->8->16->32->64->128), so if there was a bug, you'd think it would show up before then.  And there's one more reallocation before the heap gives out (at 256 elements), but it works fine up until we exhaust memory.

So it's just when we get our 48th element, and the hashtable grows to 128 entries, that it loses the values for the first 48 entries.

It's a bug.  I'll find it.

* * *

More of a concern, though, is that I can't even squeeze 256 entries into the table before the heap exhausts itself.   Granted there's more plumbing here, and apparently that's important to support a lot of stuff in the script language itself... and I can live with it FOR NOW.  But I think where this is heading, is that ASSUMING I ever finish, I will have to rewrite the code to rely more heavily on banked RAM.  And, perhaps, I'll find elegant ways to slim the code down.

* * *

One thing I have been idly thinking about is rewriting the tokenizer in assembly and putting THAT in a RAM bank.  

If I manage to do that correctly, then I could look at rewriting the VM in assembly.

But first things first.

 

Edited by rje

Share this post


Link to post
Share on other sites
  • 0

I re-visited my interpreter's codebase, and I think I know the *general* problem.

The original code is vanilla C (C99 I think), written for modern computers.  

So I got fancy, starting with the tokenizer, and am storing the source code in Bank 1 -- which includes identified string constants -- and storing the token invariant structures in Bank 2 (they're 6 bytes each).  This has two or three benefits:

(1) I don't have to use the heap until the compiler kicks in.
(2) It's really trivially easy to traipse through the token stream.  Position = 6 x token number.  The token itself has an integer start position index back into the Bank 1 original source for strings.  It's a thing of beauty:
 

typedef struct {
   TokenType type;           // (it's a uint8_t)
   uint8_t   length;       
   int       start_position; // BANK OFFSET (NOT A POINTER ANY MORE)
   int       line;
} Token;

Except.

Except that guy's code assumes all this is allocated from the heap, which also has advantages:

(1) He can internalize string constants easily -- they're just in a hashtable.
(2) He can compare those internalized strings easily -- just use ==, because he's bijiggered the code so that common strings point to the same internalized constant string address.
(3) Other benefits.

This means when I got to the global-variables implementation, his codebase took things for granted that I've already deviated from.  

Toss in a little rustiness with C, and voila, I have to code my way out of this.

More as it occurs to me.

Edited by rje

Share this post


Link to post
Share on other sites
  • 0

I think I've localized and worked around a bug in the code, without actually finding the bug itself.  Therefore, I am guessing I did something wrong with a pointer in (if I'm lucky) one function.  Essentially the function which should fetch a value from a hash table isn't doing its job.  For the time being, I'm using lower-level search code.  If this is not a systemic error, and just a localized fubar, then I'm still OK.  

At any rate I am impatient and want to move along.  Rest assured, if there's a systemic problem it will impede progress.

 

Edited by rje

Share this post


Link to post
Share on other sites
  • 0
On 12/9/2020 at 7:54 PM, SlithyMatt said:

I don't think malloc really works with cc65/X16. You'll need to come up with a statically-allocated solution.

I gave malloc a shot and it seems to work on both X16 and C64. Example code below.

void try_malloc() {
unsigned int total = 0;
char *str[15];
unsigned char i;
unsigned int m_size = 3000;
const unsigned char reps = sizeof(str) / sizeof(str[0]);
 
// allocate memory 1. run
for(i = 0; i < reps; i++) {
str[i] = malloc(m_size);
if(str[i] != NULL)
total += m_size;
printf("Alloc %u: %04X (total %u)\n\r", (unsigned int)i, str[i], total);
}
 
// deallocate half
for(i = 0; i < reps; i++) {
if (str[i] != NULL && i % 2 == 1) {
printf("Free %u: %04X (total %u)\r\n", (unsigned int)i, str[i], total);
free(str[i]);
total -= m_size;
}
}
// allocate memory 2. run
m_size = 2000;
printf("malloc size: %u\r\n", m_size);
for(i = 0; i < reps; i++) {
str[i] = malloc(m_size);
if(str[i] != NULL)
total += m_size;
printf("Alloc %u: %04X (total %u)\n\r", (unsigned int)i, str[i], total);
}
}

Share this post


Link to post
Share on other sites
  • 0
3 hours ago, aex01127 said:
On 12/9/2020 at 1:54 PM, SlithyMatt said:

I don't think malloc really works with cc65/X16. You'll need to come up with a statically-allocated solution.

I gave malloc a shot

Did you check the debugger to see where the heap was placed? I have a feeling it is in a very limited part of main RAM, unless someone wrote an X16-specific version of malloc as part of the cc65 baseline.

Share this post


Link to post
Share on other sites
  • 0

THE DEBUGGER!!!     Yes only this week I realized that the X16 can run in debug mode.

When I circle back around to my little project here, I will be using that debugger.

 

  • Like 1

Share this post


Link to post
Share on other sites
  • 0
5 hours ago, SlithyMatt said:

Did you check the debugger to see where the heap was placed? I have a feeling it is in a very limited part of main RAM, unless someone wrote an X16-specific version of malloc as part of the cc65 baseline.

The heap comes right after the BSS segment that is the last part of the program. __HIMEM__ is defined as $9F00 on the X16. And CC65 allocates by default 2K for its own stack right before HIMEM. So $9F00-$0400-END_OF_BSS is what you have as available memory. My testprogram above ends at $1802. That gives me about 32k of usable memory. ($82FE to be exact.) This is also confirmed by the program above that manages to allocate approx 32k.

On the C64 the ROM is mapped out so you get almost 55k of usable RAM.

Share this post


Link to post
Share on other sites
  • 0
On 12/30/2020 at 11:26 PM, rje said:

THE DEBUGGER!!!     Yes only this week I realized that the X16 can run in debug mode.

When I circle back around to my little project here, I will be using that debugger.

 

rje, dunno the status of your endeavour but if you're still playing with the debugger, I've tweaked it a bit so I can use symbols... which is bloody useful to be able to sync the asm with the source 😉

Here is the PR: https://github.com/commanderx16/x16-emulator/pull/313

  • Like 1
  • Thanks 2

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Answer this question...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


×
×
  • Create New...

Important Information

Please review our Terms of Use