"Lots" of data files


rje

Question

This is the contrapositive version of the "Big" data file question.

One of the BASIC programs I submitted to x16-demo is a Traveller game, where the player jumps around a subsector of space, trading goods.  Because I stored all data in DATA statements, the program reached about 38k in size -- a major accomplishment if I do say so!  But that's okay, it was a proof of concept.

Next step is to put data into a binary format that I could load into banked RAM.  Now, I'm thinking that I can dynamically load data as I need it from a set of files on SD.

For example, say I have 2,000 star charts, and each is 8K in size.  I don't need to shove them all into RAM; I only need the "current" chart, and perhaps its neighbors.  Does that sound reasonable to everyone?

 


8 answers to this question


If you group the star charts in 4x4 groups, that's 125 files for 2000 star systems, and each file is 128K, occupying 16 segments. You could have a rectangular master map and compute which cluster you need when leaving a system that is on the edge of a cluster.

And with only 125 files, each "cluster" map can be indexed with a single byte, so you can have a linked master map and aren't limited to a rectangular master map if you don't want to be.
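A minimal sketch of that cluster lookup, written in Python purely for illustration (for example, as part of an offline tool that builds the files). Only the 4x4 grouping and the 125-file count come from the post above; the 100x20 master map, the row-major file numbering, and the name `locate_chart` are assumptions.

```python
CLUSTER_SIDE = 4            # 4x4 star charts per cluster file (from the post)
MAP_WIDTH = 100             # hypothetical master map width, in charts
MAP_HEIGHT = 20             # hypothetical master map height, in charts
CLUSTERS_PER_ROW = MAP_WIDTH // CLUSTER_SIDE   # 25 clusters across

def locate_chart(col, row):
    """Return (cluster_id, slot) for the chart at master-map (col, row).

    cluster_id selects one of the 125 files and fits in a single byte;
    slot (0-15) is the chart's position inside that file.
    """
    if not (0 <= col < MAP_WIDTH and 0 <= row < MAP_HEIGHT):
        raise ValueError("chart is off the master map")
    cluster_id = (row // CLUSTER_SIDE) * CLUSTERS_PER_ROW + col // CLUSTER_SIDE
    slot = (row % CLUSTER_SIDE) * CLUSTER_SIDE + col % CLUSTER_SIDE
    return cluster_id, slot
```

Crossing a cluster edge shows up as a change in `cluster_id`, which is the cue to load a different file.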

  • Super Administrators

The best way to do that may be to have a single, large data file, and you seek to specific records to load the systems you're interested in. 

Page 26 of the VIC-1541 manual explains how random access files work in CBM-DOS:
https://www.commodore.ca/manuals/pdfs/commodore_vic_1541_floppy_drive_users_manual.pdf
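To illustrate the idea of seeking to a record (not the CBM-DOS commands themselves), here is a rough Python sketch that runs on a host machine. It assumes fixed-size records of 8K each, the chart size mentioned in the question; `read_chart` and the in-memory demo file are made up for the example.

```python
import io

RECORD_SIZE = 8 * 1024   # one star chart, using the 8K figure from the question

def read_chart(data_file, index):
    """Read chart number `index` from a file of fixed-size records."""
    data_file.seek(index * RECORD_SIZE)
    record = data_file.read(RECORD_SIZE)
    if len(record) != RECORD_SIZE:
        raise EOFError(f"chart {index} is past the end of the file")
    return record

# Demo with an in-memory stand-in for the data file: three charts,
# each filled with its own index so the result is easy to check.
demo = io.BytesIO(b"".join(bytes([i]) * RECORD_SIZE for i in range(3)))
chart = read_chart(demo, 2)
```

Because every record has the same length, the byte offset is just the record number times the record size, so no index table is needed.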

However, I'm not sure this will be implemented any time soon in the emulator. So the next-best system would be a variation of what Matt has suggested. Create a set of 10 subdirectories: \0\, \1\, \2\, \3\, etc., then place 100 files in each directory. Directory 0 would hold files 000-099, directory 1 would hold files 100-199, and so on.
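The directory for a given file falls out of an integer division. A small Python sketch, assuming the directory is named by the hundreds digit and the file by a zero-padded three-digit number (the name `chart_path` and the exact path format are hypothetical):

```python
FILES_PER_DIR = 100   # from the post: 100 files in each subdirectory

def chart_path(number):
    """Map a chart number to 'directory/file', e.g. 123 -> '1/123'."""
    if number < 0:
        raise ValueError("chart number must be non-negative")
    return f"{number // FILES_PER_DIR}/{number:03d}"
```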

I'm also not sure there's even a need to store more than one star system in memory. If you look at Elite, for example, you'll probably do all of your hopping around in one system at sublight speed, then warp to another system to continue your trading. That warp jump is actually your loading screen, which is where you load the destination system. 

 

On 8/9/2020 at 9:04 PM, BruceMcF said:

If you group the star charts in 4x4 groups, that's 125 files for 2000 star systems, and each file is 128K, occupying 16 segments. You could have a rectangular master map and compute which cluster you need when leaving a system that is on the edge of a cluster.

And with only 125 files, each "cluster" map can be indexed with a single byte, so you can have a linked master map and aren't limited to a rectangular master map if you don't want to be.

Yes, I've gotten highly geeky over storage.

 

I settled on the "subsector", which is a fundamental unit of Traveller interstellar space.  It's basically a game map of 8x10 parsec-sized hexes, each of which could hold a star system.  It's enough room to support interstellar travel plus local fun.

The star system itself is encoded in about 48 bytes -- enough space for useful information for travel, trade, exploration, combat.

Thus, a subsector is 3840 bytes of star system data, plus a 256 byte header for metadata.  4K exactly.

There are 2,000 mapped subsectors in the default supported timeline for "Charted Space".  You can view them here:  http://travellermap.com

 

So that's 8 megabytes.

 

But you're right -- I *could* group data in sectors, which is a 4x4 matrix of subsectors.  I reckon that would take up 64K, not 128K, but same difference, and would (as you note) reduce the file count drastically, to a manageable 125.
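The figures above can be checked with a few lines of arithmetic. This Python snippet only restates numbers already given in the post (8x10 hexes, 48-byte records, a 256-byte header, 2,000 subsectors, 4x4 subsectors per sector); the variable names are mine.

```python
HEXES = 8 * 10            # hexes per subsector
RECORD = 48               # bytes per star system
HEADER = 256              # bytes of subsector metadata
SUBSECTORS = 2000
PER_SECTOR = 4 * 4        # subsectors per sector

subsector_bytes = HEXES * RECORD + HEADER        # 3840 + 256
total_bytes = SUBSECTORS * subsector_bytes
sector_bytes = PER_SECTOR * subsector_bytes
sector_files = SUBSECTORS // PER_SECTOR

print(subsector_bytes, total_bytes, sector_bytes, sector_files)
# prints: 4096 8192000 65536 125
```

So the whole data set is 8,192,000 bytes (about 7.8 MiB, or "8 megabytes" in round numbers), and a sector file is exactly 64K.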

 

I note that loading files into the X16 is quite fast.  Maybe loading a sector at a time isn't so bad.

 

There's also a procedural process for generating system data, so that's an easier way to plug in data.  But OF COURSE I want to do things properly and support existing data... and y'know, there's multiple periods in time, each with different data... etc.  It's a fun task.

 

 

On 8/19/2020 at 9:09 AM, rje said:

I note that loading files into the X16 is quite fast.  Maybe loading a sector at a time isn't so bad

 

Yes, that's the trade-off. If a sector at a time doesn't load fast enough, then you could cluster them into 2x2 quarters of a sector, making a 16K file. Of course, you then have 500 of them ... sectors are a better unit if they are workable ... but if loading time for a sector is an issue, loading the quarter that you are entering and then using a set portion of the game cycle to pre-buffer the rest of the sector would be straightforward, and would only require 64K of High RAM.

So, up to sector size, use the biggest chunk that can load from the SD card without loading time becoming a noticeable issue. If that's smaller than a full sector, buffering in the rest of the sector as you go is one strategy to consider.
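As a rough illustration of that pre-buffering idea, here is a Python sketch. It loads the quarter being entered right away and then one further quarter per game cycle; the class name, the one-quarter-per-tick pacing, and the `load_quarter` callback are all assumptions made for the example.

```python
QUARTERS = 4   # a sector split into 2x2 quarters, 16K each

class SectorBuffer:
    """Tracks which quarters of the current sector are in High RAM."""

    def __init__(self, load_quarter):
        self.load_quarter = load_quarter   # callback that does the real I/O
        self.loaded = set()
        self.pending = []

    def enter(self, quarter):
        """Load the quarter being entered now; queue the other three."""
        self.loaded = {quarter}
        self.load_quarter(quarter)
        self.pending = [q for q in range(QUARTERS) if q != quarter]

    def tick(self):
        """Call once per game cycle: load at most one queued quarter."""
        if self.pending:
            quarter = self.pending.pop(0)
            self.load_quarter(quarter)
            self.loaded.add(quarter)

# Demo: record the order in which quarters would be read.
order = []
buf = SectorBuffer(order.append)
buf.enter(2)
for _ in range(5):
    buf.tick()
```

After three ticks the whole sector is resident, and further ticks do nothing until the next `enter`.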

On 8/7/2020 at 5:32 PM, TomXP411 said:

The best way to do that may be to have a single, large data file, and you seek to specific records to load the systems you're interested in. 

...the next-best system would be a variation of what Matt has suggested. Create a set of 10 subdirectories: \0\, \1\, \2\, \3\, etc., then place 100 files in each directory. Directory 0 would hold files 000-099, directory 1 would hold files 100-199, and so on.

These look like the two best options.

Let's say I need to be able to display a worst-case scenario of 50 systems surrounding the "current position", but it's usually fewer than ten systems.  I'll put those systems into RAM so I can run various queries on them.

Calculation isn't a problem: I know the coordinates I need to fetch from.  So this is merely a file storage exercise.

The "large file" scenario is the best one, because then I can have a SEEK-based moving window.

Alternately, subdirectories could be built based on the coordinate system itself.
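One way to picture the SEEK-based moving window is as a list of byte offsets to visit. The Python sketch below assumes the window is made of whole 4K subsector maps (the size worked out earlier in the thread), stored row by row in one large file on a hypothetical 100x20 grid; `window_offsets` is an illustrative name, not an existing routine.

```python
SUBSECTOR_BYTES = 4096    # 4K subsector map, as worked out earlier in the thread
GRID_W, GRID_H = 100, 20  # hypothetical layout of the 2,000 subsectors

def window_offsets(col, row, radius=1):
    """Byte offsets of every subsector within `radius` of (col, row).

    Subsectors are assumed to be stored row by row in one large file;
    positions that fall off the edge of the grid are skipped.
    """
    offsets = []
    for r in range(max(0, row - radius), min(GRID_H, row + radius + 1)):
        for c in range(max(0, col - radius), min(GRID_W, col + radius + 1)):
            offsets.append((r * GRID_W + c) * SUBSECTOR_BYTES)
    return offsets
```

Within one row of the window the offsets are contiguous, so each row could be fetched with a single seek and one longer read.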

 


I suffer from premature optimization. 

I'm continually delayed when thinking about data storage for my Traveller-Trader game. 

I've done and re-done the same structure many, many, many times, compressing some data while leaving other data alone, thinking about what's easy to use versus what representation reduces wasted memory.

First, I'll show you the uncompressed, raw data.  Then I'll show you where I've been taking it.

 

The unit of data is the Universal World Profile, which actually includes some star system information, as well.  It is data sufficient to run the game, and to add a bit of useful in-game descriptive color.  It looks more or less like this:

	[03,10, Regina, A,788899,12, N,S, Ag Ri Cp Sa, G, 138, D, Im, G0 V, M0 D]
	

Now here's one form of the record I devised for these.

The map is typically a 50% utilized 8x10 sparse matrix.  I like to store the "empty" locations too, because the absence of a star doesn't mean nothing's there.  In other words, I can hide objects there.

$00-$0F "Regina" is the name, and I allow 16 bytes for the name.
$10     "A" is the starport code, with a range of [A,B,C,D,E,X].
$11-$13 "788899" are 4-bit values representing world characteristics.  I store them in 3 bytes.
$14     "12" is the tech level, with a range from 0 to 34.
        "N,S" are bases.  A system may have up to 2 of these from a restricted set of codes.  I typically combine them with the starport into a one-byte index.
$15-$16 "Ag Ri Cp Sa" are trade comments.  I pack these into bitfields across two bytes.
        "G" is the trade safety rating, from the range [Green, Amber, Red].  I usually pack it with the trade comments.
$17     "138" are digit codes for number of planetoid belts (1), number of gas giants (3), and number of other worlds (8) in the system.  I typically reduce these into one byte.
$18     "D" is the highest-ranking nobility present; I store it in one byte.
$19     "Im" is a code for the empire this world belongs to ("Imperium" in this case).  I store the two characters as is.
$1A-$1F "G0 V" and "M0 D" are the two stars in the system.  These fit nicely into six bytes to store up to three stars.
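A sketch of two of the packings described above, in Python as it might appear in an offline tool that builds the binary files. It covers only the world characteristics and the trade comments. The bit assignments in `TRADE_BITS`, the little-endian byte order, and the function names are assumptions, and the nibble packing assumes every characteristic is in the range 0-15.

```python
# Hypothetical bit assignments for trade comments; only these four codes
# appear in the example record, and the real set is larger.
TRADE_BITS = {"Ag": 0, "Ri": 1, "Cp": 2, "Sa": 3}

def pack_uwp(digits):
    """Pack six 4-bit characteristics ('788899') into 3 bytes."""
    if len(digits) != 6:
        raise ValueError("expected six characteristics")
    values = [int(d, 16) for d in digits]   # assumes each value is 0-15
    return bytes((values[i] << 4) | values[i + 1] for i in range(0, 6, 2))

def pack_trade(codes):
    """Pack trade comments into a two-byte bitfield (little-endian)."""
    field = 0
    for code in codes.split():
        field |= 1 << TRADE_BITS[code]
    return field.to_bytes(2, "little")

uwp = pack_uwp("788899")             # three bytes: $78 $88 $99
trade = pack_trade("Ag Ri Cp Sa")    # two bytes:   $0F $00
```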
	

A 32 byte record is very tidy.  With a 256 byte header the subsector map fits in 2816 bytes.  When I need to read in four of these, I'm only reading 11K.

On the other hand, it's a tight space.  I was also thinking about a 48 byte record size, which would make the subsector map exactly 4K in size.  Reading four of those in would take 16K... not much more than 11K.

X16 Considerations

Either way, if I read in four subsector maps, I'd be using two RAM banks.  Therefore it seems better to go for the larger record size and have a little extra wasted space.
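The bank count for both record sizes can be checked with a short Python calculation. It uses the 8K High RAM bank size and the figures from this post, and it assumes the four maps are stored back to back; `four_map_cost` is just an illustrative helper.

```python
import math

BANK = 8 * 1024   # X16 High RAM bank size
HEXES = 8 * 10    # hexes per subsector
HEADER = 256      # bytes of subsector metadata

def four_map_cost(record_size):
    """Return (bytes per map, bytes for four maps, banks needed)."""
    one_map = HEXES * record_size + HEADER
    four = 4 * one_map
    return one_map, four, math.ceil(four / BANK)

for size in (32, 48):
    print(size, four_map_cost(size))
# prints:
# 32 (2816, 11264, 2)
# 48 (4096, 16384, 2)
```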

Anyone have suggestions?

Even if it's "STOP WORRYING ABOUT IT".

On 9/11/2020 at 10:35 PM, rje said:

I suffer from premature optimization. 

I'm continually delayed when thinking about data storage for my Traveller-Trader game. 

...

A 32 byte record is very tidy.  With a 256 byte header the subsector map fits in 2816 bytes.  When I need to read in four of these, I'm only reading 11K.

On the other hand, it's a tight space.  I was also thinking about a 48 byte record size, which would make the subsector map exactly 4K in size.  Reading four of those in would take 16K... not much more than 11K.

X16 Considerations

Either way, if I read in four subsector maps, I'd be using two RAM banks.  Therefore it seems better to go for the larger record size and have a little extra wasted space.

Anyone have suggestions?

Even if it's "STOP WORRYING ABOUT IT".

Code space is scarcer than data storage space ... go for the format that is easier to code.

A subsector map of exactly 4K is very appealing, because it means you are either reading the "lower subsector" or the "higher subsector" in any given HighRAM bank. Simplicity like that generally implies a smaller code footprint. And a smaller code footprint respects the relative availability of LowRAM vs HighRAM.
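To show how little address arithmetic the 4K layout needs, here is a Python sketch of the lower/higher lookup. The banked-RAM window starting at $A000 and the 8K bank size are X16 facts; starting at bank 1 and the name `slot_location` are assumptions for the example.

```python
BANK_WINDOW = 0xA000      # start of the X16 banked-RAM window
SUBSECTOR_BYTES = 0x1000  # 4K per subsector map
FIRST_BANK = 1            # assumption: bank 0 is left to the system

def slot_location(slot):
    """Return (bank, address) for the slot-th loaded subsector map."""
    bank = FIRST_BANK + slot // 2
    address = BANK_WINDOW + (slot % 2) * SUBSECTOR_BYTES
    return bank, address
```

Even slots land at $A000 (the "lower subsector") and odd slots at $B000 (the "higher subsector"), so on the 6502 side this reduces to a shift and a single bit test.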
