[Helma-user] using built-in db

Joshua Paine joshua at papercrown.org
Wed Jul 25 02:32:48 CEST 2007


Maksim Lin for technical support mailling lists wrote:
> of course it's hardly a fair comparison

*grin* But really, I was under the impression (which daily work also 
bears out) that NTFS has a much harder time with many-item directories 
than ext3. It looks like for up to 10k objects at least I don't have to 
worry about my production environment, and the dev environment will at 
least be tolerable.

> I guess we could do the same by creating 100 folders named 00 to 99 and
> then limiting the files in each folder to be from 000 to 999, though
> that would be setting a "hard" limit of 100k objects

Git's file names are SHA-1 hashes (160-bit; the first two hex digits name 
the folder, so 256 folders with effectively limitless # of files), so the 
distribution over the first two digits will be approximately random. Since 
Helma's filenames are sequential, it makes more sense to use the last two 
digits, giving you as even a distribution as possible between folders and 
no hard upper limit.
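
To make the last-two-digits idea concrete, here's a rough sketch. It's in 
Python rather than Helma's JavaScript, and the names shard_dir and 
shard_path, the zero-padded folder names, and the ".xml" extension are all 
just assumptions for illustration, not anything Helma does today.

```python
from collections import Counter


def shard_dir(object_id: int, buckets: int = 100) -> str:
    """Pick a folder from the last two decimal digits of a sequential ID."""
    # The modulo keeps only the trailing digits; zero-pad so 7 becomes "07".
    return f"{object_id % buckets:02d}"


def shard_path(object_id: int) -> str:
    """Build a hypothetical on-disk path such as '45/12345.xml'."""
    # The ".xml" suffix is an assumption made for this sketch.
    return f"{shard_dir(object_id)}/{object_id}.xml"


# Sequential IDs spread evenly: 10,000 objects land 100 per folder,
# and nothing stops the IDs from growing past any fixed limit.
counts = Counter(shard_dir(i) for i in range(10_000))
```

With sequential IDs, no folder ever holds more than one file more than 
any other, however many objects there are.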

> if you were going to larger data sets, starting to use a sql db
> would really make more sense.

Yeah, most likely. Still, if this project's data modeling goes well, I may 
end up trying larger sites on the built-in DB to see what happens. 
Unless Hannes pops in and tells me I'm nuts :-). I don't have a whole 
lot of objects this time, but the model still is probably about as 
complex as the larger site I'm thinking of.

-Joshua

