[Helma-user] CouchDB beginnings

Joshua Paine joshua at papercrown.org
Thu Dec 27 22:30:34 CET 2007


Maksim Lin for technical support mailling lists wrote:
> In the past it's been suggested using separate subfolders for helma DBs
> to help with scaling to large numbers of objects and I've been toying
> with a few ideas but I don't have anything yet that's even workable for
> the moment.

I assume you're talking about splitting helma's DB directory up into 
subfolders based on the id number of the object, as we'd discussed a 
while ago. I've been wondering whether that is really an optimization 
anymore on modern Linux, and a post that just showed up on 
programming.reddit.com has some benchmarks that suggest maybe not:

http://ygingras.net/b/2007/12/too-many-files%3A-reiser-fs-vs-hashed-paths

For up to 2^20 files (as far as he went), modern ext3 is slower with the 
hashed directories than with all the files in one directory.

OTOH, the hash scheme there uses two levels of directory to get a total 
of 256 file-containing directories (./8/8/lots), which seems an 
unnecessary complication when one dir with 256 subdirs (./256/lots) would 
do fine.
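
To make the two layouts concrete, here is a rough Python sketch of how an 
object id could be mapped to a path under each scheme. This is not how 
helma or the benchmark actually does it: the function names, the 
"<id>.xml" file name, the hex bucket names, and the 16 x 16 fanout used 
to reach 256 leaf directories are all my own assumptions for illustration.

```python
from pathlib import PurePosixPath


def one_level_path(object_id: int, buckets: int = 256) -> PurePosixPath:
    """Flat scheme: one directory holding `buckets` subdirectories."""
    # Bucket by id modulo the bucket count (hypothetical choice; a real
    # implementation might hash the id instead).
    bucket = object_id % buckets
    return PurePosixPath(f"{bucket:02x}") / f"{object_id}.xml"


def two_level_path(object_id: int, fanout: int = 16) -> PurePosixPath:
    """Nested scheme: fanout * fanout leaf directories over two levels."""
    bucket = object_id % (fanout * fanout)
    # Split the bucket number into a first-level and second-level part.
    first, second = divmod(bucket, fanout)
    return PurePosixPath(f"{first:x}") / f"{second:x}" / f"{object_id}.xml"


if __name__ == "__main__":
    for oid in (7, 1000, 123456):
        print(oid, one_level_path(oid), two_level_path(oid))
```

Both functions spread ids over the same 256 leaf directories; the nested 
one just costs an extra directory lookup per file, which is the 
complication I mean.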

Just some food for thought. The only really winning strategy Gingras 
finds is to make sure you have plenty of memory and let the OS do its 
caching thing.

-Joshua

