[Helma-user] CouchDB beginnings
Joshua Paine
joshua at papercrown.org
Thu Dec 27 22:30:34 CET 2007
Maksim Lin for technical support mailling lists wrote:
> In the past it's been suggested using separate subfolders for helma DBs
> to help with scaling to large numbers of objects and I've been toying
> with a few ideas but I don't have anything yet that's even workable for
> the moment.
I assume you're talking about splitting helma's DB directory up into
subfolders based on the id number of the object, as we'd discussed a
while ago. I've been wondering whether that is really an optimization
anymore on modern Linux, and this post, which just showed up on
programming.reddit.com, has some benchmarks that suggest maybe not:
http://ygingras.net/b/2007/12/too-many-files%3A-reiser-fs-vs-hashed-paths
For up to 2^20 files (as far as he went), modern ext3 is slower with the
hashed directories than with all the files in one directory.
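For anyone who wants to try this on their own box, here is a rough sketch of that kind of comparison in Python. It is not the benchmark from the linked post: the function names, the MD5-based bucketing, the ".xml" file names and the file count are all my own assumptions, and it only times file creation, not lookups or cold-cache reads.

```python
import hashlib
import os
import tempfile
import time


def create_files(root, n, hashed):
    """Create n tiny files under root and return the elapsed seconds.

    hashed=False puts every file directly in root.
    hashed=True spreads them over up to 256 subdirectories (assumed scheme).
    """
    start = time.perf_counter()
    for i in range(n):
        if hashed:
            # First two hex digits of the digest pick one of 256 buckets.
            bucket = hashlib.md5(str(i).encode("utf-8")).hexdigest()[:2]
            directory = os.path.join(root, bucket)
            os.makedirs(directory, exist_ok=True)
        else:
            directory = root
        with open(os.path.join(directory, f"{i}.xml"), "w") as f:
            f.write("x")
    return time.perf_counter() - start


def compare(n=10000):
    """Time the flat and hashed layouts, each in its own temp directory."""
    results = {}
    for name, hashed in (("flat", False), ("hashed", True)):
        with tempfile.TemporaryDirectory() as root:
            results[name] = create_files(root, n, hashed)
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds:.3f}s")
```

The numbers will depend heavily on the filesystem, its mount options and how much of the directory data is already cached, so treat any single run as a hint rather than a result.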
OTOH, the hash scheme there uses two levels of directory to get a total
of 256 file-containing directories (./8/8/lots), which seems an
unnecessary complication when one dir with 256 subdirs (./256/lots) would
do fine.
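To make the two layouts concrete, here is a minimal sketch in Python. The details are assumptions on my part, not taken from the linked post or from helma: I use one hex digit of an MD5 digest per directory level (16 x 16 = 256 leaf directories) for the two-level case, two hex digits for the single-level case, and hypothetical "db" and "<id>.xml" names.

```python
import hashlib
from pathlib import PurePosixPath


def _digest(object_id):
    # MD5 is used here only to spread ids evenly, not for security.
    return hashlib.md5(str(object_id).encode("utf-8")).hexdigest()


def two_level_path(object_id, root="db"):
    """Two directory levels, one hex digit each: 16 * 16 = 256 leaf dirs."""
    h = _digest(object_id)
    return PurePosixPath(root) / h[0] / h[1] / f"{object_id}.xml"


def one_level_path(object_id, root="db"):
    """One directory level named by two hex digits: 256 leaf dirs."""
    h = _digest(object_id)
    return PurePosixPath(root) / h[:2] / f"{object_id}.xml"


if __name__ == "__main__":
    for object_id in (1, 2, 12345):
        print(two_level_path(object_id), one_level_path(object_id))
```

Both functions put a given id in the same one of 256 buckets; the only difference is whether reaching that bucket costs one directory lookup or two.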
Just some food for thought. The only really winning strategy Gingras
finds is to make sure you have plenty of memory and let the OS do its
caching thing.
-Joshua