[Helma-user] CouchDB beginnings

Maksim Lin for technical support mailling lists maksim_lin at ngv.vic.gov.au
Mon Dec 31 00:01:46 CET 2007


Hi Joshua,

Thanks for the pointer! That really is an excellent article.

Even though I deploy to production exclusively on Linux, I do
a lot of dev and some testing on Windows boxes, and it was actually
the poor Windows/NTFS performance with large numbers of files per
folder that had me worried. I never even checked what performance was
like on Linux!
I also only did some very simple tests: creating a large number of
files in a folder and then just trying to look at the folder with
Explorer. That's a pretty typical use case on Windows systems, though,
and the results were pretty bad.
However, I don't have any real test data, so I will try running some
simple benchmarks like those in the article and see what the results
are. I guess it could be that NTFS is OK too and it's really only the
Explorer app itself that has difficulties with large numbers of files
per folder.
Anyway, I won't speculate any further without getting some data first
:-)
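
For reference, a minimal sketch of the kind of simple benchmark meant
here might look like the following. It is written in Python purely for
illustration; the file count, the single-level 256-bucket layout, the
"<id>.xml" file naming and all function names are assumptions and are
not taken from the article or from Helma's code.

```python
import os
import tempfile
import time


def flat_path(root, obj_id):
    """All files in one directory: root/<id>.xml (naming is an assumption)."""
    return os.path.join(root, "%d.xml" % obj_id)


def bucketed_path(root, obj_id, buckets=256):
    """One level of subdirectories, chosen by object id modulo `buckets`."""
    return os.path.join(root, "%02x" % (obj_id % buckets), "%d.xml" % obj_id)


def create_files(root, count, path_for):
    """Create `count` tiny files and return the elapsed seconds."""
    start = time.perf_counter()
    for obj_id in range(count):
        path = path_for(root, obj_id)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write("x")
    return time.perf_counter() - start


def read_files(root, count, path_for):
    """Read every file back by id and return the elapsed seconds."""
    start = time.perf_counter()
    for obj_id in range(count):
        with open(path_for(root, obj_id)) as f:
            f.read()
    return time.perf_counter() - start


def run(count=10000):
    """Time both layouts in fresh temporary directories."""
    results = {}
    for name, path_for in (("flat", flat_path), ("bucketed", bucketed_path)):
        with tempfile.TemporaryDirectory() as root:
            results[name] = (
                create_files(root, count, path_for),
                read_files(root, count, path_for),
            )
    return results


if __name__ == "__main__":
    for name, (created, read) in run().items():
        print("%-8s create %.3fs  read %.3fs" % (name, created, read))
```

Note that reading the files back straight after writing them mostly
measures the OS cache, so the numbers would only be a rough first
indication and say nothing about Explorer itself.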

Maks.

> -----Original Message-----
> From: helma-user-bounces at helma.org 
> [mailto:helma-user-bounces at helma.org] On Behalf Of Joshua Paine
> Sent: Friday, 28 December 2007 08:31
> To: Helma User Mailing List
> Subject: Re: [Helma-user] CouchDB beginnings
> 
> Maksim Lin for technical support mailling lists wrote:
> > In the past it's been suggested to use separate subfolders for
> > Helma DBs to help with scaling to large numbers of objects, and
> > I've been toying with a few ideas, but I don't have anything yet
> > that's even workable for the moment.
> 
> I assume you're talking about splitting Helma's DB directory 
> up into subfolders based on the id number of the object, as 
> we'd discussed a while ago. I've been wondering whether that 
> is really an optimization anymore on modern Linux, and this 
> post just showed up on programming.reddit.com with some 
> benchmarks that suggest maybe not:
> 
> http://ygingras.net/b/2007/12/too-many-files%3A-reiser-fs-vs-hashed-paths
> 
> For up to 2^20 files (as far as he went), modern ext3 is 
> slower with the hashed directories than with all the files in 
> one directory.
> 
> OTOH, the hash scheme there uses two levels of directory to 
> get a total of 256 file-containing directories (./8/8/lots), 
> which seems an unnecessary complication when one dir with 256 
> subdirs (./256/lots) would do fine.
> 
> Just some food for thought. The only really winning strategy 
> Gingras finds is to make sure you have plenty of memory and 
> let the OS do its caching thing.
> 
> -Joshua

