[Helma-user] CouchDB beginnings
Maksim Lin for technical support mailling lists
maksim_lin at ngv.vic.gov.au
Mon Dec 31 00:01:46 CET 2007
Hi Joshua,
Thanks for the pointer! - that really is an excellent article.
Even though I deploy to production exclusively on Linux, I do a lot of
dev and some testing on Windows boxes, and it was actually the poor
performance of Windows/NTFS with large numbers of files per folder that
had me worried. I never even checked what performance was like on
Linux!
I also only did some very simple tests: creating a large number of
files in a folder and then trying to just look at the folder with
Explorer. That's a pretty typical use case on Windows systems, though,
and the results were pretty bad.
However, I don't have any real test data, so I will try running some
simple benchmarks like the ones in that article and see what the results
are. I guess it could be that NTFS is OK too, and it's really only the
Explorer app itself that has difficulties with large numbers of files
per folder.
Anyway, I won't speculate any further without getting some data first
:-)
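
A minimal sketch of the kind of simple benchmark described above, in
Python rather than anything Helma-specific. It times creating, listing
and randomly reading files in a single flat directory. The function name
bench_flat_dir, the .xml file names and the file counts are all made up
for illustration, and the numbers will depend heavily on OS caching, so
treat any result as a rough indication only.

```python
import os
import random
import tempfile
import time


def bench_flat_dir(num_files, sample=1000):
    """Time create / list / random-read for num_files files in one directory.

    Hypothetical helper for illustration; not part of Helma.
    """
    timings = {}
    with tempfile.TemporaryDirectory() as root:
        # Create num_files tiny files in a single flat directory.
        start = time.perf_counter()
        for i in range(num_files):
            with open(os.path.join(root, "%d.xml" % i), "w") as f:
                f.write("x")
        timings["create"] = time.perf_counter() - start

        # List the directory (roughly what a file browser has to do first).
        start = time.perf_counter()
        names = os.listdir(root)
        timings["list"] = time.perf_counter() - start

        # Open and read a random sample of the files by name.
        picks = random.sample(names, min(sample, len(names)))
        start = time.perf_counter()
        for name in picks:
            with open(os.path.join(root, name)) as f:
                f.read()
        timings["random_read"] = time.perf_counter() - start

        timings["count"] = len(names)
    return timings


if __name__ == "__main__":
    # Assumed sizes; raise them to approach the file counts in the article.
    for n in (1000, 10000):
        print(n, bench_flat_dir(n))
```

Running the same script on an NTFS volume and on an ext3 volume would
at least separate filesystem behaviour from whatever Explorer itself is
doing.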
Maks.
> -----Original Message-----
> From: helma-user-bounces at helma.org
> [mailto:helma-user-bounces at helma.org] On Behalf Of Joshua Paine
> Sent: Friday, 28 December 2007 08:31
> To: Helma User Mailing List
> Subject: Re: [Helma-user] CouchDB beginnings
>
> Maksim Lin for technical support mailling lists wrote:
> > In the past it's been suggested using separate subfolders for helma
> > DBs to help with scaling to large numbers of objects, and I've been
> > toying with a few ideas, but I don't have anything yet that's even
> > workable for the moment.
>
> I assume you're talking about splitting helma's DB directory
> up into subfolders based on the id number of the object, as
> we'd discussed a while ago. I've been wondering whether that
> is really an optimization anymore on modern Linux, and this
> post just showed up on programming.reddit.com with some
> benchmarks that suggest maybe not:
>
> http://ygingras.net/b/2007/12/too-many-files%3A-reiser-fs-vs-hashed-paths
>
> For up to 2^20 files (as far as he went), modern ext3 is
> slower with the hashed directories than with all the files in
> one directory.
>
> OTOH, the hash scheme there uses two levels of directory to
> get a total of 256 file-containing directories (./8/8/lots),
> which seems an unnecessary complication when one dir with 256
> subdirs (./256/lots) would do fine.
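
To make the two layouts concrete, here is a rough Python sketch of
mapping an object id to a path. It assumes ids are plain integers and
picks the bucket with a simple modulo, and it assumes the two-level
variant uses 16 directories per level to reach 256 leaf directories.
Neither assumption comes from the article or from helma itself, and the
function names and the .xml suffix are hypothetical.

```python
import os


def one_level_path(root, obj_id, buckets=256):
    """One directory level: root/<bucket>/<id>.xml (hypothetical layout)."""
    bucket = obj_id % buckets
    return os.path.join(root, "%02x" % bucket, "%d.xml" % obj_id)


def two_level_path(root, obj_id, fanout=16):
    """Two directory levels with the same number of leaf directories
    (fanout * fanout): root/<hi>/<lo>/<id>.xml (hypothetical layout)."""
    bucket = obj_id % (fanout * fanout)
    hi, lo = divmod(bucket, fanout)
    return os.path.join(root, "%x" % hi, "%x" % lo, "%d.xml" % obj_id)


if __name__ == "__main__":
    for obj_id in (3, 515, 70000):
        print(one_level_path("db", obj_id), two_level_path("db", obj_id))
```

Both functions spread ids over the same 256 leaf directories; the
two-level version only adds an extra directory lookup per access.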
>
> Just some food for thought. The only really winning strategy
> Gingras finds is to make sure you have plenty of memory and
> let the OS do its caching thing.
>
> -Joshua
> _______________________________________________
> Helma-user mailing list
> Helma-user at helma.org
> http://helma.org/mailman/listinfo/helma-user