[ZODB-Dev] Re: From memory Problems to Disk Problems :-(

Barry A. Warsaw barry@zope.com
Mon, 5 Nov 2001 17:10:04 -0500


>>>>> "CW" == Chris Withers <chrisw@nipltd.com> writes:

    CW> something that the BDB tables involved are doing is causing
    CW> the cache to jump up to 536Mb :-(

    >> Which cache are you talking about?

    CW> The one on the far right of the 'swap' line in top...

That's probably related to Berkeley's cache, but it isn't the same
thing I'm talking about.

    >> Have you read the tuning pages for BerkeleyDB?

    CW> Erk.. there are tuning pages? I should probably read them...

From the README:

-------------------- snip snip --------------------
Tuning BerkeleyDB

    BerkeleyDB has lots of knobs you can twist to tune it for your
    application.  Getting most of these knobs at the right setting is
    an art, and will be different from system to system.  We're still
    working on recommendations with respect to the Full storage, but
    for the time being, you should at least read the following
    Sleepycat pages:

    http://www.sleepycat.com/docs/ref/am_conf/cachesize.html
    http://www.sleepycat.com/docs/ref/am_misc/tune.html
    http://www.sleepycat.com/docs/ref/transapp/tune.html
    http://www.sleepycat.com/docs/ref/transapp/throughput.html

    One thing we can safely say is that the default Berkeley cache
    size of 256KB is way too low to be useful.  Be careful setting
    this too high though, as performance will degrade if you tell
    Berkeley to consume more than the available resources.  For our
    tests, on a 256MB machine, we've found that a 128MB cache appears
    optimal.  YMM(and probably will!)V.

    As you read this, it will be helpful to know that the
    bsddb3Storage databases all use the BTree access method.
-------------------- snip snip --------------------
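
For what it's worth, here's a minimal sketch of turning that knob
through the bsddb3 bindings.  The environment path and flag set are
just for illustration -- the Full storage sets up its own DBEnv, so
treat this as a picture of the knob, not of our code:

    # Illustrative only -- /home/zodb/bdb is a made-up path.
    from bsddb3 import db

    env = db.DBEnv()
    # set_cachesize(gbytes, bytes, ncache) must be called before open().
    # This asks for a single 128MB cache region.
    env.set_cachesize(0, 128 * 1024 * 1024, 1)
    env.open('/home/zodb/bdb',
             db.DB_CREATE | db.DB_INIT_MPOOL | db.DB_INIT_TXN
             | db.DB_INIT_LOCK | db.DB_INIT_LOG)

If you'd rather not touch code, Berkeley also reads a DB_CONFIG file
in the environment directory; a line like "set_cachesize 0 134217728 1"
there does the same thing.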

    >> I'm not sure.  Sleepycat claims that they can handle big
    >> objects just fine.  I suppose I'd err on the side of lots of
    >> smaller commits.

    CW> That's what I thought, I'll give it a try. I've just put some
    CW> timing metrics into my script so I can play around with how
    CW> long it takes to index documents based on various permutations
    CW> of objects per commit.  Maybe you can do the same with yours and
    CW> we can compare? Is there any way to tell the absolute size of
    CW> a transaction as you commit? That would be interesting to
    CW> compare...

It's difficult, but I think if you calculate pickle size sums, you'll
get a close enough ballpark figure.  The overhead of keys and other
metadata on each transaction ought to be small compared to the pickle
data.
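
Something along these lines is what I have in mind -- a rough sketch
only, assuming the usual get_transaction() hook, and with 'documents',
'index' and the batch size N standing in for whatever your script
actually does:

    # Commit every N objects and keep a running pickle-size total as a
    # ballpark for transaction size (key/metadata overhead is ignored).
    import time
    from cPickle import dumps

    N = 100
    count = 0
    pickled = 0
    start = time.time()
    for doc in documents:
        index[doc.id] = doc                     # your real indexing work
        pickled = pickled + len(dumps(doc, 1))  # protocol 1 = binary pickle
        count = count + 1
        if count % N == 0:
            # get_transaction() is the builtin ZODB installs.
            get_transaction().commit()
            print '%d objects, ~%d pickle bytes, %.1f seconds' \
                  % (count, pickled, time.time() - start)

If we both dump numbers in that shape, comparing runs should be easy.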

-Barry