[ZODB-Dev] undo and zodb

Barry A. Warsaw barry@zope.com
Tue, 3 Sep 2002 11:20:45 -0400


>>>>> "TD" == Toby Dickenson <tdickenson@geminidataloggers.com> writes:

    >> I've been hacking on Autopack, a

    TD> wooohoo! I wasn't aware of that.

And hopefully finishing it up this week, while running a bunch of
performance stress tests on a beefy machine.

    >> very simple and minimal Berkeley storage with no undo or
    >> versions.  It would still store some number of object revisions
    >> for a short period of time, primarily for performance reasons.

    TD> Yes. berkeley.Packless performs all of its magic inside the
    TD> transaction _finish, which unnecessarily increases transaction
    TD> commit time. It's better to perform this housekeeping
    TD> asynchronously.

The other advantage for Autopack is to hopefully minimize the chance
that you'll run out of locks.  BDB has a static lock table (which can
be resized before you open the environment, but not afterward), and
allocates locks on a per-page basis.  For BTrees+transactions, you get
one lock per level in the database, plus one lock per page containing
an object you're touching.  Since ZODB transactions are theoretically
unlimited in the number of objects they can affect, you can always run
out of locks.  You can mitigate this by cranking up the static lock
table, but that costs resources.  Autopack (and some of the recent
Full work) tries to take a different approach by writing as much data
as possible optimistically, during the store() calls, which are
bounded.  The problem is cleaning up afterward. :)
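
A back-of-the-envelope sketch of the lock arithmetic described above,
in Python.  The function names, the page ids, and the table size used
in the usage note are illustrative assumptions, not Berkeley DB API
calls or defaults.  It only models the rule "one lock per BTree level
plus one lock per distinct page touched" for a single table.

```python
def estimate_locks(btree_depth, touched_page_ids):
    """Rough lock count for one transaction against one BTree table.

    One lock per level of the tree, plus one lock per distinct page
    that holds an object touched by the transaction.
    """
    return btree_depth + len(set(touched_page_ids))


def fits_lock_table(btree_depth, touched_page_ids, max_locks):
    """True if the estimate fits in a lock table of max_locks entries.

    max_locks stands in for the static table size, which has to be
    chosen before the environment is opened.
    """
    return estimate_locks(btree_depth, touched_page_ids) <= max_locks
```

With a hypothetical table of 1000 locks, a transaction touching 100
pages fits comfortably, while one touching 2000 pages cannot, however
the table was sized up front:
fits_lock_table(3, range(2000), 1000) is False.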

    >> It has an autopack()
    >> method, but I'm still teasing out the semantics of that.  The
    >> idea would be to have a storage-wide setting controlling how
    >> far back transactions would be kept, although you wouldn't be
    >> able to access those older transactions through the api.

    TD> I see Autopack uses a reference count. That is efficient, but
    TD> it does mean that one small bug or database corruption can
    TD> lead to it deleting whole sections of your database. I've no
    TD> reason to think that it currently has such a bug, but I am
    TD> hoping to aim for a more fault-tolerant solution for
    TD> DirectoryStorage.

Note that the reference counting stuff isn't hooked up yet.  But that
is an issue.
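
To make Toby's concern concrete, here is a toy model in Python.  It
is not Autopack's code (the reference counting is not hooked up yet);
the object ids, the refs mapping, and both function names are made up
for illustration.  It contrasts deletion driven by stored counts with
a reachability walk from the root.

```python
def collect_by_refcount(refs, refcounts):
    """Return the oids deleted when trusting the stored counts.

    Every object whose count is zero is deleted, and deleting it
    decrements the counts of the objects it refers to, which may
    cascade.
    """
    counts = dict(refcounts)
    pending = [oid for oid, n in counts.items() if n <= 0]
    deleted = set()
    while pending:
        oid = pending.pop()
        if oid in deleted:
            continue
        deleted.add(oid)
        for child in refs.get(oid, ()):
            counts[child] -= 1
            if counts[child] <= 0:
                pending.append(child)
    return deleted


def reachable_from(refs, root):
    """Return every oid reachable from root by following references."""
    seen = set()
    stack = [root]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(refs.get(oid, ()))
    return seen


# Hypothetical four-object database: root -> folder -> doc1, doc2.
refs = {"root": ["folder"], "folder": ["doc1", "doc2"],
        "doc1": [], "doc2": []}
good_counts = {"root": 1, "folder": 1, "doc1": 1, "doc2": 1}
# Simulate one corrupted count: folder's count is wrongly zero.
bad_counts = dict(good_counts, folder=0)
```

With good_counts nothing is deleted.  With the single bad count,
collect_by_refcount(refs, bad_counts) removes folder, doc1 and doc2,
even though reachable_from(refs, "root") still finds all of them.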

    >> Off and on, we've talked about per-object controls, but we've
    >> never gotten very far.

    TD> A problem is that the storage layer only gets to deal with
    TD> pickles, rather than objects. That makes it hard to set
    TD> per-object storage-level properties.

Right, this would be a ZODB4 thing.

    TD> An easy option might be to set these controls per-class
    TD> (rather than per-object). The storage layer can easily extract
    TD> the class name from the pickle. Are there any cases where that
    TD> is not adequate?
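
A minimal sketch of the per-class idea, in Python, using only the
standard pickle module.  It assumes the first global reference in the
pickle names the object's class; the actual record layout a storage
sees is not spelled out in this thread, so treat that as an
assumption.  The names _Placeholder, _GlobalRecorder and
class_name_from_pickle are hypothetical.  No application class is
imported, which is the point: the storage never needs the real
objects.

```python
import io
import pickle


class _Placeholder:
    """Stand-in returned for every global, so no real class is loaded."""

    def __init__(self, *args, **kwargs):
        pass

    def __setstate__(self, state):
        pass


class _GlobalRecorder(pickle.Unpickler):
    """Unpickler that records (module, name) pairs instead of importing."""

    def __init__(self, file):
        super().__init__(file)
        self.globals_seen = []

    def find_class(self, module, name):
        self.globals_seen.append((module, name))
        return _Placeholder


def class_name_from_pickle(data):
    """Return 'module.name' of the first global in data, or None."""
    recorder = _GlobalRecorder(io.BytesIO(data))
    recorder.load()
    if not recorder.globals_seen:
        return None
    module, name = recorder.globals_seen[0]
    return f"{module}.{name}"
```

For example, feeding it pickle.dumps(datetime.date(2002, 9, 3)) yields
"datetime.date", and a storage could use such a string as the key
into a table of per-class settings.  A pickle with no global
reference, such as a plain list, yields None.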

I don't know.  I'd like to see what Jim thinks about that.
-Barry