[ZODB-Dev] what appears inside zodb storage?

Guido van Rossum guido@python.org
Wed, 18 Dec 2002 21:28:29 -0500


> I wonder whether someone can verify my current understanding of what appears
> inside a zodb storage like file storage.  (If other storages have contents
> that differ substantially, this would be good to know in some detail.)
> 
> I understand a storage contains mainly a collection of pickled objects,
> where each object has an oid, and an index that maps oids to objects.
> In addition to this, there are transacted updates to the objects.
> 
> Maybe an updated object is updated by writing a new version entirely, and
> making the map cause the oid to refer to the new version while leaving the
> old one alone (without deleting it), so packing is needed to make storage
> smaller.

Yes.

> So my theory is that a file storage contains pickled objects and a map of
> oids to those objects, and maybe old stale versions of objects, and a
> chained linked list of transactions that allow earlier views of the world
> to be taken instead of the last one.

Pretty much.  Berkeley storage uses different data structures but
stores essentially the same conceptual info.  (It has a way of
automatically packing, i.e. garbage collecting, revisions of objects
older than a given delay.)
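
To make the model above concrete, here is a minimal sketch of an
append-only store with an oid index.  It is a toy, not the FileStorage
file format or API: the class name, the in-memory list standing in for
the file, and the simplistic pack() (which keeps only current
revisions and ignores pack times and reachability) are all assumptions
made for illustration.

```python
import pickle


class ToyAppendOnlyStorage:
    """Toy model only; not the FileStorage format or API."""

    def __init__(self):
        self.log = []       # append-only list of (tid, oid, pickle) records
        self.index = {}     # oid -> position in self.log of the current revision
        self.next_tid = 1   # hypothetical transaction id counter

    def commit(self, changes):
        """Append one new revision per changed object; old records stay."""
        tid = self.next_tid
        self.next_tid += 1
        for oid, obj in changes.items():
            self.log.append((tid, oid, pickle.dumps(obj)))
            self.index[oid] = len(self.log) - 1
        return tid

    def load(self, oid):
        """Return the current revision of the object, found via the index."""
        _tid, _oid, data = self.log[self.index[oid]]
        return pickle.loads(data)

    def pack(self):
        """Drop every record that the index no longer points at."""
        live = sorted(self.index.values())
        new_log = [self.log[pos] for pos in live]
        self.index = {rec[1]: i for i, rec in enumerate(new_log)}
        self.log = new_log
```

Committing the same oid twice leaves two records in the log with the
index pointing at the newer one; only pack() makes the log smaller.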

> When zope btrees are used and these are stored persistently (are they
> always stored persistently?) where are the btrees stored?

Each "node" in a BTree is a separate persistent object.  If a BTree
consists of 10 nodes and only 3 of those are modified by a particular
transaction, only the pickles for those 3 nodes are written as part of
the transaction record.
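
A rough sketch of that behaviour, without using the real BTrees
package: ToyNode and commit() are hypothetical names, and the
"changed" flag only imitates the role that _p_changed plays on real
persistent objects.

```python
import pickle


class ToyNode:
    """Stand-in for one persistent BTree node; hypothetical, not BTrees."""

    def __init__(self, oid, keys):
        self.oid = oid
        self.keys = list(keys)
        self.changed = False   # imitates the role of Persistent's _p_changed

    def add(self, key):
        """Modify this node only and mark it as needing to be written."""
        self.keys.append(key)
        self.keys.sort()
        self.changed = True


def commit(nodes):
    """Return {oid: pickle} for the modified nodes only, then clear flags."""
    record = {}
    for node in nodes:
        if node.changed:
            record[node.oid] = pickle.dumps(node.keys)
            node.changed = False
    return record
```

With 10 such nodes of which 3 are touched, commit() returns a record
holding exactly 3 pickles; the other 7 nodes are not rewritten.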

> Maybe I should be reading the code to verify this model, but I was hoping
> someone on this list could correct me so when I describe this to Chandler
> folks I can get it right.

The comments at the top of FileStorage.py may shed some light.

> My motivation for asking today is a desire for versions in Chandler that
> support synchronization and replication.  A simple transaction model
> (which zodb might have, but I don't know) need only have a way to indicate
> which version of an object should apply for a given transaction.  It need
> not make it easy to consult specific versions of an object.

In general the latest revision of an object is always used -- modulo
(transactional) undo.  There's also a concept of "ZODB versions" which
is really long-term locking of selected objects; don't use this.
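
As an illustration of "latest revision, modulo undo", here is a toy
sketch in which undo is itself a new transaction that writes the
previous state back as the newest revision.  ToyHistory and its
methods are made-up names, and the rule of refusing to undo when an
object has changed since is a simplification, not a description of
what any real storage does in every case.

```python
class ToyHistory:
    """Toy revision history per oid; hypothetical, not a ZODB API."""

    def __init__(self):
        self.revisions = {}   # oid -> list of (tid, state), oldest first
        self.next_tid = 1

    def commit(self, changes):
        """Record one new revision for each changed object."""
        tid = self.next_tid
        self.next_tid += 1
        for oid, state in changes.items():
            self.revisions.setdefault(oid, []).append((tid, state))
        return tid

    def load(self, oid):
        """Always return the latest revision."""
        return self.revisions[oid][-1][1]

    def undo(self, tid):
        """Undo transaction tid by committing the prior states again."""
        restored = {}
        for oid, revs in self.revisions.items():
            tids = [t for t, _state in revs]
            if tid not in tids:
                continue   # this object was not touched by tid
            if tids[-1] != tid:
                raise ValueError("object changed after this transaction")
            if len(revs) < 2:
                raise ValueError("toy cannot undo an object's creation")
            restored[oid] = revs[-2][1]
        return self.commit(restored)
```

After an undo, load() still simply returns the latest revision; that
revision just happens to carry the older state, and the undone
revision remains in the history.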

--Guido van Rossum (home page: http://www.python.org/~guido/)