[ZODB-Dev] how big can a ZODB get?

Jeremy Hylton jhylton at gmail.com
Wed Jan 12 13:17:24 EST 2005


On Wed, 12 Jan 2005 12:27:12 -0500, Victor Ng <crankycoder at gmail.com> wrote:
> Ok - so documentation is sparse, is there a set of classes in the
> ZODB that I should look at for tuning?  I've generated an epydoc for
> ZODB 3.3 to help me poke through the code as a starting guide for myself.

Here are some quick thoughts about tuning:

An application uses one or more database connections
(ZODB.Connection).  Each connection has its own in-memory cache of
objects.  The cache size is measured in objects, even though the
amount of memory used by individual objects varies greatly.  The cache
also keeps "ghost" objects, each of which uses on the order of 100
bytes of memory; it keeps a ghost for every object that is directly
(one-hop) reachable from cached objects.
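As a back-of-envelope illustration of what that ghost overhead can add
up to (the cache size and fan-out below are invented for illustration;
only the ~100 bytes per ghost comes from the paragraph above):

```python
# Rough, hypothetical estimate -- cache_objects and ghosts_per_object
# are made-up numbers; only the per-ghost cost is from the text above.
cache_objects = 5000        # a typical cache-size setting
ghosts_per_object = 10      # assumed average one-hop fan-out
bytes_per_ghost = 100       # "on the order of 100 bytes"

ghost_overhead = cache_objects * ghosts_per_object * bytes_per_ghost
print(ghost_overhead)  # 5000000 bytes, i.e. roughly 5 MB just for ghosts
```

The point is only that ghosts are cheap individually but can dominate
memory if your object graph fans out widely.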

Since an application may have multiple database connections, it may
often have multiple copies of objects in memory.

You have rough control over cache memory by changing the number of
objects it holds.  There are also some methods to flush or minimize
the cache -- evicting objects even if the cache has room for more.
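If you configure the database through ZConfig, the cache knob looks
something like this (the path and the number are placeholders, not
recommendations):

```
<zodb>
  cache-size 5000
  <filestorage>
    path Data.fs
  </filestorage>
</zodb>
```

The flush/minimize methods live on the connection object itself; look
for the cache-related methods (e.g. cacheMinimize) in
ZODB.Connection.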

The way you implement objects has an obvious impact on memory usage.
ZODB loads a first-class persistent object as an atomic unit; it also
loads all reachable second-class persistent objects at the same time.
If a 1st class object refers to many 2nd class instances, loading it
pulls a large object graph into memory at once.  (On the other hand,
you effectively batch-load many objects.)  If you use more 1st class
objects and/or keep individual objects small, you may get more
efficient memory usage, because no single cache entry holds a large
subgraph.  On the other hand, there's more memory overhead for each
1st class persistent object.

The storage implementation also needs memory.  For example,
FileStorage maintains an in-memory index with an entry for each oid in
the storage.  I don't recall exactly how much memory is used per
object.  A very large storage will need a fair amount of memory for
the storage.  ZEO is handy here, because the storage runs in the ZEO
server process (potentially on a different machine).
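To get a feel for the index cost, you can measure a stand-in structure
(this is a plain dict mapping fake 8-byte oids to file offsets, not
FileStorage's actual index class, and the object count is arbitrary):

```python
import sys

# Hypothetical stand-in for FileStorage's oid -> file-position index.
n_objects = 100_000
index = {oid.to_bytes(8, "big"): oid * 64 for oid in range(n_objects)}

# getsizeof covers only the dict's hash table, not the key and value
# objects, so the real per-object footprint is larger still.
per_object = sys.getsizeof(index) / n_objects
print(round(per_object, 1), "bytes of dict overhead per object")
```

Whatever the exact constant, the cost scales with the number of oids,
which is why a very large storage wants that memory on the ZEO server
rather than in every client.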

Jeremy

