[ZODB-Dev] ZODB idioms

Tim Peters tim@zope.com
Mon, 24 Jun 2002 12:18:53 -0400


[Jeremy Hylton]
> ...
> The particular problem of sequential keys is unfortunate, but hardly
> catastrophic.  The BTree uses twice as much space as it would under
> ideal conditions.

Does it really?  People should think about this for real <wink>.  Classic
BTrees have nodes all of exactly the same size in memory, and on disk.
Zope's do not.  The space for keys and/or values is in separately malloc'ed
chunks of memory, and grows only as big as needed in each node (well,
there's some (I suspect) too-aggressive over-allocation, but let's ignore
that).  On disk, the pickle only contains space for key and/or value slots
that are actually used (with no over-allocation).

So if you take two versions of the same tree, one with buckets 50% full, and
the other with buckets 100% full, the amount of memory devoted to holding
keys and values is approximately the same, and the amount devoted to that in
pickles is exactly the same.  What differs is the number of bucket and BTree
nodes.  The II flavor of trees have bucket capacities of 120 key/value
pairs, so it's the difference between N/120 and N/60 bucket nodes.  It's
about 1% either way <wink>, and the one with the smaller buckets may well be
*less* likely to lead to conflicts (picture the extreme:  if the tree
stuffed everything into just one bucket, any pair of chnages would
conflict).

If we have a model for balancing these tradeoffs, I haven't found it.