[ZODB-Dev] ZODB idioms

Toby Dickenson tdickenson@geminidataloggers.com
Mon, 24 Jun 2002 16:16:39 +0100


On Monday 24 Jun 2002 3:33 pm, Jeremy Hylton wrote:
> >>>>> "CD" == Casey Duncan <casey@zope.com> writes:
>
>   CD> Storing this many objects in a BTree should be fine, they will
>   CD> rebalance themselves AFAIK. Insertion of more random ids might
>   CD> tend to be faster though as the BTree grows since rebalancing
>   CD> will happen much less often.  How big an effect that would have
>   CD> I am not sure. You might want to do some timing experiments with
>   CD> large BTrees.
>
> BTrees do not rebalance themselves.  I suspect that the use of random
> ids in Zope is to avoid balancing problems by random insertions.  In
> the case of sequential ids, each BTree bucket will be half full.  When
> the bucket reaches its limit, it will be split into two buckets of
> equal size (call them left and right).  Since the ids are sequential
> the left bucket will never grow and all the new ids will be put in the
> right bucket.

Eeeeek. There are a number of cases in Zope where I suspect this could be
catastrophic.

For example, ZCatalog creates a standard index for every object's last
modified time. The reverse index OIBTree is keyed on last modified time, and
is therefore likely to be grossly unbalanced.
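
To get a rough feel for the splitting behaviour Jeremy describes, here is a
minimal sketch of a toy model. It is not the BTrees implementation: the
function name fill_factor, the flat list of buckets, and the capacity of 60
are all assumptions made for illustration. It only models "split a full
bucket into two halves" and reports how full the buckets end up.

```python
import bisect
import random


def fill_factor(keys, capacity=60):
    """Toy model (NOT the real BTrees code): insert keys into sorted
    buckets, splitting a bucket into two halves when it overflows.
    Returns the average fraction of each bucket that is in use.
    The capacity value is an arbitrary assumption."""
    buckets = [[]]   # each bucket is a sorted list of keys
    bounds = []      # bounds[i] is the smallest key of buckets[i + 1]
    count = 0
    for key in keys:
        # Locate the bucket whose key range contains this key.
        i = bisect.bisect_right(bounds, key)
        bucket = buckets[i]
        bisect.insort(bucket, key)
        count += 1
        if len(bucket) > capacity:
            # Overflow: split into left and right halves.
            mid = len(bucket) // 2
            left, right = bucket[:mid], bucket[mid:]
            buckets[i] = left
            buckets.insert(i + 1, right)
            bounds.insert(i, right[0])
    return count / (len(buckets) * capacity)


if __name__ == "__main__":
    n = 6000
    sequential = list(range(n))
    shuffled = sequential[:]
    random.Random(0).shuffle(shuffled)   # fixed seed for repeatability
    print("sequential ids:", round(fill_factor(sequential), 3))
    print("random ids:    ", round(fill_factor(shuffled), 3))
```

With sequential keys every split leaves behind a left bucket that never
receives another key, so the model settles at roughly half-full buckets.
With the same keys inserted in random order, both halves keep receiving keys
after a split, so the buckets should end up noticeably fuller. Whether the
real OIBTree in a ZCatalog index behaves the same way would need measuring.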