[ZODB-Dev] IIBTree.multiunion and list comprehensions

Casey Duncan casey at zope.com
Tue Dec 9 23:37:03 EST 2003


On Tue, 9 Dec 2003 17:43:51 -0200
Christian Robottom Reis <kiko at async.com.br> wrote:

> 
> [I've been messing around with optimizations to query results in
> IndexedCatalog, and I've run into a result that's puzzled me a bit, but
> since I'm now over it, I'm sharing here.]
> 
> ISTM that, when accessing a persisted BTree that holds Sets as
> its values, using multiunion (I've tested in particular with IIBTree's
> version) has no significant speed advantage over using a list
> comprehension to glue the sets together. IOW, assuming tree is a
> yet-to-be-unpersisted BTree with a large number of items stored,
> 
>     multiunion(tree.keys())

The *keys* of the tree are sets? Huh?
 
> takes about the same time to run as
> 
>     [value for set in tree.keys() for value in set] (*)

So you create a set out of this get the unique members?
 
> My hypothesis is that even though multiunion() may run a bit faster,
> we're bound by the retrieval of buckets from the persistence mechanism,
> which is [comparatively] slow.

Sure, but assuming you have a big enough cache, the multiunion should beat it hands down (once its warmed up).
 
> Note that multiunion returns an IISet(), while the latter returns a
> list; converting between is pretty cheap given the context.

Note that creating IITreeSets is often fastest because it allocates new buckets as it grows rather than resizing a single memory block. An IISet is essentially just a sorted C array of unique ints.

-Casey



More information about the ZODB-Dev mailing list