[ZODB-Dev] [ANN] IncrementalSearch for optimized BTrees intersection

Dieter Maurer dieter at handshake.de
Sat Aug 14 11:26:46 EDT 2004


As promissed, I have reimplemented BTrees "intersection"
by means of an incremental search.

The intersection of subsets "S1", "S2", ... "Sn" of "D" requires time
proportional to
	     
	  n * min(size(Si)) * log(size(D))

(Note: the implemented algorithm does not garantee this limit
(but often is considerably faster); a small change could fix this).
    
This tends to be much faster than the native BTrees intersection
when one of the sets is much smaller than the largest input set.

When sets are intersected that are similar but sufficiently
unequal, this algorithm
can be less efficient than the native implementation by a logarithmic
factor. The worst case runtime behaviour results for
intersections similar to

    intersection(range(0,n), range(0,n,2)).

Due to a special optimization, intersections of almost equal sets
require linear time.


This is a preview release for curious people.
Currently, only the "IAnd" (Incremental And) is implemented.
"IOr" (Incremental Or) and "INot" (Incremental Not) will
soon follow. Thereafter, I will change "ManagableIndex" and
"AdvancedQuery" to use the new search engine.


Download:

  <http://www.dieter.handshake.de/pyprojects/zope#IncrementalSearch>


-- 
Dieter


More information about the ZODB-Dev mailing list