[ZODB-Dev] Re: Community opinion about search+filter

Ross Patterson me at rpatterson.net
Thu Mar 15 17:25:23 EDT 2007

Martijn Faassen <faassen at startifact.com> writes:

> I think one of the main limitations of the current catalog (and
> hurry.query) is efficient support for sorting and batching the query
> results. The Zope 3 catalog returns all matching results, which can
> then be sorted and batched. This will stop being scalable for large
> collections. A relational database is able to do this internally, and
> is potentially able to use optimizations there.
> It would be very nice if someone could look into expanding hurry.query
> and/or the catalog to support these cases. It would be interesting to
> look at what Dieter Maurer has done with AdvancedQuery in Zope 2 in
> this regard as well.

I recently became obsessed with this problem and sketched out an
architecture for presorted indexes.  I thought I'd take this
opportunity to get some review of what I came to.

>From my draft initial README:

> Presort provides intids which assure the corresponding documents will
> be presorted in any BTrees objects where the intid is used as a key.

> Presorted intids exist alongside normal intids.  Intids are
> distributed over the range of integers available on a platform so as
> to avoid moving presorted intids whenever possible, but eventually a
> given presorted intid may need to be placed in between two other
> consecutive presorted intids.  When this happens, one or more
> presorted intids will have to be moved.  Normal intids are unchanging
> as normal.

> Presort also provides a facility for updating objects who store
> presorted intids when a presorted intid is moved, such as in indexes.
> It also provides for indexes that map a given query to the appropriate
> presorted intid result set and for catalogs that use the appropriate
> presorted intid utility to lookup the real objects for the results.

Would this be a viable approach?  Would it be generally useful?

Also, the problem of distributing intids so that they are moved as
seldom as possible is a bit of a challenge.  I'm sure there's some
algorithm for such problems that I'm just unaware of.  Does anyone
have any pointers for this?


More information about the ZODB-Dev mailing list