[ZODB-Dev] Re: Community opinion about search+filter
me at rpatterson.net
Thu Mar 15 17:25:23 EDT 2007
Martijn Faassen <faassen at startifact.com> writes:
> I think one of the main limitations of the current catalog (and
> hurry.query) is efficient support for sorting and batching the query
> results. The Zope 3 catalog returns all matching results, which can
> then be sorted and batched. This will stop being scalable for large
> collections. A relational database is able to do this internally, and
> is potentially able to use optimizations there.
> It would be very nice if someone could look into expanding hurry.query
> and/or the catalog to support these cases. It would be interesting to
> look at what Dieter Maurer has done with AdvancedQuery in Zope 2 in
> this regard as well.
I recently became obsessed with this problem and sketched out an
architecture for presorted indexes. I thought I'd take this
opportunity to get some review of what I came to.
>From my draft initial README:
> Presort provides intids which assure the corresponding documents will
> be presorted in any BTrees objects where the intid is used as a key.
> Presorted intids exist alongside normal intids. Intids are
> distributed over the range of integers available on a platform so as
> to avoid moving presorted intids whenever possible, but eventually a
> given presorted intid may need to be placed in between two other
> consecutive presorted intids. When this happens, one or more
> presorted intids will have to be moved. Normal intids are unchanging
> as normal.
> Presort also provides a facility for updating objects who store
> presorted intids when a presorted intid is moved, such as in indexes.
> It also provides for indexes that map a given query to the appropriate
> presorted intid result set and for catalogs that use the appropriate
> presorted intid utility to lookup the real objects for the results.
Would this be a viable approach? Would it be generally useful?
Also, the problem of distributing intids so that they are moved as
seldom as possible is a bit of a challenge. I'm sure there's some
algorithm for such problems that I'm just unaware of. Does anyone
have any pointers for this?
More information about the ZODB-Dev