[ZODB-Dev] Updated BTree docs

Casey Duncan casey at zope.com
Fri May 2 12:20:41 EDT 2003


Sorry, I forgot to mention that it takes a single argument, which is a minimum
value. It filters out any values less than this. I don't know what the use
case is for this feature; ZCatalog does not use it. If I had to guess, I
would say it is some sort of score threshold for eliminating results
below a certain score.
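
To make the two pieces described in this thread concrete (the minimum-value
filter, and (value, key) pairs sorted by value), here is a rough pure-Python
sketch. The name by_value_sketch is hypothetical, the ascending order is an
assumption, and the type-dependent value arithmetic Tim mentions is
deliberately not modeled, so this is not a substitute for the real byValue().

```python
def by_value_sketch(mapping, minimum):
    """Hypothetical stand-in for byValue(minimum); not the BTrees code."""
    # Filter: keep only entries whose value is at least `minimum`.
    pairs = [(value, key) for key, value in mapping.items() if value >= minimum]
    # Sort the (value, key) pairs by value. Ascending order is an assumption;
    # ties on value fall back to comparing keys.
    pairs.sort()
    return pairs


# Made-up rid -> score mapping, purely for illustration.
scores = {101: 3, 102: 0, 103: 7, 104: 1}
result = by_value_sketch(scores, 1)
```

With a minimum of 1, the entry with score 0 is dropped and the rest come back
as (score, rid) pairs.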

On Friday 02 May 2003 10:54 am, Tim Peters wrote:
> [Casey Duncan]
> > I think I can elaborate on the following passage in the docs:
> >
> >    "... and byValue(), which should probably be ignored (it's hard to
> >     explain exactly what it does, and as a result it's almost never
> >     used - best  to consider it deprecated). "
> >
> > byValue() returns (value, key) pairs in sorted order by value.
> 
> If that were true, I wouldn't have a problem explaining what it does <wink>.
> Examples to ponder:
[snip]
> So this is really some combination of sorting, filtering, and type-dependent
> value arithmetic, all rolled into one.  byValue() isn't called in the Zope3
> codebase so far, and I see one use in the Zope2 codebase (in Catalog.py).
> I've never seen it called with an argument other than 0 in real life (in
> which specific case, and if no value is less than 0, it's easy to explain
> what it does).

I didn't say it made sense ;^)
 
> > ZCatalog uses this to sort "scored" results, such as from text indexes,
> > which start as a mapping of rid->score.
> 
> I'd like a method that did only that much a lot better.  Note that in
> ZCTextIndex we didn't sort the whole thing, instead we used an N-best
> priority queue to remember just the best N scoring items.  Even running at
> Python speed, and using a dirt-dumb list for the queue, this was usually
> much faster than sorting the whole result sequence (at C speed) first (&
> that's generally true if N is much less than the # of items in the whole
> result sequence).

We could have a keysForBestValues(N) method that did this, I dunno. I actually 
put the N-best sorting algorithm in ZCatalog for 2.6.1, ironically though it 
is never used for TextIndexes... I had planned to change that in 2.7.
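
As a sketch of what such a method could look like from Python, assuming a
plain rid -> score mapping: the function name keys_for_best_values is
hypothetical, and this uses the standard library's heapq.nlargest rather than
the list-based queue in ZCTextIndex or the code in ZCatalog.

```python
import heapq


def keys_for_best_values(mapping, n):
    """Return the keys of the n highest-valued items, best first.

    Hypothetical helper: selects the n best items without sorting
    the whole result set.
    """
    best = heapq.nlargest(n, mapping.items(), key=lambda item: item[1])
    return [key for key, _score in best]


# Made-up rid -> score mapping, purely for illustration.
scores = {101: 0.25, 102: 0.90, 103: 0.10, 104: 0.75, 105: 0.50}
top = keys_for_best_values(scores, 2)
```

The payoff is the one Tim describes: when N is much smaller than the number of
results, picking the best N avoids paying for a full sort.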
 
> > I have actually been camping on some optimized code for this. The
> > current implementation is pretty lame. I came up with a new API,
> > keysByValue(), which  returns the keys in order by value, which is
> > really all ZCatalog needs. The implementation I have for *IBTree
> > variants is 10x faster than the existing byValue implementation.
> >
> > I should probably get with you and/or Jim and discuss
> > generalizing this and integrating it into the BTrees module.
> > Basically I need to make it work for *OBTree variants and tangle
> > it up with the macros in there.
> 
> Yup, more macros is exactly what BTrees need <wink>.  If you lose your one
> use for the existing byValue() method then, I'd like to deprecate it for
> real, as I don't know of any other uses, it's not even tested, and is hard
> to explain.

Yup, I agree byValue should be deprecated in this case. It's a pretty weird
method.
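
For comparison, the keysByValue() idea quoted above is much easier to state.
A minimal pure-Python sketch, assuming a plain mapping; the name
keys_by_value_sketch and the reverse flag are hypothetical, and this says
nothing about how the optimized *IBTree version is written:

```python
def keys_by_value_sketch(mapping, reverse=False):
    """Return the mapping's keys ordered by their values.

    Hypothetical illustration only. Pass reverse=True for
    highest-value-first order.
    """
    return sorted(mapping, key=mapping.__getitem__, reverse=reverse)


# Made-up rid -> score mapping, purely for illustration.
scores = {101: 3, 102: 0, 103: 7, 104: 1}
ascending = keys_by_value_sketch(scores)
descending = keys_by_value_sketch(scores, reverse=True)
```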

-Casey


