[Zope] relevance ranking in ZCTextIndex or equivalent

Miles Waller miles at jamkit.com
Wed May 31 10:59:51 EDT 2006


Hi,

I'm planning to implement a text search where

(match against the title)
  ranks more highly than
(match in the description)
  ranks more highly than
(matches against the body text).

Titles and descriptions are short bits of text, so results in these
categories can be ranked just by the frequency that the word appears in
that part of the text.  Matches against the body text should ideally be
ranked more like ZCTextIndex rather than plain frequency.

My ideas are:

- do three separate searches, and then concatenate the result sets
together.
problem: making sure there are no duplicates in the list without parsing
all the results in their entirety.

- hijack the 'scoring' part of the index, so those results with matches
in the title can have their scores artificially heightened to achieve
the ordering i want
problem: it's compleletely opaque without a lot of study whether this
would achieve what i want.  i'd also need to index the items so the
index knew what was in the title, which could be a problem.

- index title, description and text separately, and then use dieter's
AdvancedQuery product to do the query and combine results
problem: is it possible to get at the scores when the documents are
returned from the index to be able to order them?  are the scores
returned separately, or will each query overwrite the last one?

Has anyone ever tried to do this - or got any pointers - at all?

Thanks in advance,

Miles




More information about the Zope mailing list