[Zope-dev] What catalog/index to use ...

Casey Duncan casey@zope.com
Fri, 8 Nov 2002 10:09:19 -0500


In the original design of ZCTextIndex we (PythonLabs mostly) considered
stemming and found that many information theorists regard it as being of
dubious value. (The fact that Google does no stemming was also a factor
in the decision.) So we decided to leave it out entirely.

ZCTextIndex is extensible: third parties can add text processing
facilities (called pipeline elements) to the system without modifying
ZCTextIndex itself. This could be a way to add stemming, or any other
conceivable feature that involves preprocessing the index source and
query text.
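
A rough sketch of what such a pipeline element might look like is below.
It assumes the convention, as I understand it, that an element is an
object with a process() method that takes a list of terms and returns a
list of terms. The class names, the suffix list and the run_pipeline()
helper are made up for illustration, and registering the element with
ZCTextIndex is left out. The "stemmer" is a crude suffix stripper, not a
real stemming algorithm.

```python
class LowerCaser:
    """Hypothetical element: normalizes every term to lower case."""

    def process(self, terms):
        return [term.lower() for term in terms]


class NaiveSuffixStemmer:
    """Hypothetical element: crude suffix stripping, NOT a real stemmer."""

    # Illustrative suffix list, checked in order; first match wins.
    SUFFIXES = ("ing", "ed", "s")

    def process(self, terms):
        return [self._stem(term) for term in terms]

    def _stem(self, term):
        for suffix in self.SUFFIXES:
            # Only strip if at least three characters would remain.
            if term.endswith(suffix) and len(term) - len(suffix) >= 3:
                return term[: -len(suffix)]
        return term


def run_pipeline(elements, terms):
    """Stand-in for the lexicon: feed terms through each element in turn."""
    for element in elements:
        terms = element.process(terms)
    return terms


if __name__ == "__main__":
    pipeline = [LowerCaser(), NaiveSuffixStemmer()]
    print(run_pipeline(pipeline, ["Indexing", "indexed", "cat"]))
```

The same chain has to be applied to both the indexed text and the query
text, otherwise a stemmed index would never match unstemmed queries.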

Granted, that feature could use better(!) documentation... (I should just
add that to my email sig ;^)

-Casey

On Friday 08 November 2002 08:19 am, Jens Vagelpohl wrote:
> > Depends on your needs. ZCTextIndex is very easy to use and supports
> > relevance ranking, TextIndexNG is supposed to be some kind of
> > eier-legende-wollmilch-sau (a do-everything, all-in-one tool).
> > Compare the features and make your choice.
> >
> > -aj
> >
>
> isn't TextIndexNG much better with international character encodings
> and that stuff? and it has a lot more stemmers for various languages.
>
> jens