[Zope-dev] NOT in Field and Keyword Indexes

Johan Carlsson [Torped] johanc@torped.se
Sat, 06 Jul 2002 11:04:37 +0200


Hi,
I'm messing around with a Pluggable Index that can take a NOT operator.
(Regular field/keyword indexes can only do AND or OR, and of course range.)

I intend to release my code on zope.org, but first I thought I'd write some
unit tests.

EasyKeywordIndex (as it's currently called) extends KeywordIndex and adds
three operators: 'not', 'notor', 'notand'. ('not' == 'notor')

It works by first doing the range query (if applicable), then doing AND/OR,
and finally (and this is the only new part) diffing the result against all
indexed documents.

To be able to make the diff at the end I have added an index attribute,
self._all, that is an IISet of all indexed documentIds.
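
The steps above (AND/OR first, then a diff against self._all) can be
sketched roughly as follows. This is only an illustration, not the actual
EasyKeywordIndex code: plain Python sets stand in for IISet, the class and
method names are made up for the example, and the range step is left out.

```python
# Sketch only: plain sets stand in for IISet, and every name here is
# hypothetical rather than taken from the real EasyKeywordIndex.

class NotAwareKeywordIndex:
    def __init__(self):
        self._index = {}    # keyword -> set of document ids
        self._all = set()   # every document id that has been indexed

    def index_object(self, doc_id, keywords):
        # Track the document even when it has no keywords at all.
        self._all.add(doc_id)
        for kw in keywords:
            self._index.setdefault(kw, set()).add(doc_id)

    def query(self, keywords, operator="or"):
        negate = operator.startswith("not")
        base = operator[3:] if negate else operator
        base = base or "or"            # 'not' == 'notor'
        if base not in ("and", "or"):
            raise ValueError("unknown operator: %r" % operator)

        # Step 1: the ordinary AND/OR combination of the keyword sets.
        sets = [self._index.get(kw, set()) for kw in keywords]
        if not sets:
            result = set()
        elif base == "and":
            result = set.intersection(*sets)
        else:
            result = set.union(*sets)

        # Step 2 (the only new part): diff against all indexed documents.
        if negate:
            result = self._all - result
        return result


index = NotAwareKeywordIndex()
index.index_object(1, ["zope", "python"])
index.index_object(2, ["python"])
index.index_object(3, ["perl"])
index.index_object(4, [])   # no keywords, but still present in _all

not_or = index.query(["zope", "python"], "not")
not_and = index.query(["zope", "python"], "notand")
```

With these four documents, 'not' (or 'notor') returns everything that matches
neither keyword, and 'notand' returns everything that does not match both.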

Now to the question(s): (I already answered one of them while writing this
e-mail ;-)

If a document doesn't have the attribute being indexed, it is still included
in the index (as an unindex entry).
Why does it work this way? I would have thought that if a document doesn't
have the indexed attribute it should not be included in the index.
This would free up a lot of references to documents that will never be
searched through the index, at least in an application where the ZCatalog
has a lot of indexes that index a lot of classes with quite different types
of meta-data.

For the NOT-aware index it would make a great difference.
Instead of returning all non-matching documentIds that have ever been
through the ZCatalog, it would return a much smaller result list.
Of course the NOT index will most probably be combined with other
query operators (such as a field of meta_data in my case), but returning
a smaller result list would make NOT indexes more efficient, right?
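
To make the size difference concrete, here is a small sketch with made-up
numbers. It assumes a catalog of six documents of which only three carry the
indexed attribute; the function and variable names are hypothetical, and
plain sets again stand in for IISet.

```python
def not_query(matching, universe):
    """Return the ids in `universe` that are not in `matching`."""
    return universe - matching

# Hypothetical catalog: six documents, only 1-3 have the indexed attribute.
catalogued = {1, 2, 3, 4, 5, 6}
with_attribute = {1, 2, 3}
matching = {1}   # documents whose keyword matches the query

# Diff against everything that has been through the catalog.
broad = not_query(matching, catalogued)

# Diff against only the documents that actually have the attribute.
narrow = not_query(matching, with_attribute)
```

Here `broad` holds five ids and `narrow` only two. Note that the two results
also differ in meaning: with the smaller universe, documents lacking the
attribute no longer count as matching a NOT query.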

I'd love to get this accepted as a patch for UnIndex. I might also have
a look at TextIndex and PathIndex while I'm at it :-)

(
   TextIndexes do partially support NOT with the 'andnot' and 'ornot'
   operators, but they require that the first search argument is not
   negated (e.g. "first AND NOT second", not "NOT first AND second")
)
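
A rough way to see why the position of NOT matters, using plain sets and
hypothetical function names (this is not how TextIndex is implemented): a
binary 'andnot' needs only its two operand sets, while a leading NOT has to
be evaluated against the set of all indexed documents.

```python
def and_not(left, right):
    """'left AND NOT right': a plain difference of two result sets."""
    return left - right

def leading_not_and(first, second, all_docs):
    """'NOT first AND second': NOT is evaluated first, so it needs all_docs."""
    return (all_docs - first) & second

# Hypothetical result sets for two search terms.
all_docs = {1, 2, 3, 4}
first = {1, 2}
second = {2, 3}

binary = and_not(first, second)
leading = leading_not_and(first, second, all_docs)
```

As long as both operands are subsets of all_docs, the leading form gives the
same result as and_not(second, first), so a query of that shape could also be
handled by swapping the operands.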

Best Regards,
Johan Carlsson



-- 
Torped Strategi och Kommunikation AB
Johan Carlsson
johanc@torped.se

Mail:
Birkagatan 9
SE-113 36  Stockholm
Sweden

Visit:
Västmannagatan 67, Stockholm, Sweden

Phone +46-(0)8-32 31 23
Fax +46-(0)8-32 31 83
Mobil +46-(0)70-558 25 24
http://www.torped.se
http://www.easypublisher.com