[ZODB-Dev] Indexing objects in a ZODB

Monty Taylor mordred@inaugust.com
27 Feb 2001 11:45:19 +0100


Greg Ward <gward@cnri.reston.va.us> writes:

> I'm trying to figure out how to build an index of objects in a ZODB
> database.  Clearly there's some code for supporting this -- I've poked
> through the SearchIndex and Catalog packages (from the SourceForge CVS of
> Andrew's ZODB/ZEO release), but I couldn't really figure out what's going on
> there.

I haven't used these outside of Zope, but I do use them from Python
within Zope, so I'm going to take a shot at this - please someone
shoot me down if I spout untruths. 
 
> Second question: I propose rolling my own index because I don't understand
> the tools provided by ZODB (Catalog, SearchIndex).  Can I use those tools
> for this sort of situation?  Is there an explanation somewhere of how to use
> them?  (I've tried RTFS'ing, and it didn't help much.)

The concept of the Catalog objects is this: create a Catalog in your
root, then add fields and indexes to it. (I believe there are methods
to do this, although the helper methods are Zope-specific, IIRC.)

Fields are pieces of actual meta-data that you want to store directly
in the Catalog. Otherwise, the Catalog stores only references to the
objects it catalogs, and you have to fetch the object itself. So if you
know you're going to do a Catalog search on the 'name' attribute, add
'name' as a meta-data field and you should be able to read the name
straight off the Catalog search result, instead of needing the whole
object.
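
To make the meta-data idea concrete, here is a toy sketch in plain
Python. It is NOT the real Catalog API: ToyCatalog, get_metadata,
get_object and the uid argument are all names I made up for
illustration. It only shows the difference between a value copied into
the catalog record and a value you have to fetch the object for.

```python
# Toy illustration only; none of these names come from the real Catalog.
class ToyCatalog:
    def __init__(self, metadata_fields):
        self.metadata_fields = list(metadata_fields)  # e.g. ['name']
        self.records = {}   # uid -> copied meta-data values
        self.objects = {}   # uid -> reference to the cataloged object

    def catalog_object(self, obj, uid):
        # Copy only the chosen attributes into the catalog record.
        self.records[uid] = {
            field: getattr(obj, field, None) for field in self.metadata_fields
        }
        self.objects[uid] = obj

    def get_metadata(self, uid):
        # Answered from the record alone; the object is not touched.
        return self.records[uid]

    def get_object(self, uid):
        # Anything that is not meta-data needs the object itself.
        return self.objects[uid]


class Document:
    def __init__(self, name, body):
        self.name = name
        self.body = body


catalog = ToyCatalog(metadata_fields=['name'])
catalog.catalog_object(Document('Foo', 'a long body ...'), uid='doc-1')

name_from_record = catalog.get_metadata('doc-1')['name']
body_from_object = catalog.get_object('doc-1').body
```

In a real ZODB the second lookup is the expensive one, since it may
have to load the object from storage, which is the reason to keep
frequently displayed attributes as meta-data.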

Indexes are the things you're going to search on. So if you add a
'name' index, and you catalog the object 'Foo', and Foo has a name
property (or method, BTW), the catalog will index that. There are a
few different types of indexes, depending on what you want. You
can have TextIndexes, FieldIndexes and KeywordIndexes. IIRC, a TextIndex
gives you full-text searching on a field, a FieldIndex gives you exact
matching, and a KeywordIndex works on lists of words.
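
Here is a rough sketch of how the three kinds of index differ, going
by my description above. These Toy* classes are hypothetical stand-ins
written in plain Python, not the SearchIndex classes, and the real
TextIndex certainly does smarter word splitting than str.split().

```python
from collections import defaultdict


class ToyFieldIndex:
    """Exact match on the whole value."""
    def __init__(self):
        self._map = defaultdict(set)

    def index(self, uid, value):
        self._map[value].add(uid)

    def search(self, value):
        return set(self._map.get(value, ()))


class ToyKeywordIndex:
    """The value is a list of keywords; a search matches any one of them."""
    def __init__(self):
        self._map = defaultdict(set)

    def index(self, uid, keywords):
        for keyword in keywords:
            self._map[keyword].add(uid)

    def search(self, keyword):
        return set(self._map.get(keyword, ()))


class ToyTextIndex:
    """The value is free text; it is split into lowercase words."""
    def __init__(self):
        self._map = defaultdict(set)

    def index(self, uid, text):
        for word in text.lower().split():
            self._map[word].add(uid)

    def search(self, word):
        return set(self._map.get(word.lower(), ()))


names = ToyFieldIndex()
names.index(1, 'Foo Bar')
field_hit = names.search('Foo Bar')    # whole value matches
field_miss = names.search('Foo')       # a partial value does not

text = ToyTextIndex()
text.index(1, 'Indexing objects in a ZODB')
text_hit = text.search('zodb')         # any single word matches

tags = ToyKeywordIndex()
tags.index(1, ['python', 'zodb'])
tag_hit = tags.search('python')
```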

Once you have a Catalog object and have created a schema in it, you
of course have to register things with the Catalog. This is the bit I
haven't done directly (I'm inheriting from a Zope class called
CatalogAwareness), but the basic gist is that you call
Catalog.catalog_object(object_to_catalog) and it records the values of
any properties/methods on object_to_catalog whose names match columns
in the Catalog.
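
The matching step might look something like the sketch below. This is
my guess at the behaviour written as a standalone function, not code
lifted from the Catalog; extract_values and the Foo class are made up
for the example. The point is just that a plain attribute is read
as-is, a method is called to get its value, and columns the object
doesn't have are skipped.

```python
def extract_values(obj, column_names):
    """Collect the values an object offers for the given column names."""
    values = {}
    for name in column_names:
        if not hasattr(obj, name):
            continue            # nothing on the object for this column
        value = getattr(obj, name)
        if callable(value):
            value = value()     # a method: index what it returns
        values[name] = value
    return values


class Foo:
    name = 'Foo'                # plain attribute

    def title(self):            # method with the same name as a column
        return 'The ' + self.name


extracted = extract_values(Foo(), ['name', 'title', 'missing'])
```

Here extracted ends up with entries for 'name' and 'title' only, since
Foo has nothing called 'missing'.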

Let me know if this helps at all, or if I've gotten anything horribly
wrong. I'd suggest playing with the Catalog. It's a bit odd at first,
but I think certainly worth the time and effort once you get the hang
of it. 

Monty Taylor
In August Productions, Inc.