[ZODB-Dev] RFE: Spec for ZODB Indexing

Tue, 17 Sep 2002 22:07:11 -0300

Finally answering, now that we are closer to the next release of
IndexedCatalog.

On Fri, Jun 07, 2002 at 11:10:33AM +0200, Thomas Förster wrote:
> > 3. Implementation
> > ....
> >     A concept for instance aggregation has to be provided too. This
> >     aggregator would hold a collection of instance references and the
> >     indexes, and would provide both query interface and index
> >     maintenance functionality. This aggregator would also be a ZODB
> >     persistent object.
> 
> I don't get the point here. I thought this is the reason for querying a 
> database, to aggregate instances based on constraints.

An SQL database has the table as its basic aggregator. What do we use
when we are programming in an OO model? We've designed IC around
Catalogs, which are roughly equivalent to tables and provide both
instance creation and query mechanisms.
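
To make the table analogy concrete, here is a minimal sketch of such an
aggregator. It is plain in-memory Python, not IndexedCatalog code: the
class SketchCatalog, the Person example, and the new() and query_eq()
methods are made up for illustration, and only dump() and init_index()
borrow their names from the API draft quoted below. A real catalog would
also be a Persistent object.

```python
class Person:
    def __init__(self, name, city):
        self.name = name
        self.city = city


class SketchCatalog:
    """Hypothetical stand-in for a catalog; not IndexedCatalog's API."""

    def __init__(self, klass):
        self.klass = klass      # the kind of instance this catalog stores
        self.instances = []     # the aggregated instance references
        self.indexes = {}       # attribute name -> {value: [instances]}

    def init_index(self, attr):
        # Build an index only for the attributes that were asked for.
        index = {}
        for obj in self.instances:
            index.setdefault(getattr(obj, attr), []).append(obj)
        self.indexes[attr] = index

    def new(self, *args, **kwargs):
        # Creation goes through the catalog so the indexes stay current.
        obj = self.klass(*args, **kwargs)
        self.instances.append(obj)
        for attr, index in self.indexes.items():
            index.setdefault(getattr(obj, attr), []).append(obj)
        return obj

    def dump(self):
        return list(self.instances)

    def query_eq(self, attr, value):
        # Use the index when there is one, otherwise scan every instance.
        if attr in self.indexes:
            return list(self.indexes[attr].get(value, []))
        return [o for o in self.instances if getattr(o, attr) == value]


people = SketchCatalog(Person)
people.init_index("city")
people.new("Ana", "Sao Carlos")
people.new("Bruno", "Recife")
people.new("Carla", "Sao Carlos")
names = [p.name for p in people.query_eq("city", "Sao Carlos")]
```

The application only defines Person and asks for an index on "city";
everything else lives in the aggregator, much as it would in a table.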

> > 4. API example
> >
> >         class Catalog(Persistent):
> >             XXX: Entity?
> >             def __init__(self, XXX):
> >                 XXX: define what kind of instance it stores
> >                 XXX: instance meta-data acquisition?
> >                 XXX: index auto-creation?
> >             def dump(self):
> >                 XXX: return list of instances
> >             def query(self, q):
> >                 XXX: process query
> >             def init_index(self, instance_attr):
> >                 XXX: specify only indexes you want to avoid bloat?
> 
> This should be provided by the data base. I don't want to implement these 
> functions for every class separately.  I don't want to do more than giving 
> metadata and calling a base class __init__ explicitly, like:

Yes, this is precisely what Catalog offers. This was an API for the
aggregator, not for the persistent object. Our design goal is to put as
little burden as possible on the application programmer.

> > 5. Development steps
> >
> >     - Stabilize requirements for conditionals, summaries and joins
> >     - Think up Catalog API
> >     - Design query language
> 
> => already done, just take spec and implement an OQL interface for ZODB, 
> making it more ODMG 'compliant'.

A full OQL-compliant parser is quite a lot of work to implement, but if
you are willing to help, we'd be happy to work with you. The current
parser is quite simple, but it handles basic queries. There is no
aggregation and there are no function calls at the moment, but these can
follow. The query functions are kept separate from the parser, so when a
new parser comes in, we just hook it up to the existing query functions.
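
A rough sketch of that separation, under assumptions: none of these
names come from IndexedCatalog, and the one-condition query syntax is
invented for the example. The point is only that the query function
takes already-parsed arguments, so a richer parser can be passed in
without touching it.

```python
import operator

# Query side: knows nothing about query syntax.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def run_query(objects, attr, op, value):
    compare = OPERATORS[op]
    return [o for o in objects if compare(o[attr], value)]


# Parser side: accepts only "<attr> <op> <literal>", e.g. "age > 30".
def simple_parser(text):
    attr, op, literal = text.split(None, 2)
    if op not in OPERATORS:
        raise ValueError("unsupported operator: %r" % op)
    try:
        value = int(literal)
    except ValueError:
        value = literal.strip("'\"")   # treat anything else as a string
    return attr, op, value


def query(objects, text, parser=simple_parser):
    # Any parser returning (attr, op, value) can be swapped in here.
    return run_query(objects, *parser(text))


rows = [{"name": "Ana", "age": 41}, {"name": "Bruno", "age": 28}]
older = query(rows, "age > 30")
named = query(rows, "name == 'Bruno'")
```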

> def query(querystring) #performs OQL query on data base
> def entry(objectName) # returns named object

What do you mean by "named" object? How could this be specified?

> def extent(className) #returns collection of all instances of given class and
> 	subclasses

Catalog.dump(), more or less, but without subclass support. In other
words: if you place instances of type X in a catalog, you can be sure
that dump() will return only X instances. Is it important to be able to
add subtypes to the catalog as well?
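
To show the difference between the two rules, here is a small
illustration in plain Python. The classes and both helper functions are
hypothetical and are not part of IndexedCatalog.

```python
class Account:
    pass


class SavingsAccount(Account):
    pass


def exact_extent(objects, klass):
    # Exact-type rule: instances of subclasses are left out.
    return [o for o in objects if type(o) is klass]


def extent_with_subclasses(objects, klass):
    # Rule from the quoted extent() proposal: subclasses are included.
    return [o for o in objects if isinstance(o, klass)]


stored = [Account(), SavingsAccount(), Account()]
exact = exact_extent(stored, Account)
wide = extent_with_subclasses(stored, Account)
```

Under the first rule every result is exactly an Account; under the
second, the SavingsAccount comes back as well.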

> I think these are the crucial points here. Having implemented indexes in ZODB 
> is a good thing  in any case, be there a query language or not.

Well, without a query language, what help are indexes? :-)

We still have some questions about composite objects and the
initialization lifecycle, and we are looking for more feedback. We'll be
sending the announcement to the mailing list tomorrow, so please feel
free to comment.

Take care,
--
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL