[ZODB-Dev] RFE: Spec for ZODB Indexing

Christian Reis kiko@async.com.br
Wed, 18 Sep 2002 11:38:40 -0300


On Wed, Sep 18, 2002 at 09:15:55AM +0200, Thomas Förster wrote:
> But things won't go uncommented:

Thanks. Just following up for completeness' sake; if/when you have the
time, I'd appreciate comments.

> > ...[Instance aggregation]...
> > An SQL database has the table as its basic aggregator. What do we use
> > when we are programming in an OO model? 
> 
> Classes (instances of classes) I suppose. Why would one bend the OO model?

Well, I'm not sure if aggregation by class is the only way to do it. Are
we never going to want to have the same types of instances grouped in
different sets? Even if we *do* have a default catalog (like a
ClassCatalog) for every instance created, it may be that, for the
application, it's not very useful.

IMHO, the aggregator can also play a role in the OO design - it's an
InstanceGrouping of sorts, and there are many of these at least in my
application - ProductCatalog, UserCollection, SaleJournal.
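
A rough sketch of what I mean by grouping the same type of instance in
different sets. This is plain Python with no persistence, and the names
(InstanceGrouping, Product, clearance) are made up for illustration;
it is not IC's actual API.

```python
class Product:
    def __init__(self, name):
        self.name = name


class InstanceGrouping:
    """Hypothetical aggregator holding instances of a single class."""

    def __init__(self, klass):
        self.klass = klass
        self._members = []

    def add(self, obj):
        # Refuse anything that is not an instance of the declared class.
        if not isinstance(obj, self.klass):
            raise TypeError("expected %s, got %s"
                            % (self.klass.__name__, type(obj).__name__))
        self._members.append(obj)

    def __contains__(self, obj):
        # Membership is by identity, not by equality.
        return any(obj is member for member in self._members)

    def __len__(self):
        return len(self._members)


# Two independent groupings over the same class.
product_catalog = InstanceGrouping(Product)
clearance = InstanceGrouping(Product)

widget = Product("widget")
gadget = Product("gadget")

product_catalog.add(widget)
product_catalog.add(gadget)
clearance.add(gadget)  # the same instance can live in both groupings
```

A per-class default catalog would see both products; the application
decides which of them also belong in the second grouping.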

> > We've designed IC around
> > Catalogs, which are roughly equivalent to tables and provide both
> > instance creation and query mechanisms.
> 
> So IC is trying to provide relational access to a fully object oriented data 
> base? Why shouldn't I use a relational DB in the first place then?

Are you being sarcastic? :-)

For many reasons, of course: seamless integration with an OO language (and
Python and ZODB exemplify this perfectly), the inherent richness and
flexibility of OO, reduced code size, etc etc etc.

Having Catalogs doesn't mean I want an RDBMS!

> > ...[API example]...
> > Yes, this is precisely what Catalog offers. This was an API for the
> > aggregator, not for the persistent object. Our design is to reduce to
> > the utmost the burden on the application programmer.
> 
> I see. Wasn't that clear, I guess. OK next step in ODMG compliance would be 
> Persistence on a per instance basis, but this is clearly not a requirement 
> for a query mechanism :-)

Well, you can avoid persisting an instance by not calling commit() :) But
I see your point; that may be something that Zope.com has in mind for
the future of ZODB?

> > ...[Query language design]...
> > A full OQL-compliant parser is quite a lot of work to implement, but if
> > you are willing to help, we'd be happy to work with you. 
> 
> As I said, my time is VERY limited now. The ODMG specs contain a full BNF of 
> OQL, so that at least the design step for the language should become 
> insignificantly short. And nobody expects to have a full implementation in 
> the first place. There are a bunch of parser generators around, which are 
> completely implemented in Python, so for prototypes where query speed is of 
> lesser importance one could also save the hassles (IMO, I'm not a C 
> programmer :-) ) of writing Python interfaces to C functions (yacc/lex 
> approach). For ZOQLMethod the kwParsing package of Aaron Withers(?) worked 
> fine. Have a look at ZOQLMethod to see it "in action".

Will do, *jabs jdahlin in the ribs*, thanks for the tip.

> binding. So this is also a chance to both make ZODB ODMG compliant and to do 
> it our way, setting the standard for Python. ZODB would then become the first 
> open source, free implementation of an ODMG OODB.

Wow, that would rock. But I'm quite sure the OQL requires a lot more
functionality than we can provide right now :)

> > The current parser is quite simple, but allows simple queries to
> > work. No aggregation or function calls at the moment, but these can
> > follow. And everything is quite separate from the parser, so when a
> > new parser comes in, we just plug up the query functions.
> 
> How do you aggregate 2 instances of 2 different classes anyway? Aggregation 
> comes from the relational paradigm and may not fit to OODB.

Well, you can aggregate in the same class, I'm sure:

    sum(product.quantity)

should be possible, at least.
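
In plain Python terms, and assuming the catalog can hand back its
instances as an iterable, that aggregate is just a sum over one
attribute. The Product class and the sample data below are invented for
the example; the query language itself would sit on top of something
like this.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    quantity: int


# Stand-in for whatever the catalog returns; hypothetical sample data.
products = [
    Product("widget", 3),
    Product("gadget", 5),
    Product("gizmo", 0),
]

# Equivalent of sum(product.quantity) over instances of a single class.
total_quantity = sum(product.quantity for product in products)
```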

> > > def query(querystring) #performs OQL query on data base
> > > def entry(objectName) # returns named object
> >
> > What do you mean by "named" object? How could this be specified?
> 
> ODMG defines 2 ways to find specific objects. First are class extents (see 
> below), the second being 'named' (in the sense of a dictionary) instances, 
> from where to navigate through the instances to find a specific one.
> 
> a code snippet might be:
> 	
> 	factory = PersistentFactory()
> 	factory.staff = PersonList()  
> 	# that's why extent(Person) fails to find only staff
> 	factory.customers = PersonList()
> 
> 	db.nameObject(factory.staff, "Staff")
> 
> 	-- other module --
> 
> 	def isStaffMember(person):
> 		staff = db.entry("Staff")
> 		return person in staff

Using a string to mark a group; pretty interesting, even though for this
case it would probably make more sense to query the Person instances for
is_staff or the presence of a Staff subobject? Or even inheritance?
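
A minimal sketch of the attribute-based alternative, assuming Person
carries an is_staff flag. Both the flag and the staff_members helper are
assumptions for the example, not anything ODMG or IC defines.

```python
class Person:
    def __init__(self, name, is_staff=False):
        self.name = name
        self.is_staff = is_staff


def staff_members(people):
    """Return the persons flagged as staff, preserving input order."""
    return [person for person in people if person.is_staff]


# Hypothetical sample data standing in for the stored Person instances.
people = [
    Person("Ana", is_staff=True),
    Person("Bruno"),
    Person("Carla", is_staff=True),
]

staff_names = [person.name for person in staff_members(people)]
```

The trade-off is that membership now lives on the instance rather than
in a separately named collection.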

> > > def extent(className) #returns collection of all instances of given class
> > > and subclasses
> >
> > Catalog.dump(), more or less, but without subclass support. Or, in other
> > words: if you place in a catalog instances of type X, you can be sure
> > that dump will return 100% X instances. Is it important to be able to
> > add to the catalog subtypes as well?
> 
> I like the idea of storing Books(BibEntry), Journals(BibEntry), ... and 
> getting a nice formatted list of all my library treasures by a simple:
> 	
> 	library = db.extent(BibEntry)
> 
> 	for item in library:
> 		item.printNicely()

Yeah, I see your point. I'm sure this can be implemented in IC as it is
today with minor adjustments (we'll have to think about how the types
are specified, though).
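
One way the adjustment could look, sketched in plain Python: the
catalog's type check switches between an exact-type match (today's
behaviour as I described it) and isinstance(), which admits subclasses.
TypedCatalog and its allow_subclasses flag are hypothetical names, not
the current IC interface.

```python
class BibEntry:
    def __init__(self, title):
        self.title = title

    def describe(self):
        return "%s: %s" % (type(self).__name__, self.title)


class Book(BibEntry):
    pass


class Journal(BibEntry):
    pass


class TypedCatalog:
    """Hypothetical catalog with an optional subclass-aware type check."""

    def __init__(self, klass, allow_subclasses=False):
        self.klass = klass
        self.allow_subclasses = allow_subclasses
        self._items = []

    def add(self, obj):
        if self.allow_subclasses:
            accepted = isinstance(obj, self.klass)  # subclasses pass
        else:
            accepted = type(obj) is self.klass      # exact type only
        if not accepted:
            raise TypeError("cannot store %s in a catalog of %s"
                            % (type(obj).__name__, self.klass.__name__))
        self._items.append(obj)

    def dump(self):
        # Return a copy so callers cannot mutate the catalog's list.
        return list(self._items)


library = TypedCatalog(BibEntry, allow_subclasses=True)
library.add(Book("Structure and Interpretation of Computer Programs"))
library.add(Journal("Communications of the ACM"))

descriptions = [item.describe() for item in library.dump()]
```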

Thanks for taking the time to reply.

Take care,
--
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL