[Zope3-dev] Florent's O-R blog entry

Martijn Faassen faassen at infrae.com
Wed Aug 24 06:27:52 EDT 2005


Hi there,

A few contributions to this interesting discussion...

[snip the Zope 3 catalog is not a hack, and clean and simple]

> The catalog and index code is not a hack, and is in fact simple,  
> effective and flexible.  Python is the query language, and the lack of
> an optimizer is not a reason to go running to an RDBMS index.  The  
> catalog and index code could use polish and even alternate  
> implementations, but the BTrees, the core code, are fantastic tools.

I have had some opportunity to work with the Zope 3 catalog recently, 
and I have a few comments. First of all, I agree with the main idea that 
the Zope 3 catalog is not a hack, and is clean and flexible. I believe 
the catalog should be invested in, as I think it's cool.

Now for the areas where I think the Zope 3 catalog is lacking 
features:

Underfeatured query API
-----------------------

I do think that currently the API to query it is woefully underfeatured.

I've tried to work on this problem and am sitting on some code that 
allows a simple query language on top of the catalog; it just needs a 
bit of time to polish and release. It simply builds up a tree of Python 
objects for queries, nothing special, but it is a lot higher level than 
what's already there.
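
To give an idea of what "a tree of Python objects for queries" could 
look like, here is a minimal sketch. It is not the code mentioned above 
and it does not use the real catalog API: the class names (Query, Eq, 
And, Or), the apply() method and the dict-of-sets "indexes" are all 
assumptions made up for illustration.

```python
# Hypothetical sketch: none of these names come from the Zope 3 catalog.

class Query:
    """Base class; operators combine queries into a tree."""

    def __and__(self, other):
        return And(self, other)

    def __or__(self, other):
        return Or(self, other)


class Eq(Query):
    """Leaf node: documents whose indexed value equals `value`."""

    def __init__(self, index_name, value):
        self.index_name = index_name
        self.value = value

    def apply(self, indexes):
        # Each stand-in index maps a value to the set of document ids.
        return set(indexes[self.index_name].get(self.value, ()))


class And(Query):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def apply(self, indexes):
        return self.left.apply(indexes) & self.right.apply(indexes)


class Or(Query):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def apply(self, indexes):
        return self.left.apply(indexes) | self.right.apply(indexes)


# Stand-in for catalog indexes (made-up data).
indexes = {
    'workflow_state': {'PUBLISHED': {1, 2}, 'NEW': {3}},
    'author': {'alice': {1, 3}, 'bob': {2}},
}

query = Eq('workflow_state', 'PUBLISHED') & Eq('author', 'alice')
result = query.apply(indexes)
```

The point of such a design is that the query is plain data until 
apply() is called, so it can be built up piece by piece in Python code.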

Fast, easy batching/sorting
---------------------------

I don't know how to do easy, efficient batching/sorting with the 
catalog. I'd like to be able to query *just* a batch of objects, sorted, 
for user interface purposes. There doesn't seem to be a straightforward 
way to do this yet, and this is a very common use case. The batching 
implementation sitting out there in zope.bugtracker.batching is nice, 
but doesn't deal with the catalog.

I think this should be fixable with a bit more infrastructure though. 
Getting the right batch is just a query on an index, and the result can 
be sorted afterwards, though there are tricky issues getting the right 
batch *size*.
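
As a rough sketch of the kind of helper meant here: take the document 
ids a query returned, sort them by the values of one index, and return 
only the requested batch. The function name sorted_batch, the plain 
dict standing in for a sort index, and the decision to drop ids that 
have no sort value are all assumptions for illustration, not existing 
catalog infrastructure.

```python
from heapq import nsmallest


def sorted_batch(result_ids, sort_values, start, size):
    """Return (batch, total) for result_ids ordered by sort_values[id].

    sort_values is a hypothetical stand-in for an index: a dict that
    maps a document id to the value to sort on.
    """
    # Assumption: ids without a sort value are left out, so the total
    # can be smaller than len(result_ids).
    sortable = [docid for docid in result_ids if docid in sort_values]
    # Only the first start + size items need to be in sorted order.
    head = nsmallest(start + size, sortable, key=sort_values.__getitem__)
    return head[start:start + size], len(sortable)


# Made-up data: id 5 matched the query but has no sort value.
sort_values = {1: 'c', 2: 'a', 3: 'b', 4: 'd'}
result_ids = {1, 2, 3, 4, 5}

batch, total = sorted_batch(result_ids, sort_values, start=1, size=2)
```

Returning the total alongside the batch is what a user interface needs 
to render "page 2 of N" without fetching every object.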

Missing powerful query concepts
-------------------------------

Certain powerful query concepts like joins, available in a relational 
setting, are missing. I've already run into a scenario where I wanted to 
do something like this: given a bunch of version objects with field 'id', 
where multiple objects can have the same 'id' to indicate they're 
versions of the same object, I want all objects where field 
'workflow_state' is 'PUBLISHED', unless there is another object with the 
same id that has workflow_state 'NEW', in which case I want that one.

I think joins would be a way to solve it, though I haven't figured out 
the details, nor how to implement them efficiently on top of the 
catalog. This kind of thing is where a relational database makes life a 
lot simpler.
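
To make the scenario concrete, here it is in plain Python over a list 
of dicts, without any catalog or join. This is only a sketch of the 
desired result, not an efficient catalog implementation. It assumes at 
most one 'PUBLISHED' and one 'NEW' version per id, and it reads the 
requirement literally, so ids that only have a 'NEW' version are not 
returned; the 'title' field and the function name are made up.

```python
def select_versions(objects):
    """Pick the PUBLISHED version per id, or the NEW one if it exists."""
    published = {}
    new = {}
    for obj in objects:
        if obj['workflow_state'] == 'PUBLISHED':
            published[obj['id']] = obj
        elif obj['workflow_state'] == 'NEW':
            new[obj['id']] = obj
    # For every published id, prefer a NEW version with the same id.
    return [new.get(id_, obj) for id_, obj in published.items()]


# Made-up sample data.
versions = [
    {'id': 'doc1', 'workflow_state': 'PUBLISHED', 'title': 'doc1 v1'},
    {'id': 'doc1', 'workflow_state': 'NEW', 'title': 'doc1 v2'},
    {'id': 'doc2', 'workflow_state': 'PUBLISHED', 'title': 'doc2 v1'},
    {'id': 'doc3', 'workflow_state': 'NEW', 'title': 'doc3 v1'},
    {'id': 'doc2', 'workflow_state': 'ARCHIVED', 'title': 'doc2 v0'},
]

titles = [obj['title'] for obj in select_versions(versions)]
```

In relational terms this is roughly a self-join of the versions on 
'id', which is the part that is hard to express against separate 
catalog indexes.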

Zope catalog benefits
=====================

Now as to benefits of using the ZODB instead of a relational database. 
I've seen some mentioned already, and I think there are more that 
haven't been mentioned yet. I realize that some of these issues don't 
exist so much with 'transparent' maps like Ape, which acts like the 
ZODB. Still, *if* an application uses a relational database, I think its 
relational features will be used as well (otherwise, why do it?), which 
will still reduce the portability to non-Ape settings.

Common development platform
---------------------------

I've seen it mentioned elsewhere in this thread that the ZODB can unify 
the development community, whereas O/R mapping technologies (in 
particular those not transparent to the ZODB) run the risk of scattering 
it. I think this is an interesting argument so I'd just like to 
underline it.

Ease of deployment
------------------

Right now a Zope application is relatively difficult to deploy compared 
to some other solutions like PHP, but it's probably easier to deploy 
than other solutions which require a relational database backend. Now it 
might seem that 'enterprise' deployments are big anyway, so we don't 
have to worry about making this harder, but:

   * enterprises will ask questions like "which relational database does 
it support? we standardized on relational database system foo, does it 
work with that?" We run the risk of having to say "no", or, if "yes", we 
may run the risk of "oops, we cannot test this easily with database 
system foo as we don't have it here."

   * requiring a RDB for deployment makes it harder to market our 
software, as it's harder to just download and install software into your 
Zope to try it out. You need to set up a relational database as well. I 
may be mistaken, but I think Plone would be less popular if a relational 
database were *necessary* in order to play with it.

   * closely related, requiring a RDB for deployment makes it harder to 
market our open source software to other developers. This ties in 
closely to the argument above involving the risk of the community 
fracturing. Even inside a company, having more software requirements like 
a RDB may hinder team development (where each team member runs a 
separate instance of the software), as there's simply that much more to 
set up, and it is thus harder for someone to get up to speed with a project.

Testability
-----------

One point I also haven't seen mentioned yet is that I don't want to have 
to have a relational database installed in order to run my tests. The 
great thing about the ZODB and persistence is that it's very 
transparent. Persistent instances are very very similar to 
non-persistent ones.

[snip blob support argument]

I agree that the blob argument for RDB mapping is not convincing. There 
are other solutions around and this is being improved rapidly.

Anyway, all of the arguments against object/relational mapping aside, I 
do think this is an interesting area to explore. You *do* get a whole 
lot of power using a relational database, after all. I myself am 
actually in two minds concerning very transparent ZODB-style solutions 
like Ape or less transparent but more explicit uses of O/R mappings like 
SQLObject. While the transparency has many benefits mentioned before, 
the more straightforward mapping has the benefits of simplicity, may map 
to relational databases more easily, and may expose powerful relational 
features more straightforwardly.

Regards,

Martijn

