[ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

Martijn Faassen faassen at startifact.com
Sun Mar 25 17:27:27 EDT 2007


Hey Jim,

Jim Fulton wrote:
> 
> On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote:
[snip]
>> I have the strong suspicion that modern relational databases are 
>> currently better able to scale at queries using LIMIT and ORDER BY
>>  than the Zope 3 catalog.
> 
> I had a similar suspicion.  I assigned the Python Labs team the task
> of finding out through literature search the approaches used.  They
> found that there were none other than the sorts of things I've
> mentioned.

What about caching strategies? (as I sketched out in my last mail)

This article about MySQL claims that MySQL is the only database that 
caches query result sets. That is surprising for such an obvious idea:

http://dev.mysql.com/tech-resources/articles/mysql-query-cache.html

Perhaps it doesn't work as well as one would think and that's why other 
database engines rejected it. :)
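
To make the idea concrete, here is a minimal sketch of result-set
caching in general. It is not MySQL's implementation and not
necessarily the strategy from my earlier mail. The class name
CachedQueries, the run_query callable and the "throw everything away
on any write" invalidation are all assumptions made for illustration.

```python
class CachedQueries:
    """Hypothetical result-set cache: keyed by query, cleared on any write."""

    def __init__(self, run_query):
        # run_query is an assumed callable that executes the real
        # (expensive) search + sort for a given hashable query key.
        self._run_query = run_query
        self._cache = {}
        self.hits = 0

    def query(self, key):
        # Serve the stored result set if this exact query was seen before.
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        result = tuple(self._run_query(key))
        self._cache[key] = result
        return result

    def invalidate(self):
        # Coarse invalidation: any write drops every cached result set.
        self._cache.clear()


if __name__ == "__main__":
    data = [3, 1, 2]
    cached = CachedQueries(lambda limit: sorted(data)[:limit])
    print(cached.query(2))   # computed by run_query
    print(cached.query(2))   # served from the cache
    data.append(0)           # a "write"
    cached.invalidate()      # without this, the old result would be served
    print(cached.query(2))
```

The weak spot is visible right away: every write has to invalidate,
so on a write-heavy site the cache may rarely get a hit.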

>> I cannot back this up as I haven't done measurements. Perhaps you
>> have done so?
> 
> We did a literature search.

That's useful, but doesn't tell us very much about how they compare in
practice.

Perhaps someone should do measurements and see how the two compare in a
sort/batch use case. It shouldn't be too hard to set up a relational
database-based sorted batch along with a ZODB/catalog based sorted batch
and see how they both hold up.
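
A rough sketch of what such a measurement harness could look like.
Everything here is a stand-in: sqlite3 plays the relational database,
and a plain in-memory list sorted with heapq plays the other side. It
does not use the Zope 3 catalog at all, so the numbers say nothing
about the catalog itself; a real comparison would swap in an actual
catalog query and a production database. The table name docs, the
score column and the helper names are made up for the example.

```python
import heapq
import random
import sqlite3
import time


def make_rows(n, seed=0):
    # Synthetic (id, score) pairs; the score is the sort key.
    rng = random.Random(seed)
    return [(i, rng.random()) for i in range(n)]


def setup_db(rows):
    # In-memory SQLite table with an index on the sort column.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, score REAL)")
    conn.executemany("INSERT INTO docs VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX docs_score ON docs (score)")
    conn.commit()
    return conn


def sql_batch(conn, start, size):
    # Sorted batch via ORDER BY / LIMIT / OFFSET.
    cur = conn.execute(
        "SELECT id FROM docs ORDER BY score, id LIMIT ? OFFSET ?",
        (size, start),
    )
    return [row[0] for row in cur]


def memory_batch(rows, start, size):
    # Sorted batch without a full sort: keep only the first start+size
    # items in (score, id) order, then slice off the requested batch.
    top = heapq.nsmallest(start + size, rows, key=lambda r: (r[1], r[0]))
    return [r[0] for r in top[start:]]


def timed(fn, *args):
    t0 = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - t0


if __name__ == "__main__":
    rows = make_rows(100_000)
    conn = setup_db(rows)
    sql_ids, sql_secs = timed(sql_batch, conn, 1000, 20)
    mem_ids, mem_secs = timed(memory_batch, rows, 1000, 20)
    assert sql_ids == mem_ids  # both sides must return the same batch
    print(f"sqlite3: {sql_secs:.4f}s  in-memory: {mem_secs:.4f}s")
```

The interesting part would be running this for growing result sets
and batch offsets, since that is where the two approaches are most
likely to diverge.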

>> * Do you estimate the performance of the Zope 3 catalog to be 
>> equivalent to the performance of a modern relational database
>> system for queries that need to sort and batch their results?
> 
> I estimate that the same issues apply to both.

Theoretical algorithmic scalability is one thing, and there the same
issues apply to both. Practical scalability might still vary widely.

[snip]
> I think further improvements are warranted, but they will not achieve
> the goal that many people expect.

Okay, that brings the discussion forward.

To identify our goal, it would be good if we did the above comparison 
with an RDB. We would then know how much further we can go with the 
catalog. Probably not all the way to RDB performance, as relational 
databases have an enormous head start, but there may still be 
improvements to make. Obviously some of this is not easy, but an 
analysis of the performance characteristics of a search/sort/batch 
combination might still identify low-hanging fruit. Or we might be 
surprised into the realisation that there's no problem :)

Let's put the idea that there are silver bullets behind us; you've made 
your point. Let's instead determine how to move forward.

Regards,

Martijn


