[ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about
jim at zope.com
Mon Mar 26 09:24:41 EDT 2007
On Mar 25, 2007, at 5:27 PM, Martijn Faassen wrote:
> Hey Jim,
> Jim Fulton wrote:
>> On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote:
>>> I have the strong suspicion that modern relational databases are
>>> currently better able to scale at queries using LIMIT and ORDER BY
>>> than the Zope 3 catalog.
>> I had a similar suspicion. I assigned the Python Labs team the task
>> of finding out through literature search the approaches used. They
>> found that there were none other than the sorts of things I've
> What about caching strategies? (as I sketched out in my last mail)
Obviously, it depends a lot on access patterns. I expect that this
is an area where picking the right strategy and succeeding is highly
Take batching. Caching would potentially make getting multiple
batches go faster, but to benefit, you'd have to increase the
internal batch size. For example, if the user visible batch size is
20 and you wanted them to be able to get the second batch without
searching and sorting, you'd have to make your internal batch size
40. That would increase the cost for the first batch by on the order
of log(2). I suspect that most people don't look at multiple
batches, so caching to support multiple batches could be a
significant loss, even leaving memory impact aside.
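A minimal sketch of the trade-off just described, assuming the sorted batch is selected with a bounded heap (the selection cost then grows with the log of the internal batch size). The function name `first_batches` and its parameters are hypothetical illustrations, not part of the Zope 3 catalog:

```python
import heapq


def first_batches(doc_ids, sort_key, visible_batch=20, cached_batches=1):
    """Return the first `cached_batches` batches of sorted results.

    Hypothetical helper: the internal batch size is the user-visible
    batch size times the number of batches we want to serve from cache.
    """
    internal_batch = visible_batch * cached_batches
    # Bounded selection: only `internal_batch` items are kept while
    # scanning, so a larger internal batch makes the first request dearer.
    top = heapq.nsmallest(internal_batch, doc_ids, key=sort_key)
    # Split the selected items into user-visible batches.
    return [top[i:i + visible_batch]
            for i in range(0, len(top), visible_batch)]
```

With `cached_batches=2` the second batch comes from the cached result without searching or sorting again, but every first request pays for selecting 40 items instead of 20, whether or not the user ever asks for the second batch.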
OTOH, we've used some highly application specific caching strategies
in some of our commercial applications to great success. These caches
were implemented as specialized indexes, and I would argue that
indexes are really a form of caching.
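To illustrate the "indexes are a form of caching" point, here is a rough sketch of a specialized index that keeps documents pre-sorted, so that any batch is a plain slice. The class `SortedIndex` and its method names are assumptions made for this example; this is not the index used in the commercial applications mentioned above, and it ignores filtering by a query:

```python
import bisect


class SortedIndex:
    """Hypothetical index that caches the sort order of all documents."""

    def __init__(self):
        # Sorted list of (sort_value, doc_id) pairs.
        self._entries = []

    def index_doc(self, doc_id, sort_value):
        # Pay the ordering cost at write time instead of at query time.
        bisect.insort(self._entries, (sort_value, doc_id))

    def unindex_doc(self, doc_id, sort_value):
        entry = (sort_value, doc_id)
        i = bisect.bisect_left(self._entries, entry)
        if i < len(self._entries) and self._entries[i] == entry:
            del self._entries[i]

    def batch(self, start, size):
        # No searching or sorting here: a batch is just a slice.
        return [doc_id for _, doc_id in self._entries[start:start + size]]
```

The design choice is the usual caching one: the work is moved to writes, which only pays off when the access pattern actually reads in that order often enough.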
> This article about MySQL claims that MySQL is the only database
> that does query result set caching. Surprising for such an obvious
Sounds like BS to me. :)
> Perhaps it doesn't work as well as one would think and that's why
> other database engines rejected it. :)
I suspect it is a hard general strategy to get right.
Note that SQL methods support query caching and Zope's caching
framework is often used to cache various kinds of computations,
>>> I cannot back this up as I haven't done measurements. Perhaps you
>>> have done so?
>> We did a literature search.
> That's useful, but doesn't tell us very much about how they compare in
Actually, it does. But feel free to do some performance tests.
> Perhaps someone should do measurements and see how the two compare
> in a sort/batch use case. It shouldn't be too hard to set up a relational
> database-based sorted batch along with a ZODB/catalog based sorted
> and see how they both hold up.
Yup, although, to be meaningful, you need to look at large data
sets. This raises the amount of effort required.
>>> * Do you estimate the performance of the Zope 3 catalog to be
>>> equivalent to the performance of a modern relational database
>>> system for queries that need to sort and batch their results?
>> I estimate that the same issues apply to both.
> Theoretical algorithm scalability is one thing, and the same issues
> apply to both. Practical scalability might vary widely.
OK, I give up. This argument just isn't worth my time any more. I'm
sorry I objected to the original point.
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org