[Zope-dev] Re: Caching ZCatalog results

Fri Feb 23 13:57:37 EST 2007

Tres Seaver wrote:
> Tres Seaver wrote:
>> Roché Compaan wrote:
>>>> On Fri, 2007-02-23 at 06:55 -0500, Tres Seaver wrote:
>>>>> Roché Compaan wrote:
>>>>>> I'm curious, has anybody played around with the idea of caching ZCatalog
>>>>>> results and if I submitted a patch to do this would it be accepted?
>>>>>>
>>>>>> I quickly coded some basic caching of results on a volatile attribute
>>>>>> and I was really surprised with the amount of cache hits I got
>>>>>> (especially with a Plone site that is a heavy user of the catalog)
>>>>> +1.  I think using the 'ZCachable' stuff (e.g., adding a RAMCacheManager
>>>>> and associating a catalog to it) would be the sanest path here.
>>>> Cool idea. I haven't done any coding involving OFS.Cache though. Looking
>>>> at it briefly it looks like one can modify the catalog to subclass
>>>> OFS.Cacheable and then use the ZCacheable_get, ZCacheable_set and
>>>> ZCacheable_invalidate methods to interact with a cache manager. This
>>>> needs to be pretty explicit though. Are there any side effects that I
>>>> should guard against if the catalog subclasses OFS.Cache?
>> I don't think so.  Here are some random thoughts on the idea:
>>
>>  -  The 'searchResults' method must pass its keyword arguments as
>>     part of the cache key.
>>
>>  - I don't know if there is a reasonable way to do 'mtime' for
>>    the catalog:  we would like to be able to get an mtime cheaply
>>    for the BTrees (indexes, the 'data' container), but I don't know
>>    if that is possible.
>>
>>  - The "right" place to do this feels like the 'searchResults' of
>>    ZCatalog, just before it calls 'self._catalog.searchResults'.
>>
>>  - The CMF's catalog overrides 'searchResults', but calls it at
>>    the end, so everything there should work.
> 
> Hmm, on further thought:
> 
>  - It isn't safe to stash persistent objects in the RAM Cache manager,
>    because they can't be used safely from another database connection.
> 
>  - The result set you get back from a query is a "lazy", which will
>    be consumed by each client:  no two clients will see the same
>    thing.
> 
> Maybe this won't work, after all.

I have had little exposure to the actual innards of the Zope 2 catalog, 
so this might be too wild for good old Zope 2, but here it goes:

In Zope 3, catalog indices actually only deal with integers (int ids) 
that only *represent* actual objects (this is for optimization reasons, 
and so they don't actually have to hang on to persistent objects 
themselves). This way a Zope 3 catalog search result is actually a list 
of integers, not a list of objects (or brains, or whatever). Of course, 
you can use that list of integers to look up the corresponding objects 
with an integer ID utility (and there is a convenience API for that, but 
that's not important).
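
To make that concrete, here is a toy sketch in plain Python. It is not
the Zope 3 API: IntIdRegistry, FieldIndex and their method names are
made up for illustration. The point is only that the index stores and
returns integers, and the objects are looked up separately afterwards.

```python
class IntIdRegistry:
    """Hypothetical stand-in for an integer ID utility."""

    def __init__(self):
        self._by_id = {}
        self._next_id = 1

    def register(self, obj):
        # Hand out a fresh integer and remember which object it represents.
        uid = self._next_id
        self._next_id += 1
        self._by_id[uid] = obj
        return uid

    def get_object(self, uid):
        return self._by_id[uid]


class FieldIndex:
    """Hypothetical index that only ever stores integer ids."""

    def __init__(self):
        self._values = {}  # indexed value -> set of int ids

    def index(self, uid, value):
        self._values.setdefault(value, set()).add(uid)

    def apply(self, value):
        # The search result is a list of integers, not objects.
        return sorted(self._values.get(value, ()))


registry = IntIdRegistry()
index = FieldIndex()
for title, kind in [("Front page", "Document"),
                    ("Logo", "Image"),
                    ("About", "Document")]:
    doc = {"title": title, "portal_type": kind}
    index.index(registry.register(doc), kind)

result_ids = index.apply("Document")
titles = [registry.get_object(uid)["title"] for uid in result_ids]
```

Here result_ids holds plain integers, and only the last line touches
the objects themselves.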

A wild guess is that the Zope 2 index does the same or at least 
something similar. Perhaps not in a nice componentized manner like Zope 
3 does (using a separate utility for the int id mapping), but I do 
recall the ZCatalog storing "uids". RAM caching those integers should be 
absolutely possible. Heck, they don't even need much space and can be 
kept efficiently in data structures...
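
A rough sketch of what I mean, again in plain Python and under a few
assumptions: IntIdResultCache and fake_query are hypothetical names,
the cache key is built from the query's keyword arguments, and
invalidation is a blunt "clear everything" that something would have
to call whenever the catalog changes. Nothing here is existing
ZCatalog or RAMCacheManager code.

```python
class IntIdResultCache:
    """Hypothetical RAM cache holding tuples of integer ids per query."""

    def __init__(self):
        self._results = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def make_key(query):
        # Sort the keyword arguments so the key does not depend on their
        # order; repr() keeps unhashable values (lists, dicts) usable.
        return tuple(sorted((name, repr(value))
                            for name, value in query.items()))

    def search(self, query, run_query):
        key = self.make_key(query)
        if key in self._results:
            self.hits += 1
        else:
            self.misses += 1
            # Store an immutable tuple of ints: no persistent objects and
            # no lazy sequence that one client could consume for another.
            self._results[key] = tuple(run_query(**query))
        return self._results[key]

    def invalidate(self):
        # Assumed to be called on every (re)index or uncatalog operation.
        self._results.clear()


def fake_query(**query):
    # Stand-in for the real index lookup; returns integer ids only.
    return [1, 3] if query.get("portal_type") == "Document" else []


cache = IntIdResultCache()
first = cache.search({"portal_type": "Document"}, fake_query)
second = cache.search({"portal_type": "Document"}, fake_query)
```

The second search is served from the cache. Each caller would then
resolve the integers to objects or brains within its own database
connection, so nothing persistent is shared between connections.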

It may require a bit of hacking the catalog, of course. Perhaps it's 
time to start thinking about componentizing the Zope 2 catalog to make 
such things easier in the future?

-- 
http://worldcookery.com -- Professional Zope documentation and training
Next Zope 3 training at Camp5: http://trizpug.org/boot-camp/camp5