[ZODB-Dev] Slow Zeo load times after upgrade to ZODB 3.4

Fri Oct 21 16:53:12 EDT 2005

[Dieter Maurer]
>> The ZODB cache lives in RAM and is a Python object cache. It is a per
>> connection cache.
>>
>> The ZEO client cache lives on disk and is a pickle cache (it caches the
>> object states not the objects themselves). It is shared by all
>> connections to a single storage (in one process).

[Erik A. Dahl]
> How do these two relate to each other?

When an application using ZEO tries to load a persistent object:

1. It first looks in the ZODB cache.

2. If that's a miss (the object is not in the ZODB cache),
   it consults the ZEO cache.

3. If that's a miss too (the object isn't in the ZEO cache either),
   it asks the ZEO server for the object's state.

So with respect to the ZODB cache, the ZEO cache is a second-level cache.  The
ZODB cache has a least-recently-used replacement policy, and the hope is that
not-recently-used objects that have been booted from the ZODB cache can
usually still be found in the ZEO cache, saving the usually much greater
expense of fetching state from the ZEO server over the network.
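
To make the lookup order concrete, here is a toy model of the three steps.
It is only a sketch, not ZODB or ZEO code:  ToyObjectCache, load(), the
stats dict, and the fetch_from_server callable are all made-up names, a plain
dict stands in for the on-disk ZEO cache, and "unpickling" is faked by
wrapping the state in a tuple.

```python
from collections import OrderedDict


class ToyObjectCache:
    """Hypothetical stand-in for the per-connection ZODB cache (LRU)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, oid):
        if oid in self._data:
            self._data.move_to_end(oid)  # mark as most recently used
            return self._data[oid]
        return None

    def put(self, oid, obj):
        self._data[oid] = obj
        self._data.move_to_end(oid)
        while len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


def load(oid, object_cache, pickle_cache, fetch_from_server, stats):
    # 1. Look in the (toy) ZODB object cache.
    obj = object_cache.get(oid)
    if obj is not None:
        stats["zodb_hit"] += 1
        return obj
    # 2. On a miss, consult the (toy) ZEO cache of object states.
    state = pickle_cache.get(oid)
    if state is not None:
        stats["zeo_hit"] += 1
    else:
        # 3. On a second miss, ask the server and remember the state.
        state = fetch_from_server(oid)
        stats["server_fetch"] += 1
        pickle_cache[oid] = state
    obj = ("object", state)  # fake "unpickle" of the state
    object_cache.put(oid, obj)
    return obj
```

With an object cache of capacity 1, loading "a", then "b", then "a" again
goes to the server twice and finds the second "a" in the state cache, which
is exactly the case the second-level cache is meant to catch.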

> Do I really need the disk cache?

If you want to use ZEO, yes.

> Can it be turned off? 

No, but you can change its size.  If you're suffering from slow network
performance, though, you probably don't want to make it smaller.

> I don't have "client" set so that each process makes its own cache
> file.

That means it's a "non-persistent cache file":  it's thrown away when the
ZEO client stops, and the ZEO client doesn't remember any objects across
restarts.

> (When I set this

Setting "client" makes it a persistent cache file:  the ZEO client then
remembers (as much as it can) object states across restarts.

There are complicated tradeoffs here.  Using a persistent cache is
potentially much faster after initial startup, but _may_ make initial
startup slower.  This is because, when using a persistent cache, the ZEO
client has to talk with the ZEO server when it starts up, to figure out
which objects have changed state since the ZEO client last talked with the
server.  A number of strategies are used to speed that negotiation, but how
effective each one is depends on the details of your setup.  Separately, when
using a persistent cache the ZEO client has to scan the entire cache file on
startup in order to build an in-memory map of which objects it knows about
and where they live in the cache file.  If, e.g., you have a slow disk
and/or very large cache file, that can take appreciable time.
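
The startup scan can be pictured with a toy example.  The record layout
below (an 8-byte oid, a 4-byte length, then the state bytes) is invented for
illustration and is not the real ZEO cache file format; write_record() and
build_index() are hypothetical helpers.  The point is only that every record
header has to be visited once, so the cost grows with the size of the file.

```python
import io
import struct

# Hypothetical record header: 8-byte oid followed by a 4-byte payload length.
HEADER = struct.Struct(">8sI")


def write_record(f, oid, state):
    """Append one (oid, state) record to the toy cache file."""
    f.write(HEADER.pack(oid, len(state)))
    f.write(state)


def build_index(f):
    """Scan the whole file once; return {oid: (payload_offset, length)}."""
    index = {}
    f.seek(0)
    while True:
        header = f.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of file
        oid, length = HEADER.unpack(header)
        index[oid] = (f.tell(), length)
        f.seek(length, io.SEEK_CUR)  # skip the payload, keep only its location
    return index
```

A non-persistent cache skips this step entirely, since it starts out empty.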

By the way, setting invalidation-queue-size won't help you across a client
restart unless the client is using a persistent cache file (the ZEO server
invalidation queue is one of the strategies mentioned above that aims at
speeding startup cache verification for persistent cache files).
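
Roughly, the idea behind the invalidation queue looks like the sketch below.
This is a simplified model under my own assumptions, not the ZEO server's
actual code:  ToyServer, commit(), and invalidations_since() are made-up
names, and tids are plain integers.  The server remembers which oids the
last few transactions changed.  A reconnecting client that still has a cache
says which transaction it saw last, and if the queue reaches back that far,
the server can answer with just the changed oids instead of verifying the
whole cache.

```python
from collections import deque


class ToyServer:
    """Hypothetical model of a server-side invalidation queue."""

    def __init__(self, queue_size):
        # Each entry is (tid, oids changed by that transaction).
        # Old entries fall off the front once queue_size is exceeded.
        self.queue = deque(maxlen=queue_size)

    def commit(self, tid, oids):
        self.queue.append((tid, frozenset(oids)))

    def invalidations_since(self, last_tid):
        """Return the oids changed after last_tid, or None if the queue
        no longer reaches back far enough to answer."""
        if not self.queue or last_tid < self.queue[0][0]:
            return None  # caller must fall back to full verification
        changed = set()
        for tid, oids in self.queue:
            if tid > last_tid:
                changed.update(oids)
        return changed
```

A client with a non-persistent cache has nothing to verify after a restart,
which is why the queue size makes no difference in that case.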

> I have problems if two processes use the same zope.conf file).

I believe that ;-) ... ah, I see Paul Winkler already explained good
practice wrt this.

> Is there a doc somewhere explaining all of this??

Not really, at least not that I know of.  It would need to be a really fat
book to explain "all" of it.