[ZODB-Dev] documentation on the various caching options

Thu Jan 10 08:01:35 UTC 2013

On Wednesday 09 January 2013 19:16:02, Claudiu Saftoiu wrote:
> zodb_uri =
> zeo://%(here)s/zeo.sock?cache_size=1000MB&connection_cache_size=100000&connection_pool_size=30
[...]
> - What is cache-size?

This is probably the "connection_cache_size" parameter.
It is a size expressed as a number of objects (because computing the size of
an object is highly non-trivial): it is the number of live objects the
connection tries not to exceed.
A live object is a Python object coming from ZODB that you can use in your
code (as "a" in "a.b").

The actual number of live objects can exceed this limit if there are
uncommitted changes to them (so evicting them would lose those changes), or
if there are strong references to them and they cannot be garbage-collected
(as in, building a Python list of objects: no entry can be freed until the
list is).
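
To illustrate the "target, not hard limit" behaviour, here is a minimal toy
model in Python. It is not ZODB's actual cache implementation: the class
ToyObjectCache and its "pinned" flag are hypothetical names, with "pinned"
standing in for objects that have uncommitted changes or are strongly
referenced from your code.

```python
from collections import OrderedDict


class ToyObjectCache:
    """Toy per-connection cache bounded by a target object count."""

    def __init__(self, target_size):
        self.target_size = target_size
        self._objects = OrderedDict()  # oid -> state, oldest first
        self._pinned = set()           # oids that must not be evicted

    def add(self, oid, state, pinned=False):
        self._objects[oid] = state
        self._objects.move_to_end(oid)
        if pinned:
            self._pinned.add(oid)

    def unpin(self, oid):
        self._pinned.discard(oid)

    def collect(self):
        """Evict the oldest unpinned objects until the target is met."""
        for oid in list(self._objects):
            if len(self._objects) <= self.target_size:
                break
            if oid not in self._pinned:
                del self._objects[oid]

    def __len__(self):
        return len(self._objects)


cache = ToyObjectCache(target_size=2)
for oid in ("a", "b", "c"):
    cache.add(oid, state={}, pinned=True)  # e.g. uncommitted changes
cache.add("d", state={})

cache.collect()
size_while_pinned = len(cache)  # only "d" could be evicted

for oid in ("a", "b", "c"):
    cache.unpin(oid)            # e.g. the transaction committed
cache.collect()
size_after_commit = len(cache)  # back down to the target
```

While the three objects are pinned the cache stays above its target; once
they are released, the next collection brings it back down.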

As a connection also represents a snapshot of the database, each connection 
has its own copy of this cache. This can matter a lot if you want many 
parallel transactions.

connection_pool_size is related to that last point: when a transaction ends,
it returns its connection to a pool, so that the connection can be recycled
rather than created from scratch on the next transaction, which would lose
the whole attached cache.
Here again, pool size is just a hint and not a hard limit. If you go above the
limit, you'll get warning logs, and if it increases further you'll get
error-level logs. Set it to a value which corresponds to the number of
transactions you wish to have running concurrently (in Zope, that would be the
number of worker threads), so you know if something somewhere opens extra
connections (bypassing transaction isolation somewhat).
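
A rough sketch of that recycling and soft-limit behaviour, as a toy model in
Python rather than ZODB's real pool. ToyConnectionPool is a hypothetical
name, connections are plain dicts, and the "twice the pool size" threshold
for error-level logs is an assumption made for the example.

```python
import logging

logger = logging.getLogger("toy_pool")


class ToyConnectionPool:
    """Toy pool: recycles connections and logs when the hint is exceeded."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.opened = 0   # connections created so far
        self._idle = []   # returned connections, caches still attached

    def open(self):
        if self._idle:
            return self._idle.pop()  # recycled, cache intact
        self.opened += 1
        if self.opened > 2 * self.pool_size:  # assumed threshold
            logger.error("%d connections, pool size is %d",
                         self.opened, self.pool_size)
        elif self.opened > self.pool_size:
            logger.warning("%d connections, pool size is %d",
                           self.opened, self.pool_size)
        # The limit is only a hint: a connection is handed out anyway.
        return {"id": self.opened, "cache": {}}

    def close(self, conn):
        self._idle.append(conn)


pool = ToyConnectionPool(pool_size=2)

first = pool.open()
first["cache"]["some-oid"] = "loaded object"
pool.close(first)          # transaction ends, connection goes back

second = pool.open()       # next transaction gets the same connection
reused = second is first
cache_survived = second["cache"] == {"some-oid": "loaded object"}

third = pool.open()        # 2 connections: at the limit
fourth = pool.open()       # 3 connections: logs a warning, still works
```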

> - What is cache-size-bytes?

This must be the "cache_size" parameter.
It sizes the pickle cache, i.e. a cache containing the serialised
representation of objects. The size of a pickled object is trivial to
compute, as it's just a byte string.

It is there to avoid a network round-trip to ask the ZEO process for the
object data.

This cache is shared by all concurrent transactions.
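
As a minimal sketch of why a byte limit is easy to enforce on pickles, here
is a toy cache in Python. It is not how the ZEO client cache is actually
implemented; ToyPickleCache is a hypothetical name and the oldest-first
eviction is an assumption made for the example.

```python
import pickle
from collections import OrderedDict


class ToyPickleCache:
    """Toy cache of pickles, bounded by total size in bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._data = OrderedDict()  # oid -> pickled bytes, oldest first

    def store(self, oid, obj):
        data = pickle.dumps(obj)
        if oid in self._data:
            self.used_bytes -= len(self._data.pop(oid))
        self._data[oid] = data
        self.used_bytes += len(data)  # size is just the length of the bytes
        while self.used_bytes > self.max_bytes and self._data:
            _, old = self._data.popitem(last=False)
            self.used_bytes -= len(old)

    def load(self, oid):
        """Return the object, or None if the server would have to be asked."""
        data = self._data.get(oid)
        return None if data is None else pickle.loads(data)


one = len(pickle.dumps("x" * 100))          # size of one sample pickle
cache = ToyPickleCache(max_bytes=one + one // 2)

cache.store(1, "x" * 100)
cache.store(2, "y" * 100)  # does not fit next to oid 1, so oid 1 is evicted

miss = cache.load(1)       # would need a round-trip to the server
hit = cache.load(2)        # served locally
```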

> - Are there any other caching options?

The option to make the pickle cache persistent comes to mind. There may be
others that I don't recall right now.

> - Where would they go in my `zeo.conf`, or if they can't go there, what
> should I change so I can configure this? I tried putting them under <zeo>,
> <filestorage>, and <blobstorage>, but I get "not a known key name" errors.

Those parameters belong to the client part of ZODB, not the server (ZEO) part.
ZEO nevertheless has one option, the invalidation queue size. This is used
when a client reconnects to ZEO and needs to know if there are cached entries
it needs to flush (i.e., objects which were modified by other clients). The
client remembers the latest invalidation it received, and if that one is still
in the invalidation queue then the client can replay the invalidation requests
that followed it and evict just the modified objects. Otherwise, it needs to
flush the whole cache.
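
The reconnection logic can be sketched with a toy model in Python. This is
not ZEO's actual code or protocol: ToyInvalidationQueue, its method names
and the integer transaction ids are all hypothetical.

```python
from collections import deque


class ToyInvalidationQueue:
    """Toy server-side queue holding the most recent invalidations."""

    def __init__(self, size):
        # Oldest entries fall off once `size` is reached.
        self._queue = deque(maxlen=size)  # (tid, oids)

    def record(self, tid, oids):
        self._queue.append((tid, frozenset(oids)))

    def since(self, last_tid):
        """Return oids modified after last_tid, or None if the client
        has fallen too far behind and must flush its whole cache."""
        if last_tid not in [tid for tid, _ in self._queue]:
            return None
        changed = set()
        for tid, oids in self._queue:
            if tid > last_tid:
                changed |= oids
        return changed


queue = ToyInvalidationQueue(size=3)
queue.record(1, {"a"})
queue.record(2, {"b"})
queue.record(3, {"a", "c"})
queue.record(4, {"d"})         # pushes transaction 1 out of the queue

partial = queue.since(2)       # client last saw transaction 2
up_to_date = queue.since(4)    # client already saw everything
too_old = queue.since(1)       # no longer in the queue: flush everything
```

A larger queue means a client can stay disconnected for longer and still
keep most of its cache on reconnection.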

I don't think there are other options on the ZEO side. I encourage you to read
the code to be sure (no need to read all of it, but there are helpful docs,
comments & docstrings all over the ZODB code). Check http://www.zodb.org too.

-- 
Vincent Pelletier