[ZODB-Dev] Understanding the ZODB cache-size option

Mon Mar 22 15:01:30 EDT 2010

On Mon, Mar 22, 2010 at 7:05 PM, Jeff Shell <jeff at bottlerocket.net> wrote:
> Are there any metrics about how to set the ZODB 'cache-size' (or cache-size-bytes) option? We've been using '5000' (arbitrarily chosen) for our Zope 3.4-ish app servers. We have access to zc.z3monitor which can output the "number of objects in the object caches (combined)" and "number of non-ghost objects in the object caches (combined)".
>
> But I don't understand how to interpret those numbers and use them to make better settings.

ZODB cache size optimization is a typical case of black art. There's
no really good way to find the perfect number. I would advise against
using the "cache-size-bytes" option. There's a known critical problem
with it, which is currently fixed only on the 3.9 SVN branch. So stick
to the old object-count cache-size for now.

But generally database caching is a trade-off between performance and
available RAM. As the upper limit, you could have your entire data set
in each ZODB cache. So you could look at the number of persistent
objects in the database and match your cache-size to that number.
That's usually not what you want.

As a real strategy, you should set up detailed monitoring of the
server. Monitor and graph overall RAM usage, RAM usage per Zope
process, number of DB loads and writes over time. Preferably include
some way of measuring application request performance and track CPU
and I/O usage on the server hosting the database.

Once you have those numbers, you can play around with the cache setting
and increase it. See what impact it has on your application and data
set. At some point you either run out of memory and need to decrease
the number, or the increased cache size no longer buys you any
application performance.
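
The tuning loop described above can be sketched in a few lines of
Python. This is only an illustration: the function name, the
(cache_size, avg_request_seconds, process_ram_mb) tuple layout and the
5% threshold are assumptions, not part of ZODB or zc.z3monitor, and the
sample numbers are made up. Each tuple stands for one trial run whose
results you read off your own monitoring graphs.

```python
def pick_cache_size(samples, ram_budget_mb, min_gain=0.05):
    """Pick a cache size from measured trial runs.

    samples: list of (cache_size, avg_request_seconds, process_ram_mb),
             one tuple per trial. Request times must be positive.
    ram_budget_mb: RAM you are willing to give one Zope process.
    min_gain: smallest relative speed-up that justifies a larger cache.
    """
    # Drop trials that ran out of memory budget, order the rest by cache size.
    fitting = sorted(s for s in samples if s[2] <= ram_budget_mb)
    if not fitting:
        raise ValueError("no trial fits the RAM budget")

    best = fitting[0]
    for candidate in fitting[1:]:
        gain = (best[1] - candidate[1]) / best[1]
        if gain < min_gain:
            break  # the larger cache no longer buys enough performance
        best = candidate
    return best[0]


# Hypothetical measurements, for illustration only.
trials = [
    (5000, 0.80, 400),
    (10000, 0.50, 600),
    (20000, 0.48, 900),
    (40000, 0.47, 1500),
]
print(pick_cache_size(trials, ram_budget_mb=1000))  # prints 10000
```

Stopping at the first step that gains less than min_gain keeps the
smaller cache once the curve flattens, which matches the "doesn't buy
you anything anymore" criterion.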

For general Zope 3 applications there are no rules of thumb that I
know of. The datasets and load patterns of the applications are too
different for that. The right number depends a lot on the dataset and
on whether you use ZODB blobs or another mechanism to store large
binary content outside the DB. 5000 persistent objects including 1000
images of 5 MB each are obviously very different from 1000 BTree
buckets containing only integers. One main advantage of blobs is that
they aren't loaded into the ZODB connection caches, so they lower the
memory requirements for applications with binary content a lot.
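
To put rough numbers on that comparison, here is a back-of-the-envelope
calculation. It is a sketch under stated assumptions: the helper name is
made up, the 2 KB bucket size is a guess for illustration, and real
in-memory sizes differ from stored sizes. The `caches` argument reflects
that every open connection keeps its own object cache.

```python
MB = 1024 * 1024


def cache_ram_mb(object_count, avg_object_bytes, caches=1):
    """Rough RAM (in MB) if every object is loaded in every cache."""
    return object_count * avg_object_bytes * caches / MB


# 1000 images of 5 MB each, stored as ordinary persistent objects.
images = cache_ram_mb(1000, 5 * MB)

# 1000 BTree buckets of integers; ~2 KB per bucket is an assumption.
buckets = cache_ram_mb(1000, 2 * 1024)

print(images)   # 5000.0 MB for a single cache
print(buckets)  # about 2 MB for a single cache
```

The same object count differs by more than three orders of magnitude in
memory, which is why an object-count setting can't be chosen without
knowing the data.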

In the Plone context I generally use something like "number of content
objects in the catalog" + 5000 objects for the general application as
a starting point. But that makes a lot of assumptions about the type of
content and the application in it.
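
Written out, that starting point is just a sum. The helper below is
hypothetical; the optional clamp to the total number of objects in the
database is my own addition, based on the upper limit mentioned earlier
(a cache can't usefully hold more objects than exist).

```python
def plone_starting_cache_size(catalog_objects, app_overhead=5000,
                              total_objects=None):
    """Starting value for cache-size: catalog content + application objects."""
    start = catalog_objects + app_overhead
    if total_objects is not None:
        # Assumption: never exceed the number of objects in the database.
        start = min(start, total_objects)
    return start


print(plone_starting_cache_size(20000))  # prints 25000
```

Treat the result as the first value to try, then adjust it against your
monitoring data.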

Hanno