[ZODB-Dev] Speeding up ZODB (was "redis cache for RelStorage")

Jim Fulton jim at zope.com
Thu May 5 13:29:12 EDT 2011


On Thu, May 5, 2011 at 4:43 AM, Pedro Ferreira
<jose.pedro.ferreira at cern.ch> wrote:

...

> Since we are talking about speed, does anyone have any tips on making
> ZODB (in general) faster?

It's hard to give general advice.  The basic thing to be aware of is
that loads from the ZEO cache have some cost, but are fairly cheap,
especially if your ZEO cache can fit in the local operating-system
disk cache. (You may want to add RAM to your clients.)

Loads from the storage server are quite a bit more expensive. If you
have a large database (100s of Gigs) then disk access times on your
storage server can also be a major factor.

Effective use of the ZEO cache can make a big difference.  To decide
how big your ZEO cache should be, turn on ZEO cache tracing, starting
from an empty cache, and use ZEO/scripts/cache_simul.py to experiment
with different cache sizes.
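
To illustrate why replaying a trace against several cache sizes is
useful, here is a toy sketch.  It is not cache_simul.py and does not
model ZEO's actual cache; it assumes a plain LRU policy and a made-up
trace of (oid, size) loads.  The function name and the trace are
hypothetical.

```python
from collections import OrderedDict

def lru_hit_rate(accesses, capacity):
    """Replay (oid, size) loads against an LRU cache of `capacity` bytes."""
    cache = OrderedDict()  # oid -> record size in bytes
    used = 0
    hits = 0
    for oid, size in accesses:
        if oid in cache:
            hits += 1
            cache.move_to_end(oid)  # mark as most recently used
            continue
        if size > capacity:
            continue  # record can never fit; it stays a miss
        cache[oid] = size
        used += size
        while used > capacity:
            _, evicted = cache.popitem(last=False)  # drop least recently used
            used -= evicted
    return hits / len(accesses) if accesses else 0.0

# Hypothetical trace: 4 hot objects loaded in a cycle, then 100 cold ones.
trace = [(i % 4, 100) for i in range(400)]
trace += [(1000 + i, 100) for i in range(100)]

for capacity in (200, 400, 800):
    print(capacity, lru_hit_rate(trace, capacity))
```

In this made-up trace the hit rate jumps once the cache is large
enough to hold the hot set and stops improving after that, which is
the kind of knee you would look for with real trace data.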

To enable cache tracing, run your application with the ZEO_CACHE_TRACE
environment variable set to a non-empty value.
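
A minimal sketch of launching a process with that variable set, using
only the standard library.  The helper names and the child command are
hypothetical stand-ins; substitute whatever starts your application.

```python
import os
import subprocess
import sys

def trace_env():
    """Return a copy of the current environment with cache tracing enabled."""
    env = dict(os.environ)
    env["ZEO_CACHE_TRACE"] = "1"  # any non-empty value will do
    return env

def run_with_cache_trace(argv):
    """Run `argv` as a child process with ZEO_CACHE_TRACE set."""
    return subprocess.run(argv, env=trace_env(), check=True)

# Stand-in for the real application: the child only checks the variable.
child = "import os, sys; sys.exit(0 if os.environ.get('ZEO_CACHE_TRACE') else 1)"
result = run_with_cache_trace([sys.executable, "-c", child])
print(result.returncode)
```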

> In our project, the DB is apparently the
> bottleneck, and we are considering implementing a memcache layer in
> order to avoid fetching so often from DB.

You should look at your ZEO cache configuration first.  You may be
able to improve performance quite a bit by simply increasing your ZEO
cache sizes.  If you have larger ZEO caches, you should probably use
persistent caches. You probably also want to set:

- drop-cache-rather-verify to true on the client
- invalidation-age on the server to at least an hour or two, so
  that you can be disconnected from the storage server for a
  reasonable period of time without having to verify.
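
The two settings above might look roughly like this in configuration
text.  This is a sketch under assumptions: the section names and the
server, cache-size, client, and address keys are as I recall them from
ZEO's configuration schema, invalidation-age is assumed to be in
seconds, and the address, cache size, and cache name are placeholders.
Check the schema shipped with your ZEO version before using it.

```python
def zeo_client_section(server, cache_size, cache_name):
    """Render a client section with a named (persistent) cache."""
    return "\n".join([
        "<zeoclient>",
        f"  server {server}",
        f"  cache-size {cache_size}",
        f"  client {cache_name}",  # assumed: naming the cache makes it persistent
        "  drop-cache-rather-verify true",
        "</zeoclient>",
    ])

def zeo_server_section(address, invalidation_age_hours):
    """Render a server section; invalidation-age is assumed to be seconds."""
    seconds = int(invalidation_age_hours * 3600)
    return "\n".join([
        "<zeo>",
        f"  address {address}",
        f"  invalidation-age {seconds}",
        "</zeo>",
    ])

# Placeholder values, for illustration only.
print(zeo_client_section("localhost:8100", "2GB", "app1"))
print(zeo_server_section("8100", 2))
```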

If you haven't already, make sure binary data (photos, movies, and so
on) is stored in blobs.

Consider compressing your database records:

  http://pypi.python.org/pypi/zc.zlibstorage

Not only will this save disk space, but it will allow more of your
database to fit in the storage server's disk cache.  It will allow
your ZEO caches to store 2-3 times as many records for a given cache
size.
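
To get a feel for the effect, the sketch below compresses one pickled
record with the standard zlib module.  It does not use zc.zlibstorage
itself; the record is a hypothetical dict standing in for an object's
state, and the ratio you see depends entirely on your own data.

```python
import pickle
import zlib

# Hypothetical record: a dict standing in for a persistent object's state.
state = {
    "title": "Meeting 7",
    "attendees": ["user%03d" % i for i in range(200)],
}

raw = pickle.dumps(state, protocol=2)
packed = zlib.compress(raw)

# Compression is lossless: the original state comes back unchanged.
restored = pickle.loads(zlib.decompress(packed))

print(len(raw), len(packed))
```

Running something like this over a sample of real records gives a
rough idea of how many more would fit in a cache of a given size.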

A disadvantage of ZEO caches is that they aren't shared between
processes and I've been thinking of ways to leverage something like
memcached.

> However, we were also
> wondering if we could in some way take advantage of different computer
> hardware - since the ZEO server is mostly single-threaded we thought of
> getting a machine with higher clock freq and larger cache rather than a
> commodity 8-core server (which is what we are using now).

Is your storage server CPU bound?  Starting with ZODB 3.10, ZEO
storage servers are multi-threaded. They have a thread for each
client.  We have a storage server that has run at 120% cpu on a 4-core
box.  Also, if you use zc.FileStorage, packing is mostly done in a
separate process.

> Any tips on the kind of hardware that performs best with ZODB/ZEO?

A major source of slowdown *can* be disk access times. How's IO wait
on your server?  If IO wait is high, then consider adding RAM to get a
larger cache or moving the database to an SSD. We're running our
largest, most active database on an SSD.  (Blobs are still on magnetic
disk.) Again, compression can help a lot here, allowing databases to
fit on an SSD that otherwise wouldn't.
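
One way to put a number on IO wait is sketched below.  It assumes a
Linux server, where the aggregate "cpu" line of /proc/stat lists
cumulative ticks as user, nice, system, idle, iowait, irq, softirq,
steal.  The function names and the sample lines are made up for
illustration; tools such as top or iostat report the same figure.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Keep user..steal; guest time is already counted inside user time.
    return [int(value) for value in fields[1:9]]

def iowait_fraction(before, after):
    """Fraction of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    total = sum(deltas)
    return deltas[4] / total if total else 0.0  # index 4 is iowait

# Made-up samples, as if taken a few seconds apart.
first = "cpu 100 0 50 800 50 0 0 0 0 0"
second = "cpu 150 0 70 1000 180 0 0 0 0 0"
print(iowait_fraction(first, second))
```

On a real server you would read the first line of /proc/stat twice,
some seconds apart, and pass both lines to iowait_fraction.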

> Are
> there any adjustments that can be done at the OS or even application
> layer that might improve performance?

Look at how your application is using data. If you have requests that
have to load a lot of data, maybe you can refactor your application to
load less.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton

