[ZODB-Dev] Memcache or ZEO cache (Re: ZEO and relstorage performance)

Jim Fulton jim at zope.com
Wed Oct 14 07:25:00 EDT 2009


On Tue, Oct 13, 2009 at 8:30 PM, Shane Hathaway <shane at hathawaymix.org> wrote:
> This leads to an interesting question.  Memcached or ZEO cache--which is
> better?

For what? For relstorage? or for ZEO?

> While memcached has a higher minimum performance penalty, it
> also has a lower maximum penalty, since memcached hits never have to
> wait for disk.

With modern RAM configurations, it's likely that you don't wait for
disk on read access, as reads of the ZEO cache file are likely
satisfied from the operating system's disk cache. That may explain why
the ZEO cache is faster in your tests.
In the speedtest, data are almost certainly read from memory for both
memcached and the ZEO cache, but memcached also has IPC overhead.

> Also, memcached can be shared among processes,

That's certainly a big potential win.

> there is
> a large development community around memcached,

That doesn't impress me all that much in this case.  The part of the
ZEO cache code that overlaps memcache is pretty simple.

The most complicated logic in the ZEO cache, which would be just as
complicated with another cache storage implementation and more
complicated with a shared cache storage, is making sure the cache
doesn't have stale state.  I probably need to look at memcache again,
but every time I look at it, it's not at all clear how to prevent
reading stale data as current.  At some point, I should look at the
approach you took in relstorage.
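
To make the stale-data concern concrete, here is a minimal sketch of one
possible approach, not a description of what ZEO or relstorage actually
does. It assumes shared entries are keyed by (oid, tid), so an entry can
never be wrong for its key, and that each client tracks which tid it
believes is current from the invalidations it receives. The names
SharedStore, Client and fetch_from_server are hypothetical, and a plain
dict stands in for a shared store such as memcached.

```python
class SharedStore:
    """Hypothetical stand-in for a shared cache such as memcached."""

    def __init__(self):
        self._entries = {}  # (oid, tid) -> data; immutable once written

    def store(self, oid, tid, data):
        self._entries[(oid, tid)] = data

    def load(self, oid, tid):
        return self._entries.get((oid, tid))


class Client:
    """Hypothetical client that knows which revision of each oid is current."""

    def __init__(self, shared):
        self._shared = shared
        self._current = {}  # oid -> tid this client believes is current

    def invalidate(self, oid, new_tid):
        # Called when the server reports that oid was changed in new_tid.
        self._current[oid] = new_tid

    def load(self, oid, fetch_from_server):
        tid = self._current.get(oid)
        if tid is not None:
            data = self._shared.load(oid, tid)
            if data is not None:
                return data, tid
        # Cache miss or unknown revision: ask the server (assumed to
        # return the current (data, tid) pair) and publish the result.
        data, tid = fetch_from_server(oid)
        self._current[oid] = tid
        self._shared.store(oid, tid, data)
        return data, tid
```

The hard part is moved rather than removed: each client still has to
learn about invalidations reliably and in order, which is the logic
described above as complicated.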

> and memcached creates
> opportunities for developers to be creative with caching strategies.

How so?

The biggest problem with the ZEO cache that I'm aware of today is that
it doesn't take access patterns into account when evicting data from
the cache.  As things are now, it doesn't have very accurate
information about access patterns. The most valuable objects stay in
the object cache, so the ZEO cache rarely sees requests for them.  In
the future, I plan to try modifying the cache eviction code to avoid
evicting objects that are in the object cache, although it's not at
all clear how much of a win this would be. (It would almost certainly
improve startup performance with a persistent cache.) If the cache can
be made more effective, then it can also be made smaller, lessening
the benefit of a shared cache.
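
The eviction idea above can be sketched roughly as follows. This is a
toy in-memory model, not the ZEO cache's actual file-based
implementation. PinnedAwareCache and the in_object_cache callable are
hypothetical names, and the cache size is counted in entries rather
than bytes to keep the example short.

```python
from collections import OrderedDict


class PinnedAwareCache:
    """Toy oldest-first cache that avoids evicting 'pinned' oids."""

    def __init__(self, max_entries, in_object_cache):
        self._max = max_entries
        # Hypothetical callable: oid -> True if the object is currently
        # held in the (in-memory) object cache.
        self._in_object_cache = in_object_cache
        self._data = OrderedDict()  # oldest entry first

    def store(self, oid, data):
        self._data[oid] = data
        self._data.move_to_end(oid)
        self._evict()

    def load(self, oid):
        return self._data.get(oid)

    def _evict(self):
        while len(self._data) > self._max:
            # Prefer the oldest entry that is not in the object cache.
            victim = next(
                (oid for oid in self._data if not self._in_object_cache(oid)),
                None,
            )
            if victim is None:
                # Everything is pinned: fall back to plain oldest-first.
                victim = next(iter(self._data))
            del self._data[victim]
```

With a persistent cache, the entries that survive are then the ones the
object cache will ask for again at startup, which is where the startup
benefit would come from.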

The biggest problem with ZEO performance on the client side is that
reads require round trips and that generally a client thread only
knows to request one read at a time [1]_.  I plan to add an API for
asynchronous reads.  In rare situations in which an application knows
it's going to need more than one object, it can prefetch multiple
objects at once.  (One can imagine iteration scenarios in which this
would be easy to predict.)  An opportunity that this would provide
would be to pre-fetch object revisions for objects that were in the
ZODB cache and have just been invalidated.
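
As a rough illustration of what such a prefetch API might look like
from the application side, here is a sketch. It is not the planned ZEO
API: PrefetchingClient, prefetch and load_from_server are hypothetical
names, a thread pool stands in for overlapping the round trips, and the
pending table is assumed to be used from a single application thread.

```python
from concurrent.futures import ThreadPoolExecutor


class PrefetchingClient:
    """Toy client that can start several reads before needing the results."""

    def __init__(self, load_from_server, max_workers=4):
        # load_from_server is a hypothetical blocking call: oid -> data.
        self._load = load_from_server
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = {}  # oid -> Future for a read already in flight

    def prefetch(self, oids):
        # Start the reads now so their round trips overlap.
        for oid in oids:
            if oid not in self._pending:
                self._pending[oid] = self._pool.submit(self._load, oid)

    def load(self, oid):
        future = self._pending.pop(oid, None)
        if future is not None:
            return future.result()  # waits only if the read is unfinished
        return self._load(oid)      # ordinary one-at-a-time read

    def close(self):
        self._pool.shutdown(wait=True)
```

An iteration that knows its next few oids would call prefetch(oids)
once and then load(oid) for each, paying roughly one round trip instead
of one per object. The same call could be made with the oids of
just-invalidated objects.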

.. [1] There's a related and fairly easy to fix problem that currently
   a ZEO client only makes one read request at a time, which hurts ZEO
   clients with multiple application threads.



-- 
Jim Fulton

