[ZODB-Dev] ZEO's stats.py and simul.py

Tim Peters tim at zope.com
Wed Mar 30 11:31:43 EST 2005


[Chris Withers]
> Does anyone know if the stats.py and simul.py that ship in Zope 2.7.2's
> ZEO package are up to date?

I don't know.  The last time I know of that anyone tried to use these was in
2003, when Jeremy & I were studying alternative cache designs.  I know they
broke badly when Shane added support for APE oids, but don't know whether
that got fixed.

> I turned on cache tracing and tried to analyze the resulting logs using
> these two tools.
>
> stats.py told me I had a 98% hit rate, while
> http://www.zope.org/Wikis/ZODB/HowtoTraceZEOCache/trace.html says the
> theoretical maximum is 90%.

I can't read "90% is probably close to the theoretical maximum" that way.
Sounds like nonsense anyway, if read literally.  IIRC, 90% was actually just
the largest "theoretically perfect" hit rate we measured on actual trace
files from various Zope sites.  A hit rate arbitrarily close to 100% is
certainly possible.
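
For what it's worth, both tools read the trace file directly; options vary
between ZODB releases, so the invocations below are only a sketch (the trace
file path is whatever your ZEO client actually wrote, per the Howto page
above):

    python stats.py /path/to/cache.trace
    python simul.py /path/to/cache.trace

stats.py summarizes what the real cache did; simul.py replays the same trace
against simulated caches so that different policies and sizes can be compared.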

> simul.py barfed out on the same log file
>
> Traceback (most recent call last):
>    File "C:\zope\2.7.2\lib\python\ZEO\simul.py", line 758, in ?
>      sys.exit(main())
>    File "C:\zope\2.7.2\lib\python\ZEO\simul.py", line 132, in main
>      sim.event(ts, dlen, version, code, current, oid, serial)
>    File "C:\zope\2.7.2\lib\python\ZEO\simul.py", line 198, in event
>      self.report()
>    File "C:\zope\2.7.2\lib\python\ZEO\simul.py", line 228, in report
>      print self.format % (
> ValueError: unconvertible time

I think that answers your original question <wink>.

> In any case, what's the best way to determine how large the ClientStorage
> cache_size should be set?

Try various sizes and judge results against whatever function you're trying
to optimize.
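
If you're constructing the storage in Python rather than via zope.conf, the
knob is ClientStorage's cache_size argument (in bytes).  A minimal sketch,
assuming a ZEO server on localhost:8100; the address and size are just
placeholders to experiment with:

    from ZEO.ClientStorage import ClientStorage
    from ZODB.DB import DB

    # Ask for a ~200MB ZEO client cache instead of the small default.
    storage = ClientStorage(('localhost', 8100),
                            cache_size=200 * 1024 * 1024)
    db = DB(storage)

Rerun whatever load you care about and compare hit rates (or response times)
across a few sizes.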

> Are there any penalties for setting it too large?

The obvious one is more disk space required.  If you use a persistent ZEO
cache, then cache verification time at ZEO client connect/reconnect times
may also increase proportionately.  Other than those, bigger is probably
better, and the default size (20MB) is much smaller than is usually
desirable (it's left over from days when typical disks were much smaller
than they are now).  Try, e.g., 200MB.  Like the results better?  Iterate.
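
In a Zope 2.7-style zope.conf that means the cache-size setting inside the
<zeoclient> section.  Don't confuse it with the cache-size directly under
<zodb_db>, which counts in-memory objects rather than bytes on disk.  A
sketch, with the server address and var path as placeholders:

    <zodb_db main>
        mount-point /
        # in-memory ZODB object cache, counted in objects
        cache-size 5000
        <zeoclient>
          server localhost:8100
          storage 1
          name zeostorage
          var $INSTANCE/var
          # on-disk ZEO client cache, counted in bytes
          cache-size 200MB
        </zeoclient>
    </zodb_db>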

Note that while the ZEO cache is disk-based, it does have in-memory index
structures taking space proportional to the number of objects cached.  I
suppose that if the cache file were big enough to hold millions of objects,
the RAM consumed by those indices could become burdensome.  Haven't heard of
that happening in real life, though.

Of course all files in active use compete for your OS's own disk caching
resources.


