[ZODB-Dev] ZEO and relstorage performance

Ross J. Reedstrom reedstrm at rice.edu
Tue Oct 13 18:48:38 EDT 2009


Very interesting. I wonder how the postgresql version fares?

Ross


On Tue, Oct 13, 2009 at 05:08:07PM -0400, Jim Fulton wrote:
> I've been working on a project to speed up ZEO.  The speedup mainly
> involves getting ZEO to use more threads by giving each client its
> own thread, and changing FileStorage to allow multiple simultaneous
> readers.  This is especially valuable for us (ZC) for large databases
> (~1TB) running on multi-spindle storage systems on which multiple
> reads of the same file can take place in parallel.  I'll have more to
> say about this work in later posts.
> 
> In the course of working on this, I decided to play with Shane's
> relstorage benchmark, speedtest.  After playing with it a bit, I have
> a few observations.
> 
> - Up to a point, it does a good job of isolating just the networking
>   aspects of the mysql and ZEO protocols:
> 
>   - It uses a small enough data set to fit in ram, so the read portion
>     of the tests does no disk IO.
> 
>   - It doesn't leverage ZODB or ZEO caches at all. (Although ZEO read
>     times are penalized by the time taken to write to the ZEO cache
>     locally.)
> 
> - The tests run clients and servers on the same machine using Unix
>   Domain Sockets for communication (at least for ZEO and MySQL).
>   Generally, at least in deployments we do, the clients and servers
>   run on different machines.
> 
> - When running at high concurrency levels, the clients and server can
>   compete for CPU resources, distorting results.  This wouldn't happen
>   if the clients ran on separate machines.
> 
> - Minor nit: the test's notion of objects per transaction is off. The
>   actual number is on the order of 1/30 of the numbers reported by the
>   tests.
> 
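A side note on the "REALLY" counts in the table headings further down: the two labelled values (4 and 334) are consistent with dividing the nominal objects_per_txn by 30 and rounding up. That rule is an inference from those labels and the "1/30" remark, not something taken from the speedtest source, and `actual_objects` is a hypothetical helper name. A minimal sketch under that assumption:

```python
import math

# Assumed scale factor, taken from the "on the order of 1/30" remark above.
ASSUMED_DIVISOR = 30

def actual_objects(nominal):
    """Estimate objects really touched per transaction (hypothetical helper)."""
    # Round up, and never report fewer than one object.
    return max(1, math.ceil(nominal / ASSUMED_DIVISOR))

for nominal in (1, 100, 10000):
    print(nominal, actual_objects(nominal))
```

If the real script derives the count differently, only the divisor and rounding need to change.
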
> I decided to explore this a bit.  I modified Shane's speedtest script
> on a branch:
> 
> - Added command line options to control a number of factors, like
>   object sizes and concurrency levels.
> 
> - Added options to specify mysql connection parameters.  Among other
>   things, this lets me run the test in a "remote" configuration, in
>   which the client and server are on different machines.
> 
> - Added an option to specify a ZEO TCP address and to manage a ZEO
>   server externally.
> 
> - Replaced the single read measurement with "cold", "hot" and "steamin"
>   measurements. The "cold" number is what Shane's test originally called
>   "read".  It reads data from the server without benefit of the ZODB
>   or ZEO caches.
> 
>   The "hot" number provides timings for a second round of reads
>   after minimizing the object cache.
> 
>   The "steamin" number is the timing of a 3rd round of reads without
>   clearing the ZODB cache. I upped the size of the ZODB cache to make
>   sure the objects would fit.
> 
> Here are some results.  I'm going to provide them in tabular form, as
> I actually find this easier than charts for this data and also because
> it's less work. :) The results below are basically as output by his
> script with my modifications.
> 
> First, here are results from running clients and server on the same
> machine using unix domain sockets.  The results are grouped into 3
> tables based on objects per transaction.  Note that for the second and
> third tables I've added the actual object counts. The machine these
> were run on was a 2.2GHz Intel Core 2 Duo (two core) desktop with a
> SATA disk and 4GB of ram and running Ubuntu 9.04.  They used
> relstorage trunk as of October 5, when I made my branch, and ZODB
> 3.9.1.  The results also reflect the default relstorage poll interval
> of 0.  More on that later.  The results also reflect mysql
> configured to improve write performance as described here:
> http://shane.willowrise.com/archives/how-to-fix-the-mysql-write-speed/.
> 
> The first column is the concurrency level, which is the number of
> simultaneous clients.  The remaining columns are in 2 groups of 4, for
> ZEO and for MySQLAdapter (relstorage+mysql).  Each group has a write
> time, a cold read time, a hot read time (second set of reads after
> clearing the ZODB object cache) and a steamin time based on a 3rd set
> of reads without clearing the object cache.
> 
> 
> Columns:
> "Concurrency",
>  ZEO + FileStorage - write,
>  ZEO + FileStorage - cold,
>  ZEO + FileStorage - hot,
>  ZEO + FileStorage - steamin,
>  MySQLAdapter - write,
>  MySQLAdapter - cold,
>  MySQLAdapter - hot,
>  MySQLAdapter - steamin
> 
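For anyone who wants to chart or post-process the rows below, here is a small sketch that maps one result line onto the nine columns just listed. The `Row` and `parse_row` names are made up for illustration and are not part of speedtest.

```python
from typing import NamedTuple

class Row(NamedTuple):
    """One result line; field order follows the column list above."""
    concurrency: int
    zeo_write: float
    zeo_cold: float
    zeo_hot: float
    zeo_steamin: float
    mysql_write: float
    mysql_cold: float
    mysql_hot: float
    mysql_steamin: float

def parse_row(line):
    """Parse one result line, tolerating a leading '> ' quote marker."""
    fields = [f.strip() for f in line.lstrip("> ").split(",")]
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    return Row(int(fields[0]), *map(float, fields[1:]))

# First row of the objects_per_txn=1 table (local clients, poll interval 0).
row = parse_row(
    "> 1, 0.00992, 0.00108, 0.00015, 0.00007, 0.00405, 0.00129, 0.00076, 0.00043"
)
print(row.zeo_cold, row.mysql_cold)
```
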
> 
> Local clients, poll interval 0
> ==============================
> 
> ** Results with objects_per_txn=1 **
>    ZEO+FS --------------------------   MySQL-----------------------------
>    write    cold     hot      steamin  write    cold     hot      steamin
> 1, 0.00992, 0.00108, 0.00015, 0.00007, 0.00405, 0.00129, 0.00076, 0.00043
> 2, 0.01359, 0.00177, 0.00024, 0.00011, 0.00635, 0.00083, 0.00043, 0.00024
> 4, 0.02322, 0.00226, 0.00025, 0.00011, 0.00836, 0.00128, 0.00047, 0.00025
> 8, 0.07687, 0.00183, 0.00020, 0.00009, 0.01236, 0.00121, 0.00055, 0.00036
> 16, 0.25414, 0.00259, 0.00018, 0.00007, 0.02846, 0.00130, 0.00056, 0.00032
> 
> ** Results with objects_per_txn=100 (REALLY 4) **
>    ZEO+FS --------------------------   MySQL-----------------------------
>    write    cold     hot      steamin  write    cold     hot      steamin
> 1, 0.01352, 0.00574, 0.00062, 0.00017, 0.00841, 0.00273, 0.00159, 0.00043
> 2, 0.02414, 0.00539, 0.00035, 0.00008, 0.00678, 0.00292, 0.00202, 0.00045
> 4, 0.03136, 0.00789, 0.00035, 0.00007, 0.01343, 0.00198, 0.00108, 0.00025
> 8, 0.09697, 0.00694, 0.00036, 0.00008, 0.01910, 0.00253, 0.00111, 0.00025
> 16, 0.24361, 0.01369, 0.00037, 0.00008, 0.03413, 0.00363, 0.00158, 0.00036
> 
> ** Results with objects_per_txn=10000 (REALLY 334) **
>    ZEO+FS --------------------------   MySQL-----------------------------
>    write    cold     hot      steamin  write    cold     hot      steamin
> 1, 0.13877, 0.40306, 0.02324, 0.00042, 0.11370, 0.09461, 0.05026, 0.00063
> 2, 0.18004, 0.39529, 0.02051, 0.00045, 0.12573, 0.10313, 0.07746, 0.00072
> 4, 0.36065, 0.38792, 0.02192, 0.00050, 0.25860, 0.21972, 0.14529, 0.00150
> 8, 0.68353, 1.57573, 0.02679, 0.00110, 0.51280, 0.44516, 0.45004, 0.00126
> 16, 1.46470, 3.40687, 0.03225, 0.00057, 1.00606, 1.03924, 1.29605, 0.00102
> 
> As you can see, write and cold read times are quite a bit higher for
> ZEO, although write times get closer together as transaction size and
> concurrency increase.
> 
> Also note that the hot times are much lower for ZEO than with MySQLAdapter.
> Our ZEO cache hit rates are typically around 90%.  With a cache hit
> rate of only 75% I'd expect ZEO+FS to generally outperform MySQLAdapter.
> 
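One way to sanity-check the hit-rate argument against any row is a simple blend of the hot and cold columns. This is only a rough model and an assumption on my part: it treats the hot time as the cost of a ZEO cache hit and the cold time as the cost of a miss, and ignores everything else. The function names are hypothetical.

```python
def expected_read(hit_rate, hot, cold):
    """Blend hot (cache hit) and cold (cache miss) times by hit rate."""
    return hit_rate * hot + (1.0 - hit_rate) * cold

def break_even_hit_rate(zeo_hot, zeo_cold, mysql_hot, mysql_cold):
    """Hit rate above which ZEO+FS is faster under this model.

    Returns 0.0 if ZEO+FS is at least as fast at every hit rate, and
    None when the model has no such threshold.
    """
    cold_gap = zeo_cold - mysql_cold   # ZEO's penalty on a miss
    hot_gap = mysql_hot - zeo_hot      # ZEO's advantage on a hit
    if cold_gap <= 0 and hot_gap >= 0:
        return 0.0
    if cold_gap > 0 and hot_gap > 0:
        return cold_gap / (cold_gap + hot_gap)
    return None

# Concurrency 1, objects_per_txn=10000, local clients, poll interval 0.
threshold = break_even_hit_rate(0.02324, 0.40306, 0.05026, 0.09461)
print(threshold)
```

Plugging in other rows shows how the crossover point moves with transaction size and concurrency.
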
> The steamin times are also quite a bit lower for ZEO+FS than for
> mysql.  This is a bit surprising since data are simply being read from
> the ZODB object cache, but the overhead of polling for changes slows
> down these accesses.  Ideally, ZEO object cache hit rates are high, so
> the steamin times are highly relevant to actual application
> performance.
> 
> I shared this data with Shane who suggested running with a poll
> interval of 2.  Here are the results with a poll interval of 2.
> 
> Local clients, poll interval 2
> ==============================
> 
> ** Results with objects_per_txn=1 **
> 1, 0.00920, 0.00163, 0.00024, 0.00011, 0.00419, 0.00102, 0.00050, 0.00015
> 2, 0.01381, 0.00143, 0.00021, 0.00010, 0.00425, 0.00110, 0.00057, 0.00015
> 4, 0.03010, 0.00153, 0.00015, 0.00007, 0.00505, 0.00123, 0.00051, 0.00013
> 8, 0.06913, 0.00145, 0.00017, 0.00008, 0.01171, 0.00127, 0.00038, 0.00008
> 16, 0.21394, 0.00308, 0.00017, 0.00007, 0.02466, 0.00225, 0.00037, 0.00008
> 
> ** Results with objects_per_txn=100 (REALLY 4) **
> 1, 0.01582, 0.00571, 0.00066, 0.00013, 0.00532, 0.00249, 0.00131, 0.00015
> 2, 0.01774, 0.00612, 0.00062, 0.00013, 0.00704, 0.00244, 0.00098, 0.00009
> 4, 0.02779, 0.00710, 0.00055, 0.00012, 0.00741, 0.00384, 0.00143, 0.00009
> 8, 0.08021, 0.01067, 0.00035, 0.00007, 0.01639, 0.00323, 0.00100, 0.00009
> 16, 0.26911, 0.01602, 0.00038, 0.00007, 0.03164, 0.00462, 0.00101, 0.00009
> 
> ** Results with objects_per_txn=10000 (REALLY 334) **
> 1, 0.16153, 0.40147, 0.02417, 0.00042, 0.11959, 0.10012, 0.05048, 0.00045
> 2, 0.18652, 0.39361, 0.02055, 0.00044, 0.12947, 0.10604, 0.08080, 0.00047
> 4, 0.33065, 0.84091, 0.02331, 0.00050, 0.25859, 0.21675, 0.13139, 0.00052
> 8, 0.67337, 1.46541, 0.02905, 0.00069, 0.49674, 0.42905, 0.44064, 0.00063
> 16, 1.46586, 3.67101, 0.03427, 0.00097, 0.99446, 1.06484, 1.16689, 0.00078
> 
> Here the steamin times are very similar for ZEO and MySQLAdapter,
> although the ZEO+FS times are a bit lower.  Note however, that using a
> poll interval of 2 may cause excessive conflict errors, especially if
> there are relatively hot objects that get updated a lot.
> 
> In our deployments, the clients are on separate machines and generally
> don't compete with each other or with the server for CPU resources.
> The tables below show results with clients running on a separate 8-core
> 2.33GHz Xeon (dual quad core) machine with 24G of memory and running
> Centos 4.7.  The clients never came close to using all of the CPU
> resources available on that machine.
> 
> Remote clients, poll interval 2
> ===============================
> 
> ** Results with objects_per_txn=1 **
> 1, 0.03733, 0.00207, 0.00015, 0.00007, 0.01905, 0.00240, 0.00141, 0.00008
> 2, 0.01772, 0.00233, 0.00015, 0.00007, 0.01962, 0.00240, 0.00147, 0.00008
> 4, 0.06634, 0.00236, 0.00015, 0.00007, 0.03471, 0.00262, 0.00162, 0.00008
> 8, 0.08080, 0.00364, 0.00016, 0.00007, 0.06410, 0.00287, 0.00164, 0.00008
> 16, 0.09270, 0.00440, 0.00016, 0.00007, 0.13171, 0.00316, 0.00174, 0.00009
> 
> ** Results with objects_per_txn=100 (REALLY 4) **
> 1, 0.01809, 0.00683, 0.00034, 0.00007, 0.02432, 0.00597, 0.00480, 0.00008
> 2, 0.02210, 0.00816, 0.00034, 0.00007, 0.02873, 0.00645, 0.00513, 0.00008
> 4, 0.07079, 0.00991, 0.00036, 0.00007, 0.03521, 0.00655, 0.00520, 0.00009
> 8, 0.08739, 0.01388, 0.00035, 0.00007, 0.06754, 0.00706, 0.00557, 0.00009
> 16, 0.09264, 0.01376, 0.00035, 0.00007, 0.13904, 0.00777, 0.00593, 0.00010
> 
> ** Results with objects_per_txn=10000 (REALLY 334) **
> 1, 0.17738, 0.57640, 0.01969, 0.00038, 0.61835, 0.47054, 0.39015, 0.00041
> 2, 0.20881, 0.67896, 0.01973, 0.00038, 0.65081, 0.45832, 0.39691, 0.00043
> 4, 0.28996, 0.92163, 0.01993, 0.00038, 0.70280, 0.47962, 0.41136, 0.00044
> 8, 0.41571, 1.25167, 0.02008, 0.00040, 0.81672, 0.50079, 0.50144, 0.00045
> 16, 0.60316, 1.54352, 0.02033, 0.00039, 1.23906, 0.60130, 0.68200, 0.00049
> 
> 
> Some things to note:
> 
> - For smaller transaction sizes, ZEO+FS and MySQLAdapter write times
>   are pretty close; however, at higher levels of concurrency or for
>   large transaction sizes, ZEO+FS outperforms MySQLAdapter on writes.
> 
> - For smaller transaction sizes, ZEO+FS and MySQLAdapter cold read
>   times are pretty close. Even for larger transaction sizes, the cold
>   read times are pretty close, except at the highest concurrency
>   level.  I think what's happening for high concurrency and large
>   transaction sizes is that ZEO has reached maximum throughput and the
>   MySQLAdapter still has some breathing room.
> 
> - The hot times are more than an order of magnitude better for
>   ZEO+FS.
> 
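To put a number on the hot-read gap, here is a quick ratio calculation using the concurrency-1 rows of the three remote tables. The values are copied from the tables above; `hot_ratios` is just an illustrative helper name.

```python
# (objects_per_txn label, ZEO+FS hot, MySQLAdapter hot), concurrency 1,
# copied from the "Remote clients, poll interval 2" tables above.
HOT_TIMES = [
    ("1", 0.00015, 0.00141),
    ("100", 0.00034, 0.00480),
    ("10000", 0.01969, 0.39015),
]

def hot_ratios(rows):
    """Return {label: mysql_hot / zeo_hot} for each row."""
    return {label: mysql / zeo for label, zeo, mysql in rows}

for label, ratio in hot_ratios(HOT_TIMES).items():
    print(f"objects_per_txn={label}: MySQLAdapter/ZEO+FS hot = {ratio:.1f}x")
```

The same helper works for the other concurrency levels if the rows are swapped in.
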
> These benchmarks make ZEO+FS look pretty good relative to
> MySQLAdapter.  The overall performance assuming even moderately
> effective ZEO or object caches is significantly better for ZEO.
> Keep in mind, however, that these benchmarks don't take
> disk access on the server into account for reads, because there isn't
> any.  In practice, I'd expect server disk access times to dominate
> cold read times.  For example, in a separate benchmark with far more
> realistic access patterns against a large database, object load times
> are an order of magnitude greater than what you'd see if the data
> being read was all in RAM.
> 
> Jim
> 
> -- 
> Jim Fulton
> _______________________________________________
> For more information about ZODB, see the ZODB Wiki:
> http://www.zope.org/Wikis/ZODB/
> 
> ZODB-Dev mailing list  -  ZODB-Dev at zope.org
> https://mail.zope.org/mailman/listinfo/zodb-dev
> 

