[ZODB-Dev] ZEO and relstorage performance

Laurence Rowe l at lrowe.co.uk
Tue Oct 13 19:11:08 EDT 2009


Shane's earlier benchmarks show MySQL to be the fastest RelStorage backend:
http://shane.willowrise.com/archives/relstorage-10-and-measurements/

Laurence

2009/10/13 Ross J. Reedstrom <reedstrm at rice.edu>:
> Very interesting. I wonder how the postgresql version fares?
>
> Ross
>
>
> On Tue, Oct 13, 2009 at 05:08:07PM -0400, Jim Fulton wrote:
>> I've been working on a project to speed up ZEO.  The speedup mainly
>> involves getting ZEO to use more threads by giving each client its
>> own thread, and changing FileStorage to allow multiple simultaneous
>> readers.  This is especially valuable for us (ZC) for large databases
>> (~1TB) running on multi-spindle storage systems on which multiple
>> reads of the same file can take place in parallel.  I'll have more to
>> say about this work in later posts.
>>
>> In the course of working on this, I decided to play with Shane's
>> relstorage benchmark, speedtest.  After playing with it a bit, I have
>> a few observations.
>>
>> - Up to a point, it does a good job of isolating just the networking
>>   aspects of the mysql and ZEO protocols:
>>
>>   - It uses a small enough data set to fit in ram, so the read portion
>>     of the tests does no disk IO.
>>
>>   - It doesn't leverage ZODB or ZEO caches at all. (Although ZEO read
>>     times are penalized by the time taken to write to the ZEO cache
>>     locally.)
>>
>> - The tests run clients and servers on the same machine using Unix
>>   Domain Sockets for communication (at least for ZEO and MySQL).
>>   Generally, at least in deployments we do, the clients and servers
>>   run on different machines.
>>
>> - When running at high concurrency levels, the clients and server can
>>   compete for CPU resources, distorting results.  This wouldn't happen
>>   if the clients ran on separate machines.
>>
>> - Minor nit: the test's notion of objects per transaction is off. The
>>   actual number is on the order of 1/30 of the number reported by the
>>   tests.
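
The "REALLY 4" and "REALLY 334" annotations in the table headings further down are consistent with dividing the nominal count by 30 and rounding up. That rounding rule is an assumption made here to match the quoted figures, not something taken from the speedtest source. A minimal sketch:

```python
import math

# Assumption: the nominal objects_per_txn overstates the real count by a
# factor of about 30, and the real count is rounded up. The factor comes
# from the "on the order of 1/30" remark; the rounding is a guess that
# happens to reproduce the "REALLY 4" and "REALLY 334" annotations.
ASSUMED_FACTOR = 30


def actual_objects(nominal, factor=ASSUMED_FACTOR):
    """Estimate the real number of objects per transaction."""
    return math.ceil(nominal / factor)


for nominal in (1, 100, 10000):
    print(nominal, "->", actual_objects(nominal))
```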
>>
>> I decided to explore this a bit.  I modified Shane's speedtest script
>> on a branch:
>>
>> - Added command line options to control a number of factors, like
>>   object sizes and concurrency levels.
>>
>> - Added options to specify mysql connection parameters.  Among other
>>   things, this lets me run the test in a "remote" configuration, in
>>   which the client and server are on different machines.
>>
>> - Added an option to specify a ZEO TCP address and to manage a ZEO
>>   server externally.
>>
>> - Replaced the single read measurement with "cold", "hot" and "steamin"
>>   measurements. The "cold" number is what Shane's test originally called
>>   "read".  It reads data from the server without benefit of the ZODB
>>   or ZEO caches.
>>
>>   The "hot" number provides timings for a second round of reads
>>   after minimizing the object cache.
>>
>>   The "steamin" number is the timing of a 3rd round of reads without
>>   clearing the ZODB cache. I upped the size of the ZODB cache to make
>>   sure the objects would fit.
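
To make the three read measurements concrete, here is a toy model of the ZEO read path. It is not the speedtest code and uses no ZODB or ZEO APIs; `ToyClient` and its attributes are hypothetical names, and plain dicts stand in for the server, the ZEO client cache and the ZODB object cache. It only shows which layer answers each round of reads.

```python
class ToyClient:
    """Toy read path: object cache, then client cache, then server."""

    def __init__(self, server_data):
        self.server = dict(server_data)  # stands in for the storage server
        self.client_cache = {}           # stands in for the ZEO client cache
        self.object_cache = {}           # stands in for the ZODB object cache
        self.server_reads = 0
        self.client_cache_hits = 0
        self.object_cache_hits = 0

    def load(self, oid):
        if oid in self.object_cache:
            self.object_cache_hits += 1
            return self.object_cache[oid]
        if oid in self.client_cache:
            self.client_cache_hits += 1
            value = self.client_cache[oid]
        else:
            self.server_reads += 1
            value = self.server[oid]
            # A cold read also pays for writing to the client cache.
            self.client_cache[oid] = value
        self.object_cache[oid] = value
        return value

    def minimize_object_cache(self):
        """Drop everything from the object cache, keep the client cache."""
        self.object_cache.clear()


def read_all(client, oids):
    for oid in oids:
        client.load(oid)


oids = range(100)
client = ToyClient({oid: b"x" * 64 for oid in oids})

read_all(client, oids)           # "cold": every read goes to the server
client.minimize_object_cache()
read_all(client, oids)           # "hot": served from the client cache
read_all(client, oids)           # "steamin": served from the object cache

print(client.server_reads, client.client_cache_hits, client.object_cache_hits)
```

Each counter ends at 100 here, one round of reads per layer.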
>>
>> Here are some results.  I'm going to provide them in tabular form, as
>> I actually find this easier than charts for this data and also because
>> it's less work. :) The results below are basically as output by his
>> script with my modifications.
>>
>> First, here are results from running clients and server on the same
>> machine using Unix domain sockets.  The results are grouped into 3
>> tables based on objects per transaction.  Note that for the second and
>> third tables I've added the actual object counts. The machine these
>> were run on was a 2.2GHz Intel Core 2 Duo (two core) desktop with a
>> SATA disk and 4GB of RAM, running Ubuntu 9.04.  The tests used
>> relstorage trunk as of October 5, when I made my branch, and ZODB
>> 3.9.1.  The results also reflect the default relstorage poll interval
>> of 0.  More on that later.  The results also reflect mysql
>> configured to improve write performance as described here:
>> http://shane.willowrise.com/archives/how-to-fix-the-mysql-write-speed/.
>>
>> The first column is the concurrency level, which is the number of
>> simultaneous clients.  The remaining columns are in 2 groups of 4, for
>> ZEO and for MySQLAdapter (relstorage+mysql).  Each group has a write
>> time, a cold read time, a hot read time (second set of reads after
>> clearing the ZODB objects cache) and a steamin time based on a 3rd set
>> of reads without clearing the object cache.
>>
>>
>> Columns:
>> "Concurrency",
>>  ZEO + FileStorage - write,
>>  ZEO + FileStorage - cold,
>>  ZEO + FileStorage - hot,
>>  ZEO + FileStorage - steamin,
>>  MySQLAdapter - write,
>>  MySQLAdapter - cold,
>>  MySQLAdapter - hot,
>>  MySQLAdapter - steamin
>>
>>
>> Local clients, poll interval 0
>> ==============================
>>
>> ** Results with objects_per_txn=1 **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin
>> 1, 0.00992, 0.00108, 0.00015, 0.00007, 0.00405, 0.00129, 0.00076, 0.00043
>> 2, 0.01359, 0.00177, 0.00024, 0.00011, 0.00635, 0.00083, 0.00043, 0.00024
>> 4, 0.02322, 0.00226, 0.00025, 0.00011, 0.00836, 0.00128, 0.00047, 0.00025
>> 8, 0.07687, 0.00183, 0.00020, 0.00009, 0.01236, 0.00121, 0.00055, 0.00036
>> 16, 0.25414, 0.00259, 0.00018, 0.00007, 0.02846, 0.00130, 0.00056, 0.00032
>>
>> ** Results with objects_per_txn=100 (REALLY 4) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin
>> 1, 0.01352, 0.00574, 0.00062, 0.00017, 0.00841, 0.00273, 0.00159, 0.00043
>> 2, 0.02414, 0.00539, 0.00035, 0.00008, 0.00678, 0.00292, 0.00202, 0.00045
>> 4, 0.03136, 0.00789, 0.00035, 0.00007, 0.01343, 0.00198, 0.00108, 0.00025
>> 8, 0.09697, 0.00694, 0.00036, 0.00008, 0.01910, 0.00253, 0.00111, 0.00025
>> 16, 0.24361, 0.01369, 0.00037, 0.00008, 0.03413, 0.00363, 0.00158, 0.00036
>>
>> ** Results with objects_per_txn=10000 (REALLY 334) **
>>    ZEO+FS --------------------------   MySQL-----------------------------
>>    write    cold     hot      steamin  write    cold     hot      steamin
>> 1, 0.13877, 0.40306, 0.02324, 0.00042, 0.11370, 0.09461, 0.05026, 0.00063
>> 2, 0.18004, 0.39529, 0.02051, 0.00045, 0.12573, 0.10313, 0.07746, 0.00072
>> 4, 0.36065, 0.38792, 0.02192, 0.00050, 0.25860, 0.21972, 0.14529, 0.00150
>> 8, 0.68353, 1.57573, 0.02679, 0.00110, 0.51280, 0.44516, 0.45004, 0.00126
>> 16, 1.46470, 3.40687, 0.03225, 0.00057, 1.00606, 1.03924, 1.29605, 0.00102
>>
>> As you can see, write and cold read times are quite a bit higher for
>> ZEO, although write times get closer together as transaction size and
>> concurrency increase.
>>
>> Also note that the hot times are much lower for ZEO than with MySQLAdapter.
>> Our ZEO cache hit rates are typically around 90%.  With a cache hit
>> rate of only 75% I'd expect ZEO+FS to generally outperform MySQLAdapter.
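
One way to put rough numbers on the hit-rate argument is a simple weighted average of hot and cold read times. The linear blend is an assumption made here for illustration, not a model stated in the post, and `blended_read_time` is a hypothetical helper. The figures are from the objects_per_txn=1, concurrency 1 row with poll interval 0.

```python
def blended_read_time(hit_rate, hot, cold):
    """Expected read time if a hit_rate share of reads cost `hot`
    and the remainder cost `cold` (a simple linear model)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hot + (1.0 - hit_rate) * cold


# Local clients, poll interval 0, objects_per_txn=1, concurrency 1.
ZEO_COLD, ZEO_HOT = 0.00108, 0.00015
MYSQL_COLD, MYSQL_HOT = 0.00129, 0.00076

for rate in (0.75, 0.90):
    zeo = blended_read_time(rate, ZEO_HOT, ZEO_COLD)
    print(f"hit rate {rate:.0%}: ZEO+FS blended {zeo:.5f}, "
          f"MySQLAdapter hot {MYSQL_HOT:.5f}, cold {MYSQL_COLD:.5f}")
```

For this row the blended ZEO+FS figure at a 75% hit rate comes in below MySQLAdapter's hot time. Other rows, the large-transaction ones in particular, can come out differently, so treat this as a sketch of the arithmetic rather than a conclusion.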
>>
>> The steamin times are also quite a bit lower for ZEO+FS than for
>> mysql.  This is a bit surprising since data are simply being read from
>> the ZODB object cache, but the overhead of polling for changes slows
>> down these accesses.  Ideally, ZEO object cache hit rates are high, so
>> the steamin times are highly relevant to actual application
>> performance.
>>
>> I shared this data with Shane who suggested running with a poll
>> interval of 2.  Here are the results with a poll interval of 2.
>>
>> Local clients, poll interval 2
>> ==============================
>>
>> ** Results with objects_per_txn=1 **
>> 1, 0.00920, 0.00163, 0.00024, 0.00011, 0.00419, 0.00102, 0.00050, 0.00015
>> 2, 0.01381, 0.00143, 0.00021, 0.00010, 0.00425, 0.00110, 0.00057, 0.00015
>> 4, 0.03010, 0.00153, 0.00015, 0.00007, 0.00505, 0.00123, 0.00051, 0.00013
>> 8, 0.06913, 0.00145, 0.00017, 0.00008, 0.01171, 0.00127, 0.00038, 0.00008
>> 16, 0.21394, 0.00308, 0.00017, 0.00007, 0.02466, 0.00225, 0.00037, 0.00008
>>
>> ** Results with objects_per_txn=100 (REALLY 4) **
>> 1, 0.01582, 0.00571, 0.00066, 0.00013, 0.00532, 0.00249, 0.00131, 0.00015
>> 2, 0.01774, 0.00612, 0.00062, 0.00013, 0.00704, 0.00244, 0.00098, 0.00009
>> 4, 0.02779, 0.00710, 0.00055, 0.00012, 0.00741, 0.00384, 0.00143, 0.00009
>> 8, 0.08021, 0.01067, 0.00035, 0.00007, 0.01639, 0.00323, 0.00100, 0.00009
>> 16, 0.26911, 0.01602, 0.00038, 0.00007, 0.03164, 0.00462, 0.00101, 0.00009
>>
>> ** Results with objects_per_txn=10000 (REALLY 334) **
>> 1, 0.16153, 0.40147, 0.02417, 0.00042, 0.11959, 0.10012, 0.05048, 0.00045
>> 2, 0.18652, 0.39361, 0.02055, 0.00044, 0.12947, 0.10604, 0.08080, 0.00047
>> 4, 0.33065, 0.84091, 0.02331, 0.00050, 0.25859, 0.21675, 0.13139, 0.00052
>> 8, 0.67337, 1.46541, 0.02905, 0.00069, 0.49674, 0.42905, 0.44064, 0.00063
>> 16, 1.46586, 3.67101, 0.03427, 0.00097, 0.99446, 1.06484, 1.16689, 0.00078
>>
>> Here the steamin times are very similar for ZEO and MySQLAdapter,
>> although the ZEO+FS times are a bit lower.  Note however, that using a
>> poll interval of 2 may cause excessive conflict errors, especially if
>> there are relatively hot objects that get updated a lot.
>>
>> In our deployments, the clients are on separate machines and generally
>> don't compete with each other or with the server for CPU resources.
>> The tables below show results with clients running on a separate 8-core
>> 2.33GHz Xeon (dual quad core) machine with 24G of memory and running
>> CentOS 4.7.  There were plenty of CPU resources for the clients, so they
>> never came close to using all of the available CPU resources.
>>
>> Remote clients, poll interval 2
>> ===============================
>>
>> ** Results with objects_per_txn=1 **
>> 1, 0.03733, 0.00207, 0.00015, 0.00007, 0.01905, 0.00240, 0.00141, 0.00008
>> 2, 0.01772, 0.00233, 0.00015, 0.00007, 0.01962, 0.00240, 0.00147, 0.00008
>> 4, 0.06634, 0.00236, 0.00015, 0.00007, 0.03471, 0.00262, 0.00162, 0.00008
>> 8, 0.08080, 0.00364, 0.00016, 0.00007, 0.06410, 0.00287, 0.00164, 0.00008
>> 16, 0.09270, 0.00440, 0.00016, 0.00007, 0.13171, 0.00316, 0.00174, 0.00009
>>
>> ** Results with objects_per_txn=100 (REALLY 4) **
>> 1, 0.01809, 0.00683, 0.00034, 0.00007, 0.02432, 0.00597, 0.00480, 0.00008
>> 2, 0.02210, 0.00816, 0.00034, 0.00007, 0.02873, 0.00645, 0.00513, 0.00008
>> 4, 0.07079, 0.00991, 0.00036, 0.00007, 0.03521, 0.00655, 0.00520, 0.00009
>> 8, 0.08739, 0.01388, 0.00035, 0.00007, 0.06754, 0.00706, 0.00557, 0.00009
>> 16, 0.09264, 0.01376, 0.00035, 0.00007, 0.13904, 0.00777, 0.00593, 0.00010
>>
>> ** Results with objects_per_txn=10000 (REALLY 334) **
>> 1, 0.17738, 0.57640, 0.01969, 0.00038, 0.61835, 0.47054, 0.39015, 0.00041
>> 2, 0.20881, 0.67896, 0.01973, 0.00038, 0.65081, 0.45832, 0.39691, 0.00043
>> 4, 0.28996, 0.92163, 0.01993, 0.00038, 0.70280, 0.47962, 0.41136, 0.00044
>> 8, 0.41571, 1.25167, 0.02008, 0.00040, 0.81672, 0.50079, 0.50144, 0.00045
>> 16, 0.60316, 1.54352, 0.02033, 0.00039, 1.23906, 0.60130, 0.68200, 0.00049
>>
>>
>> Some things to note:
>>
>> - For smaller transaction sizes, ZEO+FS and MySQLAdapter write times
>>   are pretty close; however, at higher levels of concurrency or for
>>   large transaction sizes, ZEO+FS outperforms MySQLAdapter on writes.
>>
>> - For smaller transaction sizes, ZEO+FS and MySQLAdapter cold read
>>   times are pretty close. Even for larger transaction sizes, the cold
>>   read times are pretty close, except at the highest concurrency
>>   level.  I think what's happening for high concurrency and large
>>   transaction sizes is that ZEO has reached maximum throughput and the
>>   MySQLAdapter still has some breathing room.
>>
>> - The hot times are more than an order of magnitude better for
>>   ZEO+FS.
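
A quick way to see the cold-read divergence at high concurrency is to compare how much slower each setup gets going from 1 to 16 clients. The `slowdown` helper is a hypothetical name; the times are copied from the remote-client table for objects_per_txn=10000 (really 334). This only restates the table and does not by itself show where the bottleneck is.

```python
def slowdown(time_low_concurrency, time_high_concurrency):
    """How many times slower a read is at high vs. low concurrency."""
    return time_high_concurrency / time_low_concurrency


# Cold read times, remote clients, poll interval 2,
# objects_per_txn=10000 (really 334), at concurrency 1 and 16.
cold = {
    "ZEO+FS": {1: 0.57640, 16: 1.54352},
    "MySQLAdapter": {1: 0.47054, 16: 0.60130},
}

for name, times in cold.items():
    print(name, round(slowdown(times[1], times[16]), 2))
```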
>>
>> These benchmarks make ZEO+FS look pretty good relative to
>> MySQLAdapter.  The overall performance assuming even moderately
>> effective ZEO or object caches is significantly better for ZEO.
>> Keep in mind, however, that these benchmarks don't take
>> disk access on the server into account for reads, because there isn't
>> any.  In practice, I'd expect server disk access times to dominate
>> cold read times.  For example, in a separate benchmark with far more
>> realistic access patterns against a large database, object load times
>> are an order of magnitude greater than what you'd see if the data
>> being read was all in RAM.
>>
>> Jim
>>
>> --
>> Jim Fulton
>> _______________________________________________
>> For more information about ZODB, see the ZODB Wiki:
>> http://www.zope.org/Wikis/ZODB/
>>
>> ZODB-Dev mailing list  -  ZODB-Dev at zope.org
>> https://mail.zope.org/mailman/listinfo/zodb-dev
>>

