[Checkins] SVN: zodbshootout/trunk/ Split the write test into add and update tests

Shane Hathaway shane at hathawaymix.org
Tue Nov 17 15:21:28 EST 2009


Log message for revision 105768:
  Split the write test into add and update tests
  

Changed:
  U   zodbshootout/trunk/README.txt
  U   zodbshootout/trunk/etc/sample.conf
  U   zodbshootout/trunk/src/zodbshootout/main.py

-=-
Modified: zodbshootout/trunk/README.txt
===================================================================
--- zodbshootout/trunk/README.txt	2009-11-17 20:21:25 UTC (rev 105767)
+++ zodbshootout/trunk/README.txt	2009-11-17 20:21:28 UTC (rev 105768)
@@ -1,9 +1,7 @@
-zodbshootout
-------------
 
 This application measures and compares the performance of various
 ZODB storages and configurations. It is derived from the RelStorage
-speedtest script, but this version allows more types of storages and
+speedtest script, but this version allows arbitrary storage types and
 configurations, provides more measurements, and produces numbers that
 are easier to interpret.
 
@@ -11,9 +9,19 @@
 started is to follow the directions below to set up a complete testing
 environment with sample tests.
 
-How to set up ``zodbshootout`` using Buildout
----------------------------------------------
+.. contents::
 
+Known Issues
+------------
+
+This application seems to freeze with Python versions before 2.6, most
+likely due to an issue in the backported version of the
+``multiprocessing`` module. Assistance in finding a resolution would be
+greatly appreciated.
+
+Installing ``zodbshootout`` using Buildout
+------------------------------------------
+
 First, be sure you have certain packages installed so you can compile
 software. Ubuntu and Debian users should do this (tested with Ubuntu
 8.04, Ubuntu 9.10, Debian Etch, and Debian Lenny)::
@@ -88,10 +96,12 @@
 configuration file. The configuration file contains a list of databases
 to test, in ZConfig format. The script deletes all data from each of
 the databases, then writes and reads the databases while taking
-measurements. Finally, the script produces a tabular summary of
-objects written or read per second in each configuration.
+measurements. Finally, the script produces a tabular summary of objects
+written or read per second in each configuration. ``zodbshootout`` uses
+the names of the databases defined in the configuration file as the
+table column names.
 
-**Repeated Warning**: ``zodbshootout`` deletes all data from all
+**Warning**: Again, ``zodbshootout`` **deletes all data** from all
 databases specified in the configuration file. Do not configure it to
 open production databases!
 
@@ -113,7 +123,7 @@
 * ``-p`` (``--profile``) enables the Python profiler while running the
   tests and outputs a profile for each test in the specified directory.
   Note that the profiler typically reduces the database speed by a lot.
-  This option is intended to help developers discover performance
+  This option is intended to help developers isolate performance
   bottlenecks.
 
 You should write a configuration file that models your intended
@@ -128,32 +138,46 @@
 ``etc/sample.conf`` on a dual core, 2.1 GHz laptop::
 
     "Transaction",                postgresql, mysql,   mysql_mc, zeo_fs
-    "Write 1000 Objects",               6346,    9441,     8229,    4965
-    "Read 1000 Warm Objects",           5091,    6224,    21603,    1950
-    "Read 1000 Cold Objects",           5030,   10867,     5224,    1932
-    "Read 1000 Hot Objects",           36440,   38322,    38197,   38166
-    "Read 1000 Steamin' Objects",    4773034, 3909861,  3490163, 4254936
+    "Add 1000 Objects",                 6529,   10027,     9248,    5212
+    "Update 1000 Objects",              6754,    9012,     8064,    4393
+    "Read 1000 Warm Objects",           4969,    6147,    21683,    1960
+    "Read 1000 Cold Objects",           5041,   10554,     5095,    1920
+    "Read 1000 Hot Objects",           38132,   37286,    37826,   37723
+    "Read 1000 Steamin' Objects",    4591465, 4366792,  3339414, 4534382
 
-``zodbshootout`` runs five kinds of tests for each database. For each
+``zodbshootout`` runs six kinds of tests for each database. For each
 test, ``zodbshootout`` instructs all processes to perform similar
 transactions concurrently, computes the average duration of the
 concurrent transactions, takes the fastest timing of three test runs,
 and derives how many objects per second the database is capable of
 writing or reading under the given conditions.
 
-* Write objects
+``zodbshootout`` runs these tests:
 
+* Add objects
+
     ``zodbshootout`` begins a transaction, adds the specified number of
     persistent objects to a ``PersistentMapping``, and commits the
-    transaction. In the sample output above, MySQL was able to write
-    9441 objects per second to the database, almost twice as fast as
-    ZEO. With memcached support enabled, write performance took a small
-    hit due to the time spent storing objects in memcached.
+    transaction. In the sample output above, MySQL was able to add
+    10027 objects per second to the database, almost twice as fast as
+    ZEO, which was limited to 5212 objects per second. Also, with
+    memcached support enabled, MySQL write performance took a small hit
+    due to the time spent storing objects in memcached.
 
+* Update objects
+
+    In the same process, without clearing any caches, ``zodbshootout``
+    makes a simple change to each of the objects just added and commits
+    the transaction.  The sample output above shows that MySQL and ZEO
+    typically take a little longer to update objects than to add new
+    objects, while PostgreSQL is faster at updating objects in this case.
+    The sample configuration tests only history-preserving databases;
+    you may see different results with history-free databases.
+
 * Read warm objects
 
     In a different process, without clearing any caches,
-    ``zodbshootout`` reads all of the objects just written. This test
+    ``zodbshootout`` reads all of the objects just added. This test
     favors databases that use either a persistent cache or a cache
     shared by multiple processes (such as memcached). In the sample
     output above, this test with MySQL and memcached runs more than ten
@@ -166,32 +190,25 @@
     In the same process as was used for reading warm objects,
     ``zodbshootout`` clears all ZODB caches (the pickle cache, the ZEO
     cache, and/or memcached) then reads all of the objects written by
-    the write test. This test favors databases that read objects
+    the update test. This test favors databases that read objects
     quickly, independently of caching. In the sample output above,
     MySQL cheats a little because it uses a query cache.
 
 * Read hot objects
 
     In the same process as was used for reading cold objects,
-    ``zodbshootout`` clears only the in-memory ZODB caches (the pickle
-    cache) then reads all of the objects written by the write test.
-    This test favors databases that have a process-specific cache. In
-    the sample output above, all of the databases have that type of
-    cache.
+    ``zodbshootout`` clears the in-memory ZODB caches (the pickle
+    cache), but leaves the other caches intact, then reads all of the
+    objects written by the update test. This test favors databases that
+    have a process-specific cache. In the sample output above, all of
+    the databases have that type of cache.
 
 * Read steamin' objects
 
     In the same process as was used for reading hot objects,
     ``zodbshootout`` once again reads all of the objects written by the
-    write test. This test favors databases that take advantage of the
+    update test. This test favors databases that take advantage of the
     ZODB pickle cache. As can be seen from the sample output above,
-    accessing an object from the ZODB pickle cache is much faster than
-    any operation that requires network access or unpickling.
-
-Known Issues
-------------
-
-This application seems to freeze with Python versions before 2.6, most
-likely due to some issue connected with the backported version of the
-``multiprocessing`` module. Assistance in finding a resolution would be
-greatly appreciated.
+    accessing an object from the ZODB pickle cache is around 100
+    times faster than any operation that requires network access or
+    unpickling.

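For readers skimming the README changes, the add and update phases each
boil down to one small ZODB transaction. The following is a minimal,
illustrative sketch only, not the test harness itself: the ``PObject``
class and the in-memory ``MappingStorage`` are stand-ins for the real
test objects and the configured databases::

    import time
    import transaction
    from persistent import Persistent
    from persistent.mapping import PersistentMapping
    from ZODB.DB import DB
    from ZODB.MappingStorage import MappingStorage

    class PObject(Persistent):
        """Stand-in persistent object; the real test objects differ."""
        attr = 0

    db = DB(MappingStorage())
    conn = db.open()
    root = conn.root()

    # Add phase: create 1000 objects and commit them in one transaction.
    root['speedtest'] = mapping = PersistentMapping()
    start = time.time()
    for i in xrange(1000):
        mapping[i] = PObject()
    transaction.commit()
    print 'add: %.4fs' % (time.time() - start)

    # Update phase: make a trivial change to each object; commit again.
    start = time.time()
    for obj in mapping.itervalues():
        obj.attr = 1
    transaction.commit()
    print 'update: %.4fs' % (time.time() - start)

    conn.close()
    db.close()
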
Modified: zodbshootout/trunk/etc/sample.conf
===================================================================
--- zodbshootout/trunk/etc/sample.conf	2009-11-17 20:21:25 UTC (rev 105767)
+++ zodbshootout/trunk/etc/sample.conf	2009-11-17 20:21:28 UTC (rev 105768)
@@ -1,6 +1,8 @@
 
-# This configuration compares the performance of local databases
-# based on PostgreSQL, MySQL (with and without memcached), and ZEO.
+# This configuration compares the performance of local databases based
+# on PostgreSQL, MySQL (with and without memcached), and ZEO. It only
+# compares history-preserving storages; history-free storages are
+# typically faster.
 
 %import relstorage
 

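For context, each database in the configuration file is a named
``<zodb>`` section wrapping a storage configuration, following the
standard ZODB and RelStorage ZConfig components. The fragment below is
an illustration rather than the actual contents of ``etc/sample.conf``;
the database names, DSN, and ZEO address are placeholders::

    %import relstorage

    <zodb postgresql>
      <relstorage>
        <postgresql>
          # Placeholder DSN; point it at a scratch database only,
          # since zodbshootout deletes all data it finds.
          dsn dbname='shootout_test'
        </postgresql>
      </relstorage>
    </zodb>

    <zodb zeo_fs>
      <zeoclient>
        # Address of a ZEO server dedicated to testing.
        server localhost:24003
      </zeoclient>
    </zodb>
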
Modified: zodbshootout/trunk/src/zodbshootout/main.py
===================================================================
--- zodbshootout/trunk/src/zodbshootout/main.py	2009-11-17 20:21:25 UTC (rev 105767)
+++ zodbshootout/trunk/src/zodbshootout/main.py	2009-11-17 20:21:28 UTC (rev 105768)
@@ -85,7 +85,7 @@
     def write_test(self, db_factory, n, sync):
         db = db_factory()
 
-        def do_write():
+        def do_add():
             start = time.time()
             conn = db.open()
             root = conn.root()
@@ -98,11 +98,25 @@
 
         db.open().close()
         sync()
-        t = self._execute(do_write, 'write', n)
+        add_time = self._execute(do_add, 'add', n)
 
+        def do_update():
+            start = time.time()
+            conn = db.open()
+            root = conn.root()
+            for obj in root['speedtest'][n].itervalues():
+                obj.attr = 1
+            transaction.commit()
+            conn.close()
+            end = time.time()
+            return end - start
+
+        sync()
+        update_time = self._execute(do_update, 'update', n)
+
         time.sleep(.1)
         db.close()
-        return t
+        return add_time, update_time
 
     def read_test(self, db_factory, n, sync):
         db = db_factory()
@@ -188,12 +202,17 @@
         r = range(self.concurrency)
         write_times = distribute(write, r)
         read_times = distribute(read, r)
+
+        add_times = [t[0] for t in write_times]
+        update_times = [t[1] for t in write_times]
         warm_times = [t[0] for t in read_times]
         cold_times = [t[1] for t in read_times]
         hot_times = [t[2] for t in read_times]
         steamin_times = [t[3] for t in read_times]
+
         return (
-            sum(write_times) / self.concurrency,
+            sum(add_times) / self.concurrency,
+            sum(update_times) / self.concurrency,
             sum(warm_times) / self.concurrency,
             sum(cold_times) / self.concurrency,
             sum(hot_times) / self.concurrency,
@@ -265,12 +284,21 @@
     config, handler = ZConfig.loadConfig(schema, conf_fn)
     contenders = [(db.name, db) for db in config.databases]
 
+    txn_descs = (
+        "Add %d Objects",
+        "Update %d Objects",
+        "Read %d Warm Objects",
+        "Read %d Cold Objects",
+        "Read %d Hot Objects",
+        "Read %d Steamin' Objects",
+        )
+
     # results: {(objects_per_txn, concurrency, contender, phase): [time]}
     results = {}
     for objects_per_txn in object_counts:
         for concurrency in concurrency_levels:
             for contender_name, db in contenders:
-                for phase in range(5):
+                for phase in range(len(txn_descs)):
                     key = (objects_per_txn, concurrency,
                             contender_name, phase)
                     results[key] = []
@@ -301,11 +329,12 @@
                             else:
                                 break
                         msg = (
-                            'write %6.4fs, warm %6.4fs, cold %6.4fs, '
+                            'add %6.4fs, update %6.4fs, '
+                            'warm %6.4fs, cold %6.4fs, '
                             'hot %6.4fs, steamin %6.4fs'
                             % times)
                         print >> sys.stderr, msg
-                        for i in range(5):
+                        for i in range(len(txn_descs)):
                             results[key + (i,)].append(times[i])
 
     # The finally clause causes test results to print even if the tests
@@ -318,14 +347,6 @@
             'Results show objects written or read per second. '
             'Best of 3.')
 
-        txn_descs = (
-            "Write %d Objects",
-            "Read %d Warm Objects",
-            "Read %d Cold Objects",
-            "Read %d Hot Objects",
-            "Read %d Steamin' Objects",
-            )
-
         for concurrency in concurrency_levels:
             print
             print '** concurrency=%d **' % concurrency
@@ -336,7 +357,7 @@
                 row.append(contender_name)
             rows.append(row)
 
-            for phase in range(5):
+            for phase in range(len(txn_descs)):
                 for objects_per_txn in object_counts:
                     desc = txn_descs[phase] % objects_per_txn
                     if objects_per_txn == 1:

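A small note on the aggregation change above: ``write_test`` now
returns an ``(add_time, update_time)`` tuple per worker process, and
``speedtest`` splits those tuples with list comprehensions. The snippet
below just illustrates that unpacking with invented timings;
``zip(*...)`` is an equivalent idiom::

    # One (add_time, update_time) tuple per concurrent worker;
    # the timings here are made up for illustration.
    write_times = [(0.81, 0.93), (0.74, 0.88), (0.79, 1.02)]

    add_times = [t[0] for t in write_times]
    update_times = [t[1] for t in write_times]
    # add_times, update_times = zip(*write_times)  # equivalent

    concurrency = len(write_times)
    print 'avg add %.4fs, avg update %.4fs' % (
        sum(add_times) / concurrency, sum(update_times) / concurrency)
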

