[ZODB-Dev] write performance

Nicholas Henke henken@seas.upenn.edu
Mon, 3 Feb 2003 18:35:08 +0000


Hey guys --
	I have a ZODB app that I am running with default cache sizes, etc., using
FileStorage, Python 2.2.2, and ZODB 3.1 on Red Hat 7.2.

I have a set of applications that are running our cluster management
suite, and am trying to get more performance out of it. FYI, I am using
a unix domain socket to get every bit of socket bandwidth I can.
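
One way to check whether the Unix domain socket actually buys anything on a given machine is to time small request/response round trips over both transports. The following is only a rough sketch, written for a current Python 3 rather than the Python 2.2 used above. It is not ZEO code: the helper names (recv_exact, tcp_pair, unix_pair, round_trip_seconds), the 64-byte payload, and the round count are all made up for illustration, and a single-process ping-pong says nothing about ZEO's own protocol overhead.

```python
import socket
import time

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf += chunk
    return buf

def tcp_pair():
    """Return a connected pair of TCP sockets over the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    for s in (client, server):
        # Disable Nagle so small messages are not delayed.
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return client, server

def unix_pair():
    """Return a connected pair of Unix domain stream sockets."""
    return socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

def round_trip_seconds(pair, rounds=2000, size=64):
    """Average time for one small message to go a -> b -> a."""
    a, b = pair
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        a.sendall(payload)
        b.sendall(recv_exact(b, size))   # echo the message back
        recv_exact(a, size)
    elapsed = time.perf_counter() - start
    a.close()
    b.close()
    return elapsed / rounds

unix_rtt = round_trip_seconds(unix_pair())
tcp_rtt = round_trip_seconds(tcp_pair())
print("unix domain: %.1f us per round trip" % (unix_rtt * 1e6))
print("tcp loopback: %.1f us per round trip" % (tcp_rtt * 1e6))
```

The numbers vary a lot between kernels and machines, so the two figures are only meaningful relative to each other on the same host.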

I have enabled tracing and am collecting a trace file from each of
the clients that talk to the database. One client mostly writes: it
reads in data from a source and then sets a ton of attributes on
objects. The other client reads this data and dumps it out to a
scheduler. I have a few questions:

1) Does DirectoryStorage or dbStorage have better write performance than
FileStorage?
2) How does one enable the asyncore support in a ZEO client? I have
heard that if the client uses asyncore in some way, the invalidation
messages get handled better/faster.
3) Are Unix domain sockets really faster than normal sockets when
connections are limited to the localhost?
4) What can I do to increase the write performance?
5) Would a persistent on-disk cache help?
6) Is there anything screwy with the following stats.py + simul.py data from the 'writing' client?
[root@testcluster ZEO]# python2.2 stats.py -s -h /tmp/zeo.trace
Feb  3 18:04:05 ==================== Restart ====================
Feb  3 18:04-11          3 loads,          2 hits, 66.7% hit rate
Feb  3 18:11:48 ==================== Restart ====================
Feb  3 18:11-13        237 loads,          2 hits,  0.8% hit rate
Feb  3 18:13:53 ==================== Restart ====================
Feb  3 18:13-14        255 loads,          2 hits,  0.8% hit rate
Feb  3 18:15-20         22 loads,          8 hits, 36.4% hit rate
Feb  3 18:20:43 ==================== Restart ====================
Feb  3 18:20-28        260 loads,          2 hits,  0.8% hit rate

Read 12,522 records (300,528 bytes) in 0.5 seconds
Versions:   0 records used a version
First time: Mon Feb  3 18:04:05 2003
Last time:  Mon Feb  3 18:28:56 2003
Duration:   1,491 seconds
File stats: 12,522 in file 0; 0 in file 1
Data recs:  11,757 (93.9%), average size 0.4 KB
Hit rate:   2.1% (load hits / loads)

        Count Code Function (action)
            4  00  _setup_trace (initialization)
          761  20  load (miss)
           16  2a  load (hit, returning non-version data)
       10,981  3a  update
          760  5a  store (non-version data present)

Histogram of object load frequency
Unique oids: 546
Total loads: 777
loads objects   %obj  %load   %cum
    1     425  77.8%  54.7%  54.7%
    2      26   4.8%   6.7%  61.4%
    3      88  16.1%  34.0%  95.4%
    4       6   1.1%   3.1%  98.5%
   12       1   0.2%   1.5% 100.0%

Histograms of object sizes


Unique sizes written: 7
      size   objs writes
       256   4580   9018
       512     23     61
       768     41     64
     1,024     65   1546
     1,280     45    243
     1,536      5     47
     1,792      1      2

Unique sizes loaded: 5
      size   objs  loads
       256      6      7
       512      1      6
       768      1      1
     1,024      1      1
     1,280      1      1
[root@testcluster ZEO]# python2.2 simul.py /tmp/zeo.trace
ZEOCacheSimulation, cache size 20,000,000 bytes
  START TIME  DURATION    LOADS     HITS INVALS WRITES  FLIPS HITRATE
Feb  3 18:04      7:43        2        2      0   2940      0 100.0%
Feb  3 18:11      2:05      237        2      0   1135      0   0.8%
Feb  3 18:13      6:50      277       10      0   3500      0   3.6%
Feb  3 18:20      8:13      260        2      0   3406      0   0.8%
Feb  3 18:04     24:51      776       16      0  10981      0   2.1% OVERALL

7) What other information can I provide to help in answering my questions?
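
For anyone cross-checking the trace summary, the overall hit rate and the write-heavy profile can be recomputed from the "Count Code Function" table alone. This is a minimal sketch: the counts are copied from the stats.py output above, while the dictionary keys and the hit_rate helper are names invented here, and reading the "update" records as writes is an assumption based on the WRITES column of simul.py showing the same total.

```python
# Counts copied from the "Count Code Function" table in the stats.py output.
counts = {
    "load_miss": 761,    # code 20
    "load_hit": 16,      # code 2a
    "update": 10981,     # code 3a
    "store": 760,        # code 5a
}

def hit_rate(hits, misses):
    """Percentage of loads that were served from the cache."""
    loads = hits + misses
    return 100.0 * hits / loads if loads else 0.0

total_loads = counts["load_hit"] + counts["load_miss"]
rate = hit_rate(counts["load_hit"], counts["load_miss"])
updates_per_load = counts["update"] / total_loads

print("total loads: %d" % total_loads)
print("hit rate: %.1f%%" % rate)
print("update records per load: %.1f" % updates_per_load)
```

The totals agree with the "Total loads: 777" and "Hit rate: 2.1%" lines in the stats.py output, and the ratio of update records to loads shows how heavily this client leans toward writing.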

Thanks!
Nic
-- 
Nicholas Henke
Linux Cluster Systems Programmer
Liniac Project - Univ. of Pennsylvania