[Zope-Checkins] CVS: ZODB3/ZEO - stats.py:1.17.6.1.18.2

Tim Peters cvs-admin at zope.org
Mon Dec 1 20:50:31 EST 2003


Update of /cvs-repository/ZODB3/ZEO
In directory cvs.zope.org:/tmp/cvs-serv5103/ZEO

Modified Files:
      Tag: Zope-2_6-branch
	stats.py 
Log Message:
Changed the report of temporal distance to group by the base-2 log of the
distance.  I find this much easier to comprehend.  Something looks fishy,
though.  For the zope.org-main trace, this produces:

Histogram of temporal distance
Dist = floor(log2(distance))
Total oids with 1 load: 20,689
Total oids with 2 or more loads: 66,890
 Dist   Count Percent
    0   47041   4.32
    1    2638   0.24
    2    4196   0.39
    3    6340   0.58
    4   12070   1.11
    5   13633   1.25
    6   14457   1.33
    7   11150   1.02
    8   13920   1.28
    9   17328   1.59
   10   33934   3.11
   11   48471   4.45
   12   88699   8.14
   13   92673   8.50
   14  116006  10.64
   15  339572  31.16
   16  133733  12.27
   17   63074   5.79
   18   24369   2.24
   19    6471   0.59
   20      48   0.00

So the mode is really a temporal distance of at least 32K (dist=15 covers
32768 through 65535).  The dist=0 line is baffling.  This corresponds to
true distances of exactly 0 and exactly 1.
I don't understand how a true distance of 0 is possible, but it does
occur (I had to stick in a hack to prevent log(distance) from blowing
up when distance==0).  And I don't understand how a true distance of 1
can happen except once in a blue moon (this means the client requested
the same object in two adjacent loads, right?  Then why wasn't the
second request satisfied from the client's in-memory cache?).  So I'm not
sure we're computing temporal distance in the sense that I understand it.
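
For readers who want to try the bucketing outside stats.py, here is a minimal
sketch of the grouping described above.  It is not the checked-in code: it is
written for Python 3, the names lg_bucket and distance_histogram are
hypothetical, and the shape of refs (oid mapped to an ascending list of load
positions) and the sample data are assumptions made for illustration.  It uses
int.bit_length() in place of log(distance) / log(2), which gives the same
floor(log2) bucket using integer arithmetic only, and it folds a distance of 0
into bucket 0 the way the checkin's "distance or 1" does.

```python
from collections import Counter


def lg_bucket(distance):
    """Return floor(log2(distance)), treating anything below 1 as 1."""
    # bit_length() - 1 is floor(log2(n)) for any integer n >= 1.
    return max(distance, 1).bit_length() - 1


def distance_histogram(refs):
    """Count gaps between successive loads of each oid, grouped by bucket.

    refs is assumed to map oid -> ascending list of load positions.
    """
    hist = Counter()
    for loads in refs.values():
        # An oid with a single load has no successive pair, so it adds nothing.
        for prev, cur in zip(loads, loads[1:]):
            hist[lg_bucket(cur - prev)] += 1
    return hist


# Made-up sample data, not taken from any real trace.
refs = {"a": [0, 1, 1, 9], "b": [5], "c": [2, 40000]}
hist = distance_histogram(refs)

for bucket in sorted(hist):
    low, high = 2 ** bucket, 2 ** (bucket + 1) - 1
    # Bucket 0 also absorbs a true distance of 0, not just 1.
    print("%5d %7d   (distances %d..%d)" % (bucket, hist[bucket], low, high))
```

Printing the range next to each bucket makes it easier to see that dist=15
means 32768 through 65535, and that dist=0 mixes true distances of 0 and 1.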


=== ZODB3/ZEO/stats.py 1.17.6.1.18.1 => 1.17.6.1.18.2 ===
--- ZODB3/ZEO/stats.py:1.17.6.1.18.1	Wed Nov 26 12:49:08 2003
+++ ZODB3/ZEO/stats.py	Mon Dec  1 20:50:30 2003
@@ -287,44 +287,36 @@
 
     # Compute temporal distance and display histogram
     if print_distance:
+        from math import log
+        LOG2 = log(2)
         dist = {}
         repeats = 0
         rest = 0
-        for oid, L in refs.items():
+        for oid, L in refs.iteritems():
             if len(L) < 2:
                 rest += 1
                 continue
             repeats += 1
-            distances = [v - L[i-1] for i, v in enumerate(L) if i >= 1]
-            for d in distances:
-                dist[d] = dist.get(d, 0) + 1
+            for i in xrange(1, len(L)):
+                distance = L[i] - L[i-1]
+                # XXX How can we get a distance of 0?  We do.
+                lg_distance = int(log(distance or 1) / LOG2)
+                dist[lg_distance] = dist.get(lg_distance, 0) + 1
 
-        binsize = 100
-        bins = dict.fromkeys(range(0, max(dist), binsize), 0)
-        for d, count in dist.items():
-            bins[d / binsize * binsize] += count
-        L = bins.items()
+        L = dist.items()
         L.sort()
-            
+
         print
         print "Histogram of temporal distance"
+        print "Dist = floor(log2(distance))"
         all = sum(dist.values())
         print "Total oids with 1 load: %s" % addcommas(rest)
         print "Total oids with 2 or more loads: %s" % addcommas(repeats)
         headers = ["Dist", "Count", "Percent"]
-        fmt = "%5s %7s %3s"
+        fmt = "%5s %7s %6s"
         print fmt % tuple(headers)
-        dots = False
-        for i, (dist, count) in enumerate(L):
-            if not count and dots:
-                continue
-            if not (count or L[i+1][1]):
-                if not dots:
-                    print "..."
-                dots = True
-                continue
-            dots = False
-            print fmt % (dist, count, 100 * count / all)
+        for dist, count in L:
+            print fmt % (dist, count, '%6.2f' % (100.0 * count / all))
 
 def dumpbysize(bysize, how, how2):
     print



