[ZODB-Dev] Latest profile results

Greg Ward gward@mems-exchange.org
Mon, 16 Dec 2002 14:27:54 -0500


Hi all --

I've been doing some more profiling of ZODB FileStorage open/close
today.  Since I was playing with this stuff last week, a few things have
changed:

  * I cvs up'd on the ZODB 3.1 branch and on the trunk

  * I upgraded from a slightly-patched Linux 2.4.20-rc1 kernel
    to a slightly-more-patched 2.4.20 (specifically, I'm now
    using Robert Love's preemptible kernel patch)

  * I rearranged my hard disk partitions so the swap space is now
    the first partition rather than the second (the ZODB database is
    still on the last [outermost] partition though)

  * I hacked up my profiling script so that "open/close without index
    file" and "open/close with index file" are now done in separate
    runs

and, most importantly,

  * I added 512 MB of RAM to my PC, for a total of 768 MB (wow, that's
    more than 1200x as much as what Bill Gates once said "should be
    enough memory for anyone")

In short, opening a FileStorage with an existing and up-to-date index
file got a *lot* faster:

open...                   7.1 elapsed      6.6 cpu
close...                  2.7 elapsed      2.4 cpu

(again, those are eyeball averages of three runs).

Last week, open-with-index was taking around 24 seconds (22 sec
user-space CPU) with ZODB 3.1.  Jeremy, did you change anything?  Or can
this tremendous speedup be accounted for by my memory upgrade?

I don't really understand how -- last week I had 256 MB, which was
plenty for the ~30 MB process size that resulted from just opening this
database.  I also did the whole cycle many many times in succession last
week, so anything else that didn't fit in physical RAM should have been
swapped out by the 2nd or 3rd run.  I wasn't experiencing any noticeable
swapping just from opening the database; it was the big
walk-the-whole-database operations that sucked up 100-200 MB of RAM and
really hurt.  I'm sure the kernel's much larger disk buffers help, but I
doubt they help that much -- after all, the amount of CPU time consumed
was also reduced by a factor of 3.  (I'm quoting user-space CPU time,
not kernel-space.)

Wild theories, anyone?

Anyways, the open-without-index case is still pretty slow -- around 31
sec elapsed, 28 sec CPU.  So I did by-line profiling with hotshot.  No
big surprises -- most of the time is taken up by the innermost loop in
read_index().  Here are the results; first column is number of hits for
that line, second is number of microseconds consumed by that line:

  32416      78441 /tmp/zodb/3.1br/ZODB/FileStorage.py:2032, in read_index()
  32415    1190145 /tmp/zodb/3.1br/ZODB/FileStorage.py:2034, in read_index()
  32416    1043244 /tmp/zodb/3.1br/ZODB/FileStorage.py:2035, in read_index()
  32414      77371 /tmp/zodb/3.1br/ZODB/FileStorage.py:2036, in read_index()
  32414     488529 /tmp/zodb/3.1br/ZODB/FileStorage.py:2043, in read_index()
  32414      46337 /tmp/zodb/3.1br/ZODB/FileStorage.py:2044, in read_index()
  32414      59651 /tmp/zodb/3.1br/ZODB/FileStorage.py:2046, in read_index()
  32414      44806 /tmp/zodb/3.1br/ZODB/FileStorage.py:2048, in read_index()
  32414     235931 /tmp/zodb/3.1br/ZODB/FileStorage.py:2052, in read_index()
  32414      61046 /tmp/zodb/3.1br/ZODB/FileStorage.py:2062, in read_index()
  32414     161951 /tmp/zodb/3.1br/ZODB/FileStorage.py:2065, in read_index()
  32414      53376 /tmp/zodb/3.1br/ZODB/FileStorage.py:2088, in read_index()
  32414      46404 /tmp/zodb/3.1br/ZODB/FileStorage.py:2090, in read_index()
  32414      90172 /tmp/zodb/3.1br/ZODB/FileStorage.py:2091, in read_index()
  32414      64639 /tmp/zodb/3.1br/ZODB/FileStorage.py:2093, in read_index()
  32414     136559 /tmp/zodb/3.1br/ZODB/FileStorage.py:2104, in read_index()
 415566     648063 /tmp/zodb/3.1br/ZODB/FileStorage.py:2105, in read_index()
 350738    5397076 /tmp/zodb/3.1br/ZODB/FileStorage.py:2108, in read_index()
 350738    3193149 /tmp/zodb/3.1br/ZODB/FileStorage.py:2109, in read_index()
 350738   12146180 /tmp/zodb/3.1br/ZODB/FileStorage.py:2110, in read_index()
 350738    3647359 /tmp/zodb/3.1br/ZODB/FileStorage.py:2115, in read_index()
 350738    1055504 /tmp/zodb/3.1br/ZODB/FileStorage.py:2116, in read_index()
 350738    1370279 /tmp/zodb/3.1br/ZODB/FileStorage.py:2118, in read_index()
 350738    3494494 /tmp/zodb/3.1br/ZODB/FileStorage.py:2129, in read_index()
 350738    2805943 /tmp/zodb/3.1br/ZODB/FileStorage.py:2134, in read_index()
 350738     819731 /tmp/zodb/3.1br/ZODB/FileStorage.py:2141, in read_index()
  32414      45977 /tmp/zodb/3.1br/ZODB/FileStorage.py:2143, in read_index()
  32414     221068 /tmp/zodb/3.1br/ZODB/FileStorage.py:2148, in read_index()
  32414     183349 /tmp/zodb/3.1br/ZODB/FileStorage.py:2149, in read_index()
  32414      58034 /tmp/zodb/3.1br/ZODB/FileStorage.py:2150, in read_index()
  32414     139661 /tmp/zodb/3.1br/ZODB/FileStorage.py:2154, in read_index()
  32414    1052882 /tmp/zodb/3.1br/ZODB/FileStorage.py:2156, in read_index()
  32414    2584853 /tmp/zodb/3.1br/ZODB/FileStorage.py:2157, in read_index()
  32414     218514 /tmp/zodb/3.1br/ZODB/FileStorage.py:2158, in read_index()
  32414    1703439 /tmp/zodb/3.1br/ZODB/FileStorage.py:2159, in read_index()
  32414     186190 /tmp/zodb/3.1br/ZODB/FileStorage.py:2160, in read_index()
3253886   11935864 /tmp/zodb/3.1br/ZODB/utils.py:56, in U64()
1084629    1895656 /tmp/zodb/3.1br/ZODB/utils.py:57, in U64()
1084629   19208257 /tmp/zodb/3.1br/ZODB/utils.py:58, in U64()
1084629    3349564 /tmp/zodb/3.1br/ZODB/utils.py:59, in U64()

(No other lines of code took more than 200 usec.)

BTW, If it looks to you like those numbers add up to a lot more than 28
sec CPU, you're right.  Part of that is because the 28 sec I quoted
above was just timing with getrusage() -- no profiling.  By-line
profiling increased runtime by about 50%.  But there's also something
funny in hotshot's numbers that I haven't tracked down yet.  Anyways, as
long as the proportions are correct, that's what really matters.

The big winner (19.2 sec) is utils.py:58:

        h, v = unpack(">II", v)

Runner-up at 12.1 sec is FileStorage.py:2110:

            oid,serial,sprev,stloc,vlen,splen = unpack(DATA_HDR, h)

Interesting.  When was the last time anyone sat down and seriously
optimized the struct module?

        Greg
-- 
Greg Ward - software developer                gward@mems-exchange.org
MEMS Exchange                            http://www.mems-exchange.org