[ZODB-Dev] What makes the ZODB slow?

Tim Peters tim.peters at gmail.com
Tue Jun 27 03:15:04 EDT 2006


[Sidnei da Silva]
> Does that mean that if someone didn't care about older Pythons in the
> mix and was willing to register those shorter byte code extensions
> for their own Zope, that person would likely see great improvements in
> pickle size reduction, and that it would even improve ZEO transport by
> reducing the size of the data that is transferred?

One question at a time ;-)

Anything reducing average pickle size will make ZEO's job easier, in
at least three ways:  less network traffic whenever an object pickle
needs to travel in either direction, a ZEO client cache of a given
size can hold more objects the smaller the average pickle, and a ZEO
client cache needs less I/O to load or store an "average" pickle.

Independent of the extension registry, protocol 2 gives major benefits
in pickle size for instances of new-style classes.  That was in fact
the primary goal of protocol 2.
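
A rough way to see this for yourself is to pickle one instance of a
new-style class under both protocols and compare the lengths.  The
Account class below is made up for illustration, and the standard
pickle module stands in for cPickle.  Exact byte counts vary with the
Python version, so none are quoted here.

```python
import pickle


class Account(object):
    """Hypothetical new-style class standing in for a persistent object."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance


acct = Account("sidnei", 100)

# Same object, same state; only the pickle protocol differs.
size_proto1 = len(pickle.dumps(acct, 1))
size_proto2 = len(pickle.dumps(acct, 2))

print("protocol 1: %d bytes" % size_proto1)
print("protocol 2: %d bytes" % size_proto2)
```

This measures a single tiny object, so it says nothing about what a
real ZODB workload would gain.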

The extension registry is prone to the same kinds of problems you get
from any kind of shared global state; see PEP 307 for a discussion
of those.  In particular, if you write a pickle using it, you can no
longer load that pickle unless _executable code_ first registers
exactly the same module.class strings with exactly the same codes.
Pickles are no longer self-contained.  This is obviously brittle, so
think about that.  The hope at the time was that Zope (& others) would
register extension codes in the reserved ranges with the PSF, so that
they could be built into Python distributions.  That hasn't happened
yet.
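
The brittleness can be sketched in a few lines.  This is only an
illustration, not anything ZODB does:  the Account class is
hypothetical, and code 240 is simply one value from the range PEP 307
sets aside for private use.  Removing the registration afterwards
simulates a process that never ran the registering code.

```python
import pickle

try:
    import copyreg               # Python 3 name
except ImportError:
    import copy_reg as copyreg   # Python 2 name


class Account(object):
    """Hypothetical class; only the class reference is pickled here."""


# Assumption: 240 is picked from PEP 307's private-use range (240-255).
EXT_CODE = 240
module, name = Account.__module__, Account.__name__

plain = pickle.dumps(Account, 2)       # spells out module and class name
copyreg.add_extension(module, name, EXT_CODE)
short = pickle.dumps(Account, 2)       # refers to the class by its code

assert pickle.loads(short) is Account  # fine while the code is registered

# Simulate a process that never registered the extension code.
copyreg.remove_extension(module, name, EXT_CODE)
try:
    pickle.loads(short)
    loaded_without_registry = True
except (ValueError, pickle.UnpicklingError):
    loaded_without_registry = False

print("plain: %d bytes, with extension code: %d bytes"
      % (len(plain), len(short)))
print("loads without registration: %s" % loaded_without_registry)
```

The shorter pickle is only readable by a process that has executed the
same add_extension call, which is the sense in which it is no longer
self-contained.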

As to "great improvements", beats me.  The improvement would be
somewhere from 1.19 to 32.89, where I'm not going to say anything
about what those measure :-)

> Maybe we should propose such changes to Zope 2.10/2.11 since it
> already requires Python 2.4 (since Zope 2.9).

Switching to protocol 2 (with or without trying to use the extension
registry) in ZODB is an unknown (to me) amount of work.  A "pure"
pickle user shouldn't have any problems, but ZODB "cheats" in places,
picking apart pieces of pickle strings itself because it thinks it
knows everything there is to know about what's inside a pickle.  The
new protocol 2 opcodes may confuse various of those places.  So I'd
expect some "weird problems" at first, but not many.  Definitely not
something that should be done after a beta 1 -- that's alpha-level
insecurity.
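
To get a feel for what such code would newly run into, the standard
pickletools module can list the opcodes a pickle uses.  The sketch
below compares the two protocols for an instance of a made-up Account
class; opcode_names is a helper invented for this example, and nothing
here touches ZODB itself.

```python
import pickle
import pickletools


class Account(object):
    """Hypothetical new-style class."""

    def __init__(self):
        self.balance = 100


def opcode_names(data):
    """Return the set of opcode names used in a pickle byte string."""
    return set(op.name for op, arg, pos in pickletools.genops(data))


ops1 = opcode_names(pickle.dumps(Account(), 1))
ops2 = opcode_names(pickle.dumps(Account(), 2))

# Opcodes that code written against protocol 1 never had to handle.
print(sorted(ops2 - ops1))
```

Anything in that difference is an opcode that hand-written
pickle-parsing code may not expect.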

Note also that because nobody has used the extension-registry
machinery for real, it may be broken in dumb ways Python's test suite
doesn't cover.  I know people are using protocol 2 with new-style
classes, so that's more trustworthy, but Zope will probably _stress_
it in ways nobody else gets close to, so I wouldn't be surprised to
see a few Python bugs there too.

If you want to play, change:

        self._p = cPickle.Pickler(self._file, 1)

to:

        self._p = cPickle.Pickler(self._file, 2)

in ObjectWriter.__init__ (serialize.py).  There are a bunch of other
places a Pickler gets created, but I expect that's the most important
one.
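
Outside ZODB, the effect of that second argument can be checked with a
Pickler bound to an in-memory file.  This is a standalone sketch:
dump_with_protocol is a made-up helper, the record is arbitrary, and
the standard pickle module is used in place of cPickle.

```python
import io
import pickle


def dump_with_protocol(obj, protocol):
    """Pickle obj through a Pickler bound to an in-memory file."""
    f = io.BytesIO()
    p = pickle.Pickler(f, protocol)   # second argument picks the protocol
    p.dump(obj)
    return f.getvalue()


record = {"oid": 1, "tags": ["a", "b"]}
data1 = dump_with_protocol(record, 1)
data2 = dump_with_protocol(record, 2)

# Protocol 2 pickles start with the PROTO opcode (0x80) and a version byte.
print(data2[:2] == b"\x80\x02")
# Both protocols must round-trip to the same value.
print(pickle.loads(data1) == pickle.loads(data2) == record)
```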

Don't forget to try utilities (fsrecover, fstest, fsdump, ...) too.

