[ZODB-Dev] ZODB memory problems

Tue May 31 16:16:48 EDT 2005

Tim Peters wrote:
> [Jeremy Hylton]
> 
>>>It's really too bad that ZEO only allows a single outstanding request.
>>> Restructuring the protocol to allow multiple simultaneous requests
>>>was on the task list years ago, but the protocol implementation is so
>>>complex I doubt it will get done :-(.  I can't help but think building
>>>on top of an existing message/RPC layer would be profitable.  (What's
>>>twisted's RPC layer?)  Or at least something less difficult to use than
>>>asyncore.
> 
> 
> [Shane Hathaway]
> 
>>Do you think the RPC layer is the source of the problem?
> 
> 
> Probably depends on what "the problem" refers to?  If the protocol allows
> for at most one outstanding request, then that's clearly _a_ bottleneck,
> right?

Yes.  I meant to ask whether the RPC layer is currently the worst
bottleneck.  Lately I've been dealing with problems that require a
minimum of 50% utilization of a gigabit network connection (roughly
60 MB/s), and the 3.5 MB/s figure Andreas quoted made me cringe. ;-)

> I get the impression that Jim thinks the ZEO protocol is simple.  I don't
> know -- because I haven't had to "fix bugs" in it recently, I know little
> about it.  It sure isn't obvious from staring at 8000+ lines of ZEO code,
> and Jeremy, Guido and I spent weeks a few years ago "fixing bugs" then.  I
> felt pretty lost the whole time, never sure how many threads there were,
> which code those threads may be executing, how exactly asyncore cooperated
> (or fought) with the threads, or even clear on which RPC calls were
> synchronous and which async.  There's so much machinery of various kinds
> that it's hard to divine the _intent_ of it all.  I remember that sometimes
> the letter "x" gets sent to a socket, and that's important <wink>.

Yes, it is mysterious.  Poor readability seems to be common with code
that deals with both events (i.e. asyncore events) and many threads.  I
used to mix them at a whim.  Lately I've used both events and threads,
but not in the same program, and I think it's done a lot of good for the
maintainability of what I've written.

> It was my ignorant impression at the time that asyncore didn't make much
> sense here, because mixing threads with asyncore is always a nightmare in
> reality, and a ZEO server doesn't intend to service hundreds of clients
> simultaneously regardless.

Having stared at ZEO for a while, I've convinced myself that the ZEO
client code has no reason to use asyncore.  A blocking socket and
makefile() seem like a much better fit.
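
To make that concrete, here is a minimal sketch of a blocking client,
written in modern Python.  It is not ZEO's actual code or wire format:
the class name, the length-prefixed pickle framing, and the call()
method are all assumptions made up for illustration.

```python
import pickle
import socket
import struct


class BlockingClient:
    """Hypothetical RPC client: one blocking socket, no event loop."""

    def __init__(self, sock):
        self.sock = sock
        # makefile() wraps the blocking socket in buffered file objects,
        # so read(n) simply blocks until n bytes arrive (or EOF).
        self.rfile = sock.makefile("rb")
        self.wfile = sock.makefile("wb")

    def _send(self, obj):
        # Assumed framing: 4-byte big-endian length, then a pickle.
        data = pickle.dumps(obj)
        self.wfile.write(struct.pack(">I", len(data)) + data)
        self.wfile.flush()

    def _recv(self):
        header = self.rfile.read(4)
        if len(header) < 4:
            raise EOFError("connection closed")
        (size,) = struct.unpack(">I", header)
        return pickle.loads(self.rfile.read(size))

    def call(self, method, *args):
        # Synchronous request/response: send, then block for the reply.
        self._send((method, args))
        return self._recv()
```

The whole control flow is visible in call(): there is no callback, no
trigger byte, and no question about which thread runs what.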

I'm not sure whether the ZEO server should be event driven or threaded,
but being both is probably wrong.  Since it's event driven now, the ZEO
server may be less susceptible to concurrency gremlins than it would be
with threads.  However, last time I looked, the ZEO server used a few
threads for miscellaneous work.

>>I feel like the way ZODB uses references in pickles is the main thing
>>that slows it down.  Even if you have a protocol that can request many
>>objects at once, the unpickling machinery only asks for one at a time.
> 
> 
> I'm unclear on what "the unpickling machinery" means.  The most obvious
> meaning is cPickle, but that doesn't ask for anything.  In general, no
> object's state gets unpickled before something in the application _needs_
> its state, so unpickling is currently driven by the application.  Maybe
> you're suggesting some form of "eager" unpickling/state-materialization?

Yes.  For each object, ZODB could store a list of referenced OIDs.  When
ZODB is about to unpickle an object, it could read the list of
referenced OIDs and tell its storage that it will need the pickles for
those objects very shortly (except the objects already loaded).  Then
the ZEO client code could make a single request for all of the
referenced objects that aren't already in the cache.
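
A rough sketch of the idea, in modern Python.  Nothing here is an
existing ZODB or ZEO API: BatchingStorage, load_many(), prefetch() and
the (pickle, referenced OIDs) record layout are hypothetical, and the
"server" is just a dict that counts round trips.

```python
class BatchingStorage:
    """Hypothetical stand-in for a ZEO client storage with a cache."""

    def __init__(self, backend):
        # backend maps oid -> (pickle_bytes, [referenced oids]);
        # it plays the role of the remote server.
        self.backend = backend
        self.cache = {}
        self.requests = 0  # number of simulated round trips

    def load_many(self, oids):
        # One simulated round trip, however many OIDs are requested.
        self.requests += 1
        return {oid: self.backend[oid] for oid in oids}

    def load(self, oid):
        if oid not in self.cache:
            self.cache.update(self.load_many([oid]))
        return self.cache[oid]

    def prefetch(self, oids):
        # Skip objects that are already cached; batch the rest.
        missing = [oid for oid in oids if oid not in self.cache]
        if missing:
            self.cache.update(self.load_many(missing))


def load_with_prefetch(storage, oid):
    """Load one object, then hint that its references come next."""
    data, refs = storage.load(oid)
    storage.prefetch(refs)
    return data
```

With a root object that references three others, loading the root and
then touching each child costs two round trips in this sketch instead
of four.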

Shane