[ZODB-Dev] ZEO client hangs when combined with other asyncore code

Tim Peters tim at zope.com
Tue Jun 21 19:04:29 EDT 2005


[Tim Peters]
>> asyncore gives me a headache.

[Paul Boots]
> Same here

Then it's time to admit that ZEO's attempts to mix threads with asyncore
give me migraine headaches <0.5 wink>.

>> I wonder whether this could be the problem:  Paul said he's calling
>> ZEO "from within the proxy code", but it sounds like the proxy code
>> itself runs "as a side effect" of asyncore callbacks.  If the flow is
>> like this:
>>   asyncore mainloop invokes POP3 proxy code
>>       POP3 proxy code makes a synchronous ZEO call
>> then I figure the app may well hang then:  the thread running the
>> asyncore mainloop is still running a POP3 proxy callback, waiting for a
>> response that can never happen until the asyncore mainloop gets control
>> back (in order to send & receive ZEO messages).

> I think that's exactly how the Proxy runs, we use asynchat and the
> 'line_terminator' to trigger a callback, so it appears the code runs
> 'magically' at first glance.

I never used asynchat (& ZEO doesn't either), so can't guess whether it's
contributing "new" complications.  ZEO's control flow is murky to me too.  I
_think_ (but may well be wrong) that ZEO expects asyncore to be running in a
different thread from the thread(s) in which application code using ZEO
clients runs.  Maybe someone who understands this better than I do will
jump in with a revelation.
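
The suspected hang can be modeled without ZEO or asyncore at all.  What
follows is only a toy sketch under stated assumptions:  ToyLoop,
synchronous_call and proxy_callback are hypothetical names, not ZEO or
asyncore APIs, and a timeout stands in for "hangs forever" so the script
terminates.  A callback running inside the loop waits for a reply that only
the loop itself could deliver:

```python
import threading


class ToyLoop:
    """Hypothetical stand-in for an asyncore-style mainloop (not asyncore)."""

    def __init__(self):
        self.callbacks = []                  # work the loop will run, in order
        self.reply_ready = threading.Event()

    def deliver_reply(self):
        # Only the loop runs this; it mimics "the response has been received".
        self.reply_ready.set()

    def run_once(self):
        # Run everything that was pending when this pass started.
        pending, self.callbacks = self.callbacks, []
        for callback in pending:
            callback()


def synchronous_call(loop, timeout):
    # Ask the loop to deliver a reply, then block until it arrives.
    loop.callbacks.append(loop.deliver_reply)
    return loop.reply_ready.wait(timeout)    # False means we gave up waiting


loop = ToyLoop()
results = []


def proxy_callback():
    # Runs *inside* the loop, so the loop cannot deliver the reply meanwhile.
    results.append(synchronous_call(loop, timeout=0.2))


loop.callbacks.append(proxy_callback)
loop.run_once()   # proxy_callback blocks, then gives up after the timeout
loop.run_once()   # only now does the loop get to deliver the reply
```

Without the timeout, the first run_once() would never return, which is the
shape of hang guessed at above.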

>> IOW, if Paul added print statements to ZODB's ZEO/zrpc/smac.py's
>> SizedMessageAsyncConnection readable() and writable() methods, I bet
>> they never trigger when the app appears to be hung (which would mean
>> that the thread running asyncore's mainloop is in fact not getting a
>> chance to run the asyncore loop anymore).

> You're right - I added the suggested print statements as the first line of
> the readable() and writable() methods, and they never appear.

The asyncore loop calls readable() and writable() on every object registered
with asyncore, each time around the asyncore loop.  So if those aren't
getting called, the asyncore loop isn't running -- or it is running but the
timeout on asyncore's select.select() call is so large that you didn't wait
long enough to get output (I think that one's unlikely, but ...).
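
As a rough illustration of why silent readable()/writable() methods imply a
stalled loop, here is a deliberately simplified model.  Channel and
poll_once are hypothetical names, and the real asyncore loop also calls
select.select() and dispatches I/O events, all of which is left out:

```python
class Channel:
    """Hypothetical stand-in for an object registered with the loop."""

    def __init__(self, name):
        self.name = name
        self.polls = 0

    def readable(self):
        self.polls += 1      # a print statement here would fire on every pass
        return True

    def writable(self):
        self.polls += 1
        return False


def poll_once(socket_map):
    # Each pass asks every registered object what it is interested in.
    for channel in socket_map.values():
        channel.readable()
        channel.writable()


socket_map = {1: Channel("zeo"), 2: Channel("pop3")}
for _ in range(3):           # three trips around the loop
    poll_once(socket_map)
```

Every pass touches every channel, so a channel whose counters stop moving
means the passes themselves have stopped (or are spaced very far apart).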

BTW, something that might help get more clues:  ZEO does a nasty thing to
asyncore.  In ZEO's ThreadedAsync/LoopCallback.py, it reaches into Python's
asyncore module and _replaces_ asyncore.loop with its own loop function.
That shouldn't change the functionality of asyncore, but it means that if
you, e.g., put print statements or debugger breakpoints in Python's asyncore
loop, they'll never trigger.  If you're working at that level, you need to
put them in LoopCallback.py's functions instead.
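
The effect of that replacement can be seen with a stand-in module.  This is
a sketch only:  fake_asyncore, original_loop and replacement_loop are
hypothetical names, and nothing here is the actual LoopCallback.py code.

```python
import types

calls = []

# Hypothetical stand-in module; this is not the real asyncore.
fake_asyncore = types.ModuleType("fake_asyncore")


def original_loop():
    calls.append("original")      # imagine a print or breakpoint here


def replacement_loop():
    calls.append("replacement")


fake_asyncore.loop = original_loop

# What a library might do at import time: rebind the module attribute.
fake_asyncore.loop = replacement_loop

# Callers look the name up on the module at call time, so they reach the
# replacement, and instrumentation in original_loop never fires.
fake_asyncore.loop()
```

Code that grabbed a direct reference to the original function before the
rebinding would still call the original, which is one more reason such
patching makes debugging confusing.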

> Could I do synchronous calls to the ZEO server?  Another option to
> bypass the problem is to use Zope/XMLRPC to do what we want, I assume
> that will not suffer the same problem.
> Your opinion would be much appreciated,

*Someone's* might be -- like maybe Dieter's <wink>.  I'm sorry, but I don't
understand your application well enough to suggest something useful.  I'm
not familiar with Zope/XMLRPC either.  For that matter, I don't really
understand why your app is hanging now, although I seemed to get lucky with
at least part of my guess last time.  The only vague idea I have is along
the lines of spinning off another thread to talk with ZEO, and have the POP3
proxy code queue up work requests for "the ZEO thread" to process (e.g., via
an instance of Python's Queue.Queue, which is designed for this purpose).
That's based on the guess that there's no problem with the POP3 proxy and
ZEO just "sharing" asyncore; the problem is in trying to invoke ZEO _from_
an asyncore callback.
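
A minimal sketch of that work-queue idea, written for current Python, where
the module is spelled queue rather than Queue.  The names fake_zeo_call,
zeo_worker and on_line_received are hypothetical placeholders; a real
version would make actual ZEO calls in the worker:

```python
import queue
import threading

requests = queue.Queue()
_STOP = object()             # sentinel telling the worker to exit


def fake_zeo_call(key):
    # Hypothetical placeholder for a blocking ZEO operation.
    return "value-for-" + key


def zeo_worker():
    # The only thread that talks to "ZEO"; it may block as long as it likes.
    while True:
        item = requests.get()
        if item is _STOP:
            break
        key, on_done = item
        on_done(fake_zeo_call(key))


results = []
worker = threading.Thread(target=zeo_worker)
worker.start()


def on_line_received(line):
    # What a proxy callback would do: enqueue the request, return at once.
    requests.put((line, results.append))


on_line_received("msg-1")
on_line_received("msg-2")

requests.put(_STOP)          # shut the worker down once the queue drains
worker.join()
```

Note that on_done runs in the worker thread here, so a real proxy would
need its own thread-safe way to hand results back to the asyncore side.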

IMO/IME, asyncore is a poor fit for applications where the callbacks are
"fancy", or even where they may just take a long time to complete (because
the asyncore mainloop is unresponsive for the duration).  So if I had to use
asyncore (I've never done so on my own initiative <wink>), I'd gravitate
toward a work-queue model anyway, where threads unfettered by asyncore
worries do all "the real work" (especially on Windows, which loves to run
threads) and where asyncore callbacks do as little as possible.
