[Zope-dev] RE: [ZODB-Dev] [Warning] Zope/ZEO clients: subprocesses can lead to non-deterministic message loss

sathya sathya at zeomega.com
Sat Jun 26 14:09:22 EDT 2004


Tim Peters wrote:
Hello Tim,
So can we safely assume that ZEO does not mix the asyncore
implementation with forks or threads, and hence does not suffer from the
"child concurrently operating on sockets along with parent" syndrome
that Dieter is experiencing?
I'd appreciate any clarification.
Regards,
sathya
> [Dieter Maurer]
> 
>>ATTENTION: Crosspost -- Reply-To set to 'zope-dev at zope.org'
> 
> 
> Which I've honored.
> 
> 
>>Today, I hit a nasty error.
>>
>>The error affects applications under Unix (and maybe Windows) which
>>
>>  *  use an "asyncore" mainloop thread (and maybe other asyncore
>>     applications)
>>
>>     Zope and many ZEO clients belong to this class
> 
> 
> Note a possible complication:  ZEO monkey-patches asyncore, replacing its
> loop() function with one of its own.  This is done in ZODB's
> ThreadedAsync/LoopCallback.py.
> 
> 
>>and
>>
>>  *  create subprocesses via "fork", or via "system", "popen" or
>>     friends where these use "fork" internally (they do under Unix,
>>     but I think not under Windows).
> 
> 
> It may be an issue under Cygwin, but not under native Windows, which
> supports no way to clone a process; file descriptors may get inherited by
> child processes on Windows, but no code runs by magic.
> 
> 
>>The error can cause non-deterministic loss of messages (HTTP requests,
>>ZEO server responses, ...) destined for the parent process. It also can
>>cause the same output to be sent several times over sockets.
>>
>>The error is explained as follows:
>>
>>  "asyncore" maintains a map from file descriptors to handlers.
>>  The "asyncore" main loop waits for any file descriptor to
>>  become "active" and then calls the corresponding handler.
> 
> 
> There's a key related point, though:  asyncore.loop() terminates if it sees
> that the map has become empty.  This appears to have consequences for the
> correctness of workarounds.  For example, this is Python's current asyncore
> loop (the monkey-patched one ZEO installs is similar in this respect):
> 
> def loop(timeout=30.0, use_poll=False, map=None):
>     if map is None:
>         map = socket_map
> 
>     if use_poll and hasattr(select, 'poll'):
>         poll_fun = poll2
>     else:
>         poll_fun = poll
> 
>     while map:
>         poll_fun(timeout, map)
> 
> If map becomes empty, loop() exits.
> 
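To make the exit condition concrete, here is a minimal sketch of the same
"while map:" pattern, using a plain dict and a stub poll function instead of
real sockets. The names run_loop and make_stub_poll are hypothetical and not
part of asyncore; the stub empties the map after a few passes, standing in
for another thread that calls socket_map.clear().

```python
def run_loop(poll_fun, channel_map):
    """Mimic the shape of asyncore.loop(): poll only while the map is non-empty."""
    passes = 0
    while channel_map:
        poll_fun(channel_map)
        passes += 1
    return passes


def make_stub_poll(clear_after):
    """Build a stand-in poll function that clears the map on call number `clear_after`."""
    calls = {"n": 0}

    def stub_poll(channel_map):
        calls["n"] += 1
        if calls["n"] >= clear_after:
            # Stand-in for some other code emptying the map mid-loop.
            channel_map.clear()

    return stub_poll


channel_map = {3: "http-handler", 4: "zeo-handler"}
passes = run_loop(make_stub_poll(clear_after=3), channel_map)
print(passes, channel_map)
```

Once the map is observed empty the loop returns, and nothing restarts it
even if the map is repopulated afterwards.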
> 
> 
>>  When a process forks, the complete state, including file descriptors,
>>  threads and memory state, is copied, and the new process
>>  executes in this copied state.
>>  We now have 2 "asyncore" threads waiting for the same events.
> 
> 
> Sam Rushing created asyncore as an alternative to threaded approaches;
> mixing asyncore with threads is a nightmare; throwing forks into the pot too
> is a good working definition of hell <wink>.
> 
> 
>>  File descriptors are shared between parent and child.
>>  When the child reads from a file descriptor from its parent,
>>  it steals the corresponding message: the message will
>>  not reach the parent.
>>
>>  While file descriptors are shared, memory state is separate.
>>  Therefore, pending writes can be performed by both
>>  parent and child -- leading to duplicate writes to the same
>>  file descriptor.
>>
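Both effects can be reproduced in a single process, as a rough sketch. It
assumes that socket.dup() is a fair stand-in for what fork() does to a
descriptor (a second descriptor for the same underlying socket), and that
copying a list models the child's separate copy of memory. No real fork
happens here, and the names peer, parent_sock and child_sock are
illustrative only.

```python
import select
import socket

# A connected pair: `peer` plays the remote end, `parent_sock` the local end.
peer, parent_sock = socket.socketpair()

# Stand-in for fork(): a second descriptor for the same underlying socket.
child_sock = parent_sock.dup()

# 1. Message stealing: the "child" reads first, so nothing is left for the
#    "parent" when it polls the same socket.
peer.sendall(b"request")
stolen = child_sock.recv(1024)
readable, _, _ = select.select([parent_sock], [], [], 0)
parent_sees_data = bool(readable)

# 2. Duplicate writes: each side holds its own copy of the pending output
#    and flushes it to the shared socket.
parent_pending = [b"response;"]
child_pending = list(parent_pending)  # memory is copied, not shared
for chunk in parent_pending:
    parent_sock.sendall(chunk)
for chunk in child_pending:
    child_sock.sendall(chunk)

# Collect everything the remote end receives.
expected_len = 2 * len(b"response;")
received = b""
while len(received) < expected_len:
    received += peer.recv(1024)

print(stolen, parent_sees_data, received)

for s in (peer, parent_sock, child_sock):
    s.close()
```

In a real fork the order of reads and writes depends on scheduling, which is
where the non-determinism comes from.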
>>
>>A workaround is to deactivate "asyncore" before forking (or "system",
>>"popen", ...) and reactivate it afterwards, as exemplified in the
>>following code:
>>
>>     from os import fork
>>     from asyncore import socket_map
>>     saved_socket_map = socket_map.copy()
>>     socket_map.clear() # deactivate "asyncore"
> 
> 
> As noted above, this may (or may not) cause asyncore.loop() to plain stop,
> in parent and/or in child process.  If there aren't multiple threads, it's
> safe, but presumably you have multiple threads in mind, in which case
> behavior seems unpredictable (will the parent process's thread running
> asyncore.loop() notice that the map has become empty before the code below
> populates the map again?  asyncore.loop() will or won't stop in the parent
> depending on that timing accident).
> 
> 
>>     pid = None
>>     try:
>>         pid = fork()
>>         if pid == 0:
>>             # child
>>             # ...
>>     finally:
>>         if pid != 0:
>>             socket_map.update(saved_socket_map) # reactivate "asyncore"
> 
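For reference, the workaround can be wrapped in a small helper, sketched
here under some assumptions. The name fork_with_quiet_map and its "fork"
parameter are hypothetical; the parameter exists only so the logic can be
exercised with a fake fork. The helper takes the map as an argument rather
than importing asyncore, and it does nothing about the timing problem
described above: a loop thread that sees the map empty between clear() and
update() will still exit.

```python
import os


def fork_with_quiet_map(socket_map, fork=None):
    """Clear socket_map around fork() and restore it in the parent only.

    Hypothetical helper: `fork` defaults to os.fork (Unix only) and can be
    replaced by any zero-argument callable for testing.
    """
    if fork is None:
        fork = os.fork
    saved = socket_map.copy()
    socket_map.clear()  # deactivate: a child starts with no channels
    pid = None
    try:
        pid = fork()
    finally:
        if pid != 0:
            # Parent, or fork() raised: reactivate the saved channels.
            socket_map.update(saved)
    return pid


# Exercise the parent-side path with a fake fork that returns a child pid.
demo_map = {5: "handler"}
demo_pid = fork_with_quiet_map(demo_map, fork=lambda: 1234)
print(demo_pid, demo_map)
```

In the child (pid == 0) the map stays empty, so the child registers no
inherited channels with its own copy of the map.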
> 
> Another approach I've seen is to skip mucking with socket_map directly, and
> call asyncore.close_all() first thing in the child process.  Of course
> that's vulnerable to vagaries of thread scheduling too, if asyncore is
> running in a thread other than the one doing the fork() call.
> 
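A rough single-process sketch of the close-in-the-child idea follows. It
assumes socket.dup() approximates the descriptors a child inherits, and it
uses a simplified stand-in for asyncore.close_all() (the real function
handles errors and works on dispatcher objects). It only illustrates that
closing the child's copies leaves the parent's socket usable; the thread
scheduling caveat is not addressed.

```python
import socket


def close_all(channel_map):
    """Simplified stand-in for asyncore.close_all(): close channels, empty the map."""
    for channel in list(channel_map.values()):
        channel.close()
    channel_map.clear()


peer, parent_sock = socket.socketpair()
parent_map = {parent_sock.fileno(): parent_sock}

# Stand-in for the child's inherited state: duplicated descriptor, own map.
child_sock = parent_sock.dup()
child_map = {child_sock.fileno(): child_sock}

# First thing "in the child": drop every inherited channel.
close_all(child_map)

# The parent's descriptor is unaffected and still receives traffic.
peer.sendall(b"ping")
reply = parent_sock.recv(1024)
print(child_map, reply)

peer.close()
parent_sock.close()
```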