[Zope3-dev] race condition in Python2.3 asyncore
Steve Alexander
steve@cat-box.net
Wed, 23 Oct 2002 15:46:53 +0000
I've been having fun dubugging a problem with Zope3 unit tests running
on Python2.3 from CVS on intel Linux 2.4 kernel.
In this email, I explain the problem, suggest a change to the unit tests
to prevent the problem holding up running the tests, and describe a fix
to asyncore to fix the underlying problem.
The problem is with the FTPServer tests. Generally, the tests will pass
and usually an error in a non-main thread will be printed to the console.
Traceback (most recent call last):
File "/usr/local/lib/python2.3/threading.py", line 410, in __bootstrap
self.run()
File "/usr/local/lib/python2.3/threading.py", line 398, in run
apply(self.__target, self.__args, self.__kwargs)
File "testFTPServer.py", line 121, in loop
poll(0.1, socket_map)
File "/usr/local/lib/python2.3/asyncore.py", line 108, in poll
r, w, e = select.select(r, w, e, timeout)
error: (9, 'Bad file descriptor')
(the line numbers are out a bit, as I've inserted a bunch of debugging
probes)
Sometimes, the test will not complete, and will hang after printing this
error to the console.
After much debugging, I've found a fix, to make sure the test does not
hang, and also the cause of the problem.
The fix to ensure the test does not hang is to alter the loop() method
from testFTPServer from this:
def loop(self):
while self.run_loop:
self.counter = self.counter + 1
# print " 'loop', self.counter
poll(0.1, socket_map)
to this:
def loop(self):
import select
from errno import EBADF
while self.run_loop:
self.counter = self.counter + 1
# print " 'loop', self.counter
try:
poll(0.1, socket_map)
except select.error, data:
if data[0] == EBADF:
print "exception:", data
else:
raise
The cause of the problem seems to be a race condition around the
socket_map global of asyncore.
If a socket is closed during the time between the socket_map being
processed into arguments for select, and the select.select call, the
call will raise a 'Bad file descriptor' error.
This error is only occuring in the testFTPServer test testPASS, and is
related to this code in the FTPServerChannel class:
def cmd_pass(self, args):
'See Zope.Server.FTP.IFTPCommandHandler.IFTPCommandHandler'
self.authenticated = 0
password = args
credentials = UsernamePassword(self.username, password)
try:
self.server.fs_access.authenticate(credentials)
except Unauthorized:
self.reply('LOGIN_MISMATCH')
self.close_when_done()
else:
self.credentials = credentials
self.authenticated = 1
self.reply('LOGIN_SUCCESS')
The problem occurs when there is an Unauthorized error, and
close_when_done() closes the socket.
I suggest that asyncore get patched to catch EBADF errors, and at that
point see if the file descriptors in the current version of socket_map
are the same as they were at the start of the method. If they are the
same, re-raise, otherwise, ignore.
--
Steve Alexander