[Zope3-dev] ZEO test failing under Cygwin

Tim Peters tim@zope.com
Wed, 14 May 2003 18:12:37 -0400


[Godefroid Chapelle, running on Cygwin,

     python test.py -vvLd test_config testZEOStorage

fails every time]

> I get
> Running unit tests at level 1
> Running unit tests from /cygdrive/c/Zope3Head/Zope3
> testZEOStorage (zodb.storage.tests.test_config.StorageTestCase) ...
> WARNING:ZCS.2268:ClientStorage (pid=2268) created RW/normal for storage: '1'
> WARNING:ZEC.1:ClientCache: storage='1', size=20971520; file[0]=None
> INFO:root:zrpc:2268: CM.connect(): starting ConnectThread
> INFO:root:zrpc:2268: CT: attempting to connect on 1 sockets
> INFO:root:zrpc:2268: CW: attempt to connect to ('www.python.org', 9001)
> INFO:root:zrpc:2268: CW: connect_ex(('www.python.org', 9001)) returned
>     EINPROGRESS
> INFO:root:zrpc:2268: CT: select() 0, 1, 1
> INFO:root:zrpc:2268: CT: closing troubled socket ('www.python.org', 9001)
> Exception in thread Connect([(2, ('www.python.org', 9001))]):
> Traceback (most recent call last):
>    File "/tmp/python.676/usr/lib/python2.2/threading.py", line 408, in
> __bootstrap
>      self.run()
>    File "/cygdrive/c/Zope3Head/Zope3/src/zodb/zeo/zrpc/client.py", line
> 286, in run
>      success = self.try_connecting(attempt_timeout)
>    File "/cygdrive/c/Zope3Head/Zope3/src/zodb/zeo/zrpc/client.py", line
> 314, in try_connecting
>      r = self._connect_wrappers(wrappers, deadline)
>    File "/cygdrive/c/Zope3Head/Zope3/src/zodb/zeo/zrpc/client.py", line
> 382, in _connect_wrappers
>      del wrappers[wrap]
> KeyError: <zodb.zeo.zrpc.client.ConnectWrapper instance at 0xa329448>
>
> Killed



> and yes the socket appear in both w and x sets.
>
> I am afraid I am not of a big help to solve this further...
> Please let me know what else I can do.

Thanks!  You're being very helpful.

Next step:  unfortunately, we have to deal with the exact details of how
sockets behave, but those details vary a lot across platforms, and are
almost never documented accurately enough to get right in advance.  So
this is a poke-and-hope adventure.

In this test we're trying a non-blocking connect on a socket, and we know
the connection attempt must fail (eventually), and we have to make sense out
of what the platform select() returns in this case.  You could stare at the
Cygwin docs, but I doubt they'll answer the question.

When _connect_wrappers() does

                r, w, x = select.select([], connecting, connecting, 1.0)

this socket is in the "connecting" list, which is passed in twice (once as
the write set and once as the exception set).  The code doesn't care whether
the platform select() says the socket is writable (in the w return value),
or suffered some sort of exception (in the x return value), or is neither,
but we haven't seen a platform before where select() says it's both (else
this test would fail the same way on that platform).
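
To see why an overlap would be fatal, here is a minimal sketch.  It is not
the actual zrpc code:  process_select_results() and the placeholder wrapper
strings are made up for illustration.  The only things taken from the
traceback are that "wrappers" is a dict and that "del wrappers[wrap]" raised
the KeyError.  The assumption is that the w and x results are handled one
after the other, and that each can drop a wrapper from the dict.

```python
def process_select_results(wrappers, w, x):
    """Hypothetical stand-in for the bookkeeping done after select()."""
    for wrap in w:
        # Assume the connect attempt turned out to have failed, so the
        # wrapper is dropped from the pending dict.
        del wrappers[wrap]
    for wrap in x:
        # An exceptional socket is dropped too.  If it was already handled
        # via w, the key is gone and this raises KeyError.
        del wrappers[wrap]


pending = {"wrap-a": 1}
try:
    process_select_results(pending, w=["wrap-a"], x=["wrap-a"])
    overlap_failed = False
except KeyError:
    overlap_failed = True
```

With disjoint w and x lists the same function runs cleanly, which would
match what we see on the other platforms.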

I'm not prepared to insist that's a bug in Cygwin -- socket behavior varies
too much to make such a claim stick.  Instead we need to find some hack
"that works" (the Cygwin folks would probably consider it a bug if they
don't act the same way Linux does here, but even so we probably don't want
to wait for them to fix it).

There are two equally simple hacks we can *try* now.  I can't predict what
will happen with either (it depends on exactly what Cygwin sockets do after
we get beyond the immediate failure):

1. Resolve ambiguity by favoring x.  Insert this code after the select():

                for wrap in x:
                    if wrap in w:
                        w.remove(wrap)

2. Resolve ambiguity by favoring w.  Insert this code after the select():

                for wrap in w:
                    if wrap in x:
                        x.remove(wrap)

Either/neither/both of those may allow this test to pass on Cygwin.  All
results are interesting.  If one doesn't work, the test may fail, or hang,
or go into an infinite loop, or do something else never seen before <wink>.
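
For experimenting, the two hacks could also be folded into one helper so
that switching between them is a one-word change.  This is only a sketch:
resolve_overlap() and its "favor" parameter are made-up names, and unlike
the snippets above it returns new lists instead of editing w and x in place.

```python
def resolve_overlap(w, x, favor="x"):
    """Return (w, x) with anything reported in both lists kept only in
    the favored one."""
    if favor == "x":
        # Hack 1: treat the socket as exceptional, drop it from w.
        return [s for s in w if s not in x], list(x)
    if favor == "w":
        # Hack 2: treat the socket as writable, drop it from x.
        return list(w), [s for s in x if s not in w]
    raise ValueError("favor must be 'w' or 'x'")
```

It would be called right after the select(), as in
"w, x = resolve_overlap(w, x, favor='x')".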

So let's try both ways in this test and just see what happens.  If one of
them appears to work, we can move on to the other tests.  If both or neither
appear to work, more thought is required.