[ZODB-Dev] ZEO Client deadlocking in asyncore.poll - how do I debug

Anton Stonor anton at headnet.dk
Mon Apr 7 10:16:44 EDT 2008


We have a setup with a ZEO server and 4 ZEO clients.

Over the last few weeks we have seen almost daily deadlocks in some of 
the ZEO clients. I have waited up to 30 minutes before restarting a 
client.

I could use some advice on how to debug this.

With DeadlockDebugger I see the same pattern each time:

One thread is hanging:


   File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 732, in setstate
     self._setstate(obj)
   File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 768, in _setstate
     p, serial = self._storage.load(obj._p_oid, self._version)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 746, in load
     return self.loadEx(oid, version)[:2]
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 769, in loadEx
     data, tid, ver = self._server.loadEx(oid, version)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ServerStub.py", line 192, in loadEx
     return self.rpc.call("loadEx", oid, version)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 531, in call
     r_flags, r_args = self.wait(msgid)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 638, in wait
     asyncore.poll(delay, self._singleton)
   File "/usr/local/lib/python2.4/asyncore.py", line 122, in poll
     r, w, e = select.select(r, w, e, timeout)
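
For readers who want a similar per-thread dump without DeadlockDebugger, here is a minimal sketch using only the standard library. It assumes a modern Python 3 interpreter: `sys._current_frames()` was added in Python 2.5, so it is not available in the Python 2.4 shown in the traceback above. The helper name `dump_thread_stacks` is hypothetical, and this is not how DeadlockDebugger itself is implemented.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a string with the current stack of every live thread."""
    # Map thread ids to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):\n" % (thread_id, names.get(thread_id, "?")))
        chunks.append("".join(traceback.format_stack(frame)))
        chunks.append("\n")
    return "".join(chunks)


if __name__ == "__main__":
    print(dump_thread_stacks())
```

In a hung process, a helper like this would have to be triggered from outside the stuck threads, for example from a signal handler or a dedicated debug thread.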


The other threads of the ZEO client are waiting for the hanging thread 
to release the storage lock so that they can acquire it:

   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 760, in loadEx
     self._load_lock.acquire()
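
The pattern in the two tracebacks can be reproduced in isolation. The following is a rough sketch, not ZEO code: `load_lock` stands in for `ClientStorage._load_lock`, the socket pair stands in for the connection to the ZEO server, and all function names are hypothetical. It assumes Python 3. Unlike the real hang, the holder gives up after a fixed time so that the demo terminates.

```python
import select
import socket
import threading
import time

# Stand-in for ClientStorage._load_lock (assumption for illustration).
load_lock = threading.Lock()


def blocked_loader(sock, hold_seconds):
    """Hold the lock while polling a socket for a reply that never arrives."""
    with load_lock:
        deadline = time.time() + hold_seconds
        while time.time() < deadline:
            # Nothing is ever written to the peer, so select() keeps timing out.
            readable, _, _ = select.select([sock], [], [], 0.05)
            if readable:
                break


def waiting_loader(results):
    """Try to take the same lock, as the other client threads do in loadEx."""
    got = load_lock.acquire(timeout=0.2)
    results.append(got)
    if got:
        load_lock.release()


def demo():
    a, b = socket.socketpair()
    results = []
    holder = threading.Thread(target=blocked_loader, args=(a, 1.0))
    holder.start()
    time.sleep(0.1)  # give the holder time to acquire the lock
    waiters = [threading.Thread(target=waiting_loader, args=(results,))
               for _ in range(3)]
    for t in waiters:
        t.start()
    for t in waiters:
        t.join()
    holder.join()
    a.close()
    b.close()
    return results


if __name__ == "__main__":
    print(demo())
```

Every waiter fails to get the lock while the holder sits in `select()`, which matches what the dumps show: the waiting threads are a symptom, and the thing to investigate is why the reply to the holder's request never arrives.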


When I connect to the ZEO server monitor I can see an increasing number 
of reads (probably from the other ZEO Clients).

I've set transaction-timeout 15.

How do I dig further to resolve this?

Part of zeo.conf is below:

--
<zeo>
   address 8200
   read-only false
   invalidation-queue-size 100
   # pid-filename $INSTANCE/var/ZEO.pid
   monitor-address 8201
   transaction-timeout 15
</zeo>

<filestorage 1>
   path $INSTANCE/var/Data.fs
</filestorage>

%import tempstorage
<temporarystorage temp>
   name temporary storage for sessioning
</temporarystorage>
--

Anton




