[ZODB-Dev] ZEO Client deadlocking in asyncore.poll - how do I debug

Alan Runyan runyaga at runyaga.com
Mon Apr 7 10:36:03 EDT 2008


Check the ZEO server log files.  A known problem is iptables or some other
kind of packet filtering sitting between the ZEO clients and the ZEO server.
That sort of configuration took several hours off my life ;-(
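
As a first sanity check alongside the server logs, something like the sketch
below can be run from a ZEO client host to see whether a new TCP connection to
the ZEO port opens at all. The function name and the host name in the usage
line are made up for illustration; 8200 is the address from the quoted
zeo.conf. It assumes a current Python 3, not the Python 2.4 in the traceback.

```python
import socket


def zeo_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within timeout.

    Hypothetical helper: it only opens and closes a socket, it does not
    speak the ZEO protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name lookup failed.
        return False


if __name__ == "__main__":
    # "zeo.example.com" is a placeholder host name.
    print(zeo_port_reachable("zeo.example.com", 8200))
```

A True result only shows that new connections get through. It does not rule
out a filter that interferes with an already established connection later on,
so the server log is still the place to look.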

On Mon, Apr 7, 2008 at 9:16 AM, Anton Stonor <anton at headnet.dk> wrote:
> We have a setup with a ZEO server and 4 ZEO clients.
>
> During the last few weeks we have seen almost daily deadlocks in some of the
> ZEO clients. I have tried waiting up to 30 minutes before restarting a client.
>
> I could use some advice on how to debug this.
>
> With DeadlockDebugger I see the same pattern each time:
>
> One thread is hanging:
>
>
>  File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 732, in setstate
>    self._setstate(obj)
>  File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 768, in _setstate
>    p, serial = self._storage.load(obj._p_oid, self._version)
>  File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 746, in load
>    return self.loadEx(oid, version)[:2]
>  File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 769, in loadEx
>    data, tid, ver = self._server.loadEx(oid, version)
>  File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ServerStub.py", line 192, in loadEx
>    return self.rpc.call("loadEx", oid, version)
>  File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 531, in call
>    r_flags, r_args = self.wait(msgid)
>  File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 638, in wait
>    asyncore.poll(delay, self._singleton)
>  File "/usr/local/lib/python2.4/asyncore.py", line 122, in poll
>    r, w, e = select.select(r, w, e, timeout)
>
>
> The other threads of the ZEO client are waiting for the hanging thread to
> release the storage lock so that they can acquire it:
>
>  File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 760, in loadEx
>    self._load_lock.acquire()
>
>
> When I connect to the ZEO server monitor I can see an increasing number of
> reads (probably from the other ZEO Clients).
>
> I've set transaction-timeout 15.
>
> How do I dig further to resolve this?
>
> Part of zeo.conf is below:
>
> --
> <zeo>
>  address 8200
>  read-only false
>  invalidation-queue-size 100
>  # pid-filename $INSTANCE/var/ZEO.pid
>  monitor-address 8201
>  transaction-timeout 15
> </zeo>
>
> <filestorage 1>
>  path $INSTANCE/var/Data.fs
> </filestorage>
>
> %import tempstorage
> <temporarystorage temp>
>  name temporary storage for sessioning
> </temporarystorage>
> --
>
> Anton
>
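The pattern in the quoted tracebacks (one thread sitting in select while it
holds the load lock, the others queued up on that lock) can be reproduced in
miniature, which may help when reading the thread dumps. The sketch below is
not ZEO code: every name in it is made up, a socketpair stands in for the
server connection, and it assumes a current Python 3, where
sys._current_frames is available (it is not in the Python 2.4 shown in the
traceback).

```python
import select
import socket
import sys
import threading
import time
import traceback


def hanging_loader(load_lock, reply_socket):
    # Holds the lock while waiting for a reply that has not arrived yet.
    with load_lock:
        select.select([reply_socket], [], [], 5.0)


def waiting_loader(load_lock):
    # Blocks on the lock until hanging_loader releases it.
    with load_lock:
        pass


def dump_thread_stacks():
    """Return {thread name: formatted stack} for every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    return {
        names.get(ident, str(ident)): "".join(traceback.format_stack(frame))
        for ident, frame in sys._current_frames().items()
    }


def run_demo():
    load_lock = threading.Lock()           # stand-in for the storage load lock
    reader, writer = socket.socketpair()   # stand-in for the server connection
    hanger = threading.Thread(
        target=hanging_loader, args=(load_lock, reader), name="hanger")
    waiter = threading.Thread(
        target=waiting_loader, args=(load_lock,), name="waiter")

    hanger.start()
    while not load_lock.locked():          # wait until hanger owns the lock
        time.sleep(0.01)
    waiter.start()
    time.sleep(0.2)                        # let waiter reach the lock

    stacks = dump_thread_stacks()          # snapshot while both are stuck

    writer.send(b"x")                      # the "reply" arrives, select returns
    hanger.join()
    waiter.join()
    reader.close()
    writer.close()
    return stacks


if __name__ == "__main__":
    for name, stack in run_demo().items():
        print("--- %s ---" % name)
        print(stack)
```

In the dump, the "hanger" stack ends inside hanging_loader and the "waiter"
stack ends inside waiting_loader, which is the same shape as the two
tracebacks quoted above. The thread worth investigating is the one in select,
since the lock waiters are only a symptom of it.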



-- 
Alan Runyan
Enfold Systems, Inc.
http://www.enfoldsystems.com/
phone: +1.713.942.2377x111
fax: +1.832.201.8856

