[ZODB-Dev] ZEO Client deadlocking in asyncore.poll - how do I debug

Roché Compaan roche at upfrontsystems.co.za
Mon Apr 7 10:52:36 EDT 2008


Check that your ZEO client cache is large enough. If your code runs
queries that return more objects than the cache can hold, the client
ends up constantly loading objects from the storage server. If you
switch on debug logging on the ZEO server, you should see which objects
are being loaded.
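
To illustrate the effect, here is a toy simulation. It is only a sketch:
TinyLRUCache and scan_repeatedly are made-up names, the object counts and
sizes are invented, and a plain byte-bounded LRU is assumed as a stand-in
(the real ZEO client cache uses its own replacement policy). It shows how a
working set slightly larger than the cache can turn every access into a
load from the storage server.

```python
from collections import OrderedDict


class TinyLRUCache:
    """Toy byte-bounded LRU cache (hypothetical; not ZEO's actual cache)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # oid -> size in bytes, oldest first
        self.hits = 0
        self.misses = 0

    def load(self, oid, size):
        if oid in self.entries:
            # Cache hit: mark the object as most recently used.
            self.entries.move_to_end(oid)
            self.hits += 1
            return
        # Cache miss: in a real client this would be a server round trip.
        self.misses += 1
        self.entries[oid] = size
        self.used_bytes += size
        # Evict least recently used objects until we fit again.
        while self.used_bytes > self.capacity_bytes:
            _, evicted_size = self.entries.popitem(last=False)
            self.used_bytes -= evicted_size


def scan_repeatedly(cache, n_objects, object_size, passes):
    """Touch the same n_objects in order, several times over."""
    for _ in range(passes):
        for oid in range(n_objects):
            cache.load(oid, object_size)
    return cache.hits, cache.misses


# Working set of 90 KB fits in a 100 KB cache: only the first pass misses.
fits = scan_repeatedly(TinyLRUCache(100000), n_objects=90,
                       object_size=1000, passes=5)

# Working set of 110 KB does not fit: every single access is a miss.
thrash = scan_repeatedly(TinyLRUCache(100000), n_objects=110,
                         object_size=1000, passes=5)

print("fits:   hits=%d misses=%d" % fits)
print("thrash: hits=%d misses=%d" % thrash)
```

In this toy model the second case never gets a cache hit, even though the
working set is only 10% larger than the cache. If the server log shows the
same objects being loaded over and over, that is a hint that something
similar may be happening.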

-- 
Roché Compaan
Upfront Systems                   http://www.upfrontsystems.co.za

On Mon, 2008-04-07 at 16:16 +0200, Anton Stonor wrote:
> We have a setup with a ZEO server and 4 ZEO clients.
> 
> Over the last few weeks we have seen almost daily deadlocks in some of
> the ZEO clients. I've tried waiting up to 30 minutes before restarting
> a client.
> 
> I could use some advice on how to debug this.
> 
> With DeadlockDebugger I see the same pattern each time:
> 
> One thread is hanging:
> 
> 
>    File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 732, in setstate
>      self._setstate(obj)
>    File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 768, in _setstate
>      p, serial = self._storage.load(obj._p_oid, self._version)
>    File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 746, in load
>      return self.loadEx(oid, version)[:2]
>    File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 769, in loadEx
>      data, tid, ver = self._server.loadEx(oid, version)
>    File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ServerStub.py", line 192, in loadEx
>      return self.rpc.call("loadEx", oid, version)
>    File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 531, in call
>      r_flags, r_args = self.wait(msgid)
>    File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 638, in wait
>      asyncore.poll(delay, self._singleton)
>    File "/usr/local/lib/python2.4/asyncore.py", line 122, in poll
>      r, w, e = select.select(r, w, e, timeout)
> 
> 
> The other threads of the ZEO client are waiting for the hanging thread 
> to release the storage lock so that they can acquire it:
> 
>    File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 760, in loadEx
>      self._load_lock.acquire()
> 
> 
> When I connect to the ZEO server monitor I can see an increasing number 
> of reads (probably from the other ZEO Clients).
> 
> I've set transaction-timeout 15.
> 
> How do I dig further to resolve this?
> 
> An excerpt of zeo.conf is below:
> 
> --
> <zeo>
>    address 8200
>    read-only false
>    invalidation-queue-size 100
>    # pid-filename $INSTANCE/var/ZEO.pid
>    monitor-address 8201
>    transaction-timeout 15
> </zeo>
> 
> <filestorage 1>
>    path $INSTANCE/var/Data.fs
> </filestorage>
> 
> %import tempstorage
> <temporarystorage temp>
>    name temporary storage for sessioning
> </temporarystorage>
> --
> 
> Anton



