[ZODB-Dev] RE: ZEO server hang

Ehle, Brandon BEhle at ea.com
Thu Aug 28 11:21:36 EDT 2003


>From the lack of response, does that mean that nobody knows why the ZEO
server is hanging or am on the wrong mailing list?
 
>From looking into the code, it looks like the server is trying to start
another transaction, but a different transaction is still holding a
lock.  This could possibly be true if self.locked is cleared without
self.timeout.end(self) being called.  The only way that I can see that
this will happen is a race condition between clearing self.locked, but
before self.timeout.end(self). As you can see, the message about the
Transaction Lock being released happens right immediately after the
assertion saying that the TimeoutThread hasn't been unlocked yet.  If
this is the case, I think if we change the code for _clear_transaction()
such that the self.locked = 0 happens after the self.timeout.end(self)
call it should fix this case.
 
Any thoughts?
 
 
-----Original Message-----
From: Ehle, Brandon 
Sent: Thursday, August 21, 2003 3:25 PM
To: 'zodb-dev at zope.org'
Subject: ZEO server hang 
 
I have an application that uses ZEO to synchronize some worker processes
across a small cluster and I've been running a fairly frequent hang
since I scaled up the number of machines in the farm to 22.
 
Basically I have a master process that delegates task and a bunch of
slave processes that get tasks from the master, do some processing and
then report their results back in the database.
 
The hang occurs whether my database is running on RedHat 9.0 or Windows
2000.
 
I'm running Python 2.2 and ZODB3.2b2.  All previous versions of ZODB3.2
and ZODB3.1 exhibited the hang as well, so I'm currently using
ZODB3.2b2.
 
Here's the output I get from the ZEO server.  After the last message
pops up the process is stuck pretty hard and the only way do bring it
down is to use a -KILL signal.
 
Anyone idea how to get around or fix this?
 
 
2003-08-21T15:06:55 INFO(0) ZSS:10608/10.14.241.86:2039 Transaction
blocked waiting for storage. Clients waiting: 1.
------
2003-08-21T15:06:55 INFO(0) ZSS:10608/10.14.243.169:2495 Transaction
blocked waiting for storage. Clients waiting: 2.
------
2003-08-21T15:06:55 INFO(0) ZSS:10608/10.14.243.134:4793 Transaction
blocked waiting for storage. Clients waiting: 3.
------
2003-08-21T15:06:55 INFO(0) ZSS:10608/10.14.243.36:2607 Transaction
blocked waiting for storage. Clients waiting: 4.
------
2003-08-21T15:06:55 INFO(0) ZSS:10608/10.14.241.88:1365 Transaction
blocked waiting for storage. Clients waiting: 5.
------
2003-08-21T15:06:55 INFO(0) ZSS:10608/10.14.243.171:1226 Transaction
blocked waiting for storage. Clients waiting: 6.
------
2003-08-21T15:06:55 INFO(0) ZSS:10608/10.14.243.94:1600 Transaction
blocked waiting for storage. Clients waiting: 7.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.151:1748 Transaction
released storage lock
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.151:1748 Blocked
transaction restarted.  Clients waiting: 6
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.95:4529 Transaction
blocked waiting for storage. Clients waiting: 7.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.220:3768 Transaction
blocked waiting for storage. Clients waiting: 8.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.116:4149 Transaction
blocked waiting for storage. Clients waiting: 9.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.86:2039 Transaction
released storage lock
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.86:2039 Blocked
transaction restarted.  Clients waiting: 8
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.82:2016 Transaction
blocked waiting for storage. Clients waiting: 9.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.89:2884 Transaction
blocked waiting for storage. Clients waiting: 10.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.86:2039 Transaction
blocked waiting for storage. Clients waiting: 11.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.169:2495 Transaction
released storage lock
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.169:2495 Blocked
transaction restarted.  Clients waiting: 10
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.169:2495 Transaction
blocked waiting for storage. Clients waiting: 11.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.151:1748 Transaction
blocked waiting for storage. Clients waiting: 12.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.134:4793 Transaction
released storage lock
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.134:4793 Blocked
transaction restarted.  Clients waiting: 11
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.134:4793 Transaction
blocked waiting for storage. Clients waiting: 12.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.36:2607 Transaction
released storage lock
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.36:2607 Blocked
transaction restarted.  Clients waiting: 11
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.36:2607 Transaction
blocked waiting for storage. Clients waiting: 12.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.88:1365 Transaction
released storage lock
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.88:1365 Blocked
transaction restarted.  Clients waiting: 11
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.241.88:1365 Transaction
blocked waiting for storage. Clients waiting: 12.
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.171:1226 Transaction
released storage lock
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.171:1226 Blocked
transaction restarted.  Clients waiting: 11
------
2003-08-21T15:06:56 INFO(0) ZSS:10608/10.14.243.171:1226 Transaction
blocked waiting for storage. Clients waiting: 12.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.94:1600 Transaction
released storage lock
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.94:1600 Blocked
transaction restarted.  Clients waiting: 11
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.94:1600 Transaction
blocked waiting for storage. Clients waiting: 12.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.95:4529 Transaction
released storage lock
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.95:4529 Blocked
transaction restarted.  Clients waiting: 11
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.220:3768 Transaction
released storage lock
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.220:3768 Blocked
transaction restarted.  Clients waiting: 10
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.95:4529 Transaction
blocked waiting for storage. Clients waiting: 11.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.220:3768 Transaction
blocked waiting for storage. Clients waiting: 12.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.157:3249 Transaction
blocked waiting for storage. Clients waiting: 13.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.82:3336 Transaction
blocked waiting for storage. Clients waiting: 14.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.116:4149 Transaction
released storage lock
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.116:4149 Blocked
transaction restarted.  Clients waiting: 13
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.116:4149 Transaction
blocked waiting for storage. Clients waiting: 14.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.82:2016 Transaction
released storage lock
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.82:2016 Blocked
transaction restarted.  Clients waiting: 13
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.82:2016 Transaction
blocked waiting for storage. Clients waiting: 14.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.243.133:2631 Transaction
blocked waiting for storage. Clients waiting: 15.
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.89:2884 Transaction
released storage lock
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.89:2884 Blocked
transaction restarted.  Clients waiting: 14
------
2003-08-21T15:06:57 INFO(0) ZSS:10608/10.14.241.89:2884 Transaction
blocked waiting for storage. Clients waiting: 15.
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.241.86:2039 Transaction
released storage lock
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.241.86:2039 Blocked
transaction restarted.  Clients waiting: 14
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.241.86:2039 Transaction
blocked waiting for storage. Clients waiting: 15.
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.169:2495 Transaction
released storage lock
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.169:2495 Blocked
transaction restarted.  Clients waiting: 14
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.151:1748 Transaction
released storage lock
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.151:1748 Blocked
transaction restarted.  Clients waiting: 13
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.169:2495 Transaction
blocked waiting for storage. Clients waiting: 14.
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.151:1748 Transaction
blocked waiting for storage. Clients waiting: 15.
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.134:4793 Transaction
released storage lock
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.134:4793 Blocked
transaction restarted.  Clients waiting: 14
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.36:2607 Transaction
released storage lock
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.36:2607 Blocked
transaction restarted.  Clients waiting: 13
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.241.88:1365 Transaction
released storage lock
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.241.88:1365 Blocked
transaction restarted.  Clients waiting: 12
------
2003-08-21T15:06:58 PROBLEM(100) ZSS:10608/10.14.243.81:1606 Unexpected
error handling waiting transaction
Traceback (most recent call last):
  File
"/work/ZODB3-3.2b2/build/lib.linux-i686-2.2/ZEO/StorageServer.py", line
585, in _restart_other
    zeo_storage._restart(delay)
  File
"/work/ZODB3-3.2b2/build/lib.linux-i686-2.2/ZEO/StorageServer.py", line
557, in _restart
    self._tpc_begin(self.transaction, self.tid, self.status)
  File
"/work/ZODB3-3.2b2/build/lib.linux-i686-2.2/ZEO/StorageServer.py", line
469, in _tpc_begin
    self.timeout.begin(self)
  File
"/work/ZODB3-3.2b2/build/lib.linux-i686-2.2/ZEO/StorageServer.py", line
901, in begin
    assert self._client is None
AssertionError
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.94:1600 disconnected
during transaction Transaction user=''
------
2003-08-21T15:06:58 INFO(0) ZSS:10608/10.14.243.94:1600 Transaction
released storage lock
------
2003-08-21T15:07:28 INFO(0) ZSS:10608/10.14.241.95:4529 Transaction
timeout after 30.0 seconds
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.zope.org/pipermail/zodb-dev/attachments/20030828/03956965/attachment-0001.htm


More information about the ZODB-Dev mailing list