[ZODB-Dev] deadlock prevention for ZODB3 / Zope 2.6

Shane Hathaway shane@ZOPE.COM
Tue, 19 Nov 2002 13:10:13 -0500


Jeremy Hylton wrote:
>>>>>>"JH" == Jeremy Hylton <jeremy@zope.com> writes:
>>>>>
>   JH> can be used by more than one process at a time.  The general
>   JH> rule is that sortKey() must be the same in every process using a
>   JH> storage.  If a storage can't be shared between multiple
>   JH> storages, then it is trivial to satisfy the rule -- __name__ or
>   JH> id() work just fine.
> 
>   JH> (I think this is a clearer explanation that my original
>   JH> message.)
> 
> It might have been clearer if I had proofread it.  How about one more
> try:
> 
>   The general rule is that sortKey() must be the same in every process
>   using a storage.  If a storage can't be shared between multiple
>   processes, then it is trivial to satisfy the rule -- __name__ or id()
>   work just fine.
> 
> Also, it is possible that the locking scheme used by some storages
> that are shared among multiple processes may not require a global
> sortKey.  I'd have to think more about that before I'm certain.

It might be beneficial to explain the deadlock a little better.  Imagine 
client A wants to commit changes to both storage 1 and 2 in a 
transaction and acquires locks in that order.  Client B also wants to 
commit changes to the same storages, but it mistakenly acquires the lock 
for storage 2 before it acquires the lock for storage 1.

This will happily work until both clients try to commit something at the 
same time.  Client A acquires the lock on storage 1, then client B 
acquires the lock on storage 2.  Next, client A wants the lock on 
storage 2.  It can't acquire it because client B holds it, so it waits 
for client B to finish.  Client B wants the lock for storage 1, but it 
can't have it because client A holds it, so it waits for client A to 
finish.  Ever been at a 4-way stop where everyone is afraid to move? 
Locks are really stupid in situations like this. :-)
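
To make the scenario concrete, here is a rough sketch using plain Python
threads.  It is not ZODB code: the two Lock objects merely stand in for
the storage locks, and commit() and simulate_opposite_order() are names
made up for this illustration.  The timeout is only there so the demo
reports the stall instead of hanging.

```python
import threading

def commit(name, first, second, barrier, results):
    """Hypothetical client: take `first`, then try to take `second`."""
    with first:
        # Wait until both clients hold their first lock.
        barrier.wait()
        # A real client would block here forever; the timeout lets us observe it.
        got_second = second.acquire(timeout=0.2)
        results[name] = got_second
        # Keep holding `first` until both attempts have finished.
        barrier.wait()
        if got_second:
            second.release()

def simulate_opposite_order():
    lock1, lock2 = threading.Lock(), threading.Lock()  # stand-ins for storages 1 and 2
    barrier = threading.Barrier(2)
    results = {}
    # Client A locks 1 then 2; client B mistakenly locks 2 then 1.
    a = threading.Thread(target=commit, args=("A", lock1, lock2, barrier, results))
    b = threading.Thread(target=commit, args=("B", lock2, lock1, barrier, results))
    a.start()
    b.start()
    a.join()
    b.join()
    return results
```

Calling simulate_opposite_order() reports False for both clients:
neither one could get its second lock.  Take away the timeout and both
would wait forever.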

Now, with the sortKey() method, commit is always done in the same order.
It doesn't matter what the order is, only that each client uses the
same order.  (Abusing the traffic metaphor, we could say that at 4-way 
stops, the choice of who goes first is made by alphabetically sorting 
the license plate numbers.  I wouldn't mind--since mine starts with "A", 
I'd usually get to go first. ;-) )
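
The ordering idea can be sketched in a few lines.  This is only an
illustration under stated assumptions, not the actual ZODB transaction
code: ToyStorage, commit() and run_clients() are hypothetical, and the
only thing borrowed from the discussion is a sortKey() method that
returns the same value in every client.

```python
import threading

class ToyStorage:
    """Hypothetical stand-in for a storage: a name plus a commit lock."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()

    def sortKey(self):
        # Must be the same in every process using the storage.
        return self.name

def commit(storages, log):
    # Sort first, so every client acquires the locks in the same order.
    ordered = sorted(storages, key=lambda s: s.sortKey())
    held = []
    try:
        for storage in ordered:
            storage.lock.acquire()
            held.append(storage)
            log.append(storage.name)
    finally:
        # Release in reverse order of acquisition.
        for storage in reversed(held):
            storage.lock.release()

def run_clients():
    s1, s2 = ToyStorage("storage1"), ToyStorage("storage2")
    log_a, log_b = [], []
    # Client B lists the storages "backwards", but sorting fixes that.
    a = threading.Thread(target=commit, args=([s1, s2], log_a))
    b = threading.Thread(target=commit, args=([s2, s1], log_b))
    a.start()
    b.start()
    a.join()
    b.join()
    return log_a, log_b
```

Both logs returned by run_clients() list storage1 before storage2, even
though client B was handed the storages in the opposite order.  Any
consistent key would do; the name is just the simplest choice for a toy.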

Until this change, Zope acquired storage locks in a basically random
order.  In a ZEO cluster, this locks up not only the two clients that
try to commit simultaneously, but also any other client that attempts
to write to either storage server!

Now, if the clients write to something to which they have independent, 
exclusive access, there's no issue.  And there's no problem if they 
never write to more than one storage at a time.  But since this problem 
has never been recognized in ZODB before, it was decided that we need to 
call attention to it.  That's why there's a new log message.

Shane