[ZODB-Dev] Tracking down a freeze (deadlock?)

Fri Feb 25 10:01:24 EST 2005

[Florent Guillaume]
>>   File "/opt/zope/lib/python/ZODB/Connection.py", line 257, in _setDB
>>     self._flush_invalidations()
>>   File "/opt/zope/lib/python/ZODB/Connection.py", line 552, in _flush_invalidations
>>     self._cache.invalidate(self._invalidated)
>>   File "/appli/zeo/zeocli-192.168.106.6-8080/Products/DICOD/DICODMailingList.py", line 125, in __del__
>>   File "/opt/zope/lib/python/ZODB/Connection.py", line 599, in setstate
>>     invalid = self._is_invalidated(obj)
>>   File "/opt/zope/lib/python/ZODB/Connection.py", line 617, in _is_invalidated
>>     self._inv_lock.acquire()
>>
>> Hm, I think I can answer that one. A persistent object is not supposed to
>> have a __del__ that accesses the ZODB, right? Otherwise, well, we see
>> what happens.

[Dieter Maurer]
> On the other hand, it should not cause a deadlock.

Why is that?  As far as I'm concerned, ZODB doesn't support persistent
objects with __del__ methods -- it was never intended to.  "Don't do that"
rules then.  In the case of the infinite loop in the memory cache I
mentioned before, there was insignificant runtime expense to avoid the
infinite loop due to the __del__ method (although the collective brainpower
expended on finding and testing that fix was obscene relative to the actual
benefit).
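
For readers following along, the shape of the quoted traceback can be reproduced outside ZODB. The sketch below is not ZODB code: `Cache`, `Noisy`, `flush` and `is_invalidated` are hypothetical stand-ins, and it relies on CPython running `__del__` as soon as the last reference goes away. The re-entrant call uses a non-blocking acquire so that the script reports the problem instead of hanging the way the real code does.

```python
import threading


class Cache:
    """Hypothetical stand-in for a connection plus its object cache."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant, like _inv_lock
        self._objects = {}
        self.reentry_blocked = False

    def is_invalidated(self):
        # A blocking acquire here would never return when called from
        # flush(); the non-blocking form lets the sketch record that.
        if not self._lock.acquire(blocking=False):
            self.reentry_blocked = True
            return None
        try:
            return False
        finally:
            self._lock.release()

    def flush(self):
        with self._lock:
            # Dropping the last reference runs __del__ right here,
            # in this thread, while the lock is still held (CPython).
            self._objects.clear()


class Noisy:
    """Hypothetical object whose __del__ calls back into the cache."""

    def __init__(self, cache):
        self._cache = cache

    def __del__(self):
        self._cache.is_invalidated()


cache = Cache()
cache._objects["a"] = Noisy(cache)
cache.flush()
print(cache.reentry_blocked)
```

With a blocking `acquire()` in `is_invalidated`, the same script would simply stop inside `flush()`, which matches the freeze reported in this thread as far as the traceback shows.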

> It would not if "_inv_lock" were a reentrant lock. I think it could be
> (as "acquire" and "release" are not called in different threads).

I agree it could be, but:

1. A reentrant lock is significantly more expensive to use, both
   to acquire and to release.

2. Avoiding the deadlock is only the tip of the iceberg.  Locks are
   meant to ensure invariants, and the latter are rarely documented
   in ZODB.  To ensure that invariants aren't violated when switching
   to a reentrant lock requires trying to figure out what all the
   intended invariants actually are, then checking every possibly
   relevant code path to ensure that those invariants remain satisfied
   in reentrant cases.
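
Point 2 can be illustrated with a small sketch. It is not taken from ZODB: `Registry`, its `count == len(items)` invariant, and the `callback` hook (standing in for a `__del__` that fires at an awkward moment) are all assumptions made up for the example. The point is only that an RLock removes the deadlock by letting same-thread code in while the protected state is half updated.

```python
import threading


class Registry:
    """Hypothetical structure with one invariant: count == len(items)."""

    def __init__(self, lock):
        self._lock = lock
        self.items = []
        self.count = 0
        self.observed = []  # (count, len(items)) pairs seen by check()

    def check(self):
        with self._lock:
            self.observed.append((self.count, len(self.items)))

    def add(self, item, callback=None):
        with self._lock:
            self.items.append(item)
            # The invariant is temporarily broken here. With a plain Lock
            # nobody else can look; with an RLock, same-thread code can.
            if callback is not None:
                callback()
            self.count += 1


reg = Registry(threading.RLock())
reg.add("x", callback=reg.check)
print(reg.observed)  # [(0, 1)]: check() ran while count lagged behind items
```

Swapping in `threading.Lock()` makes the same call hang instead, so the choice is between a visible freeze and code that silently runs against inconsistent state unless every such path has been audited.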

So, ya, it's a matter of seconds to change from a Lock to an RLock here, and
a matter of perhaps 20 minutes to write and debug a good new test case(s),
but every ZODB user would pay runtime expense for that forever after, and it
may or may not introduce subtle new bugs (depending on the analysis in #2,
and also on whether that analysis is wholly correct).  If the benefit is to
stop a deadlock in application code Florent is inclined to believe is
incorrect anyway, it sounds like a losing set of tradeoffs to me.
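
The runtime cost in point 1 is easy to measure locally. The following is a rough sketch, not a benchmark from this thread: `time_lock` is a hypothetical helper, and the numbers depend heavily on the interpreter. In the Python versions current in 2005, `threading.RLock` was written in Python on top of a plain lock; later CPython releases implement it in C, so the gap on a modern interpreter may be much smaller.

```python
import threading
import timeit


def time_lock(lock, number=200_000):
    """Time `number` uncontended acquire/release pairs on `lock`."""

    def cycle():
        lock.acquire()
        lock.release()

    return timeit.timeit(cycle, number=number)


plain = time_lock(threading.Lock())
reentrant = time_lock(threading.RLock())
print(f"Lock:  {plain:.4f}s")
print(f"RLock: {reentrant:.4f}s")
```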