[ZODB-Dev] session problems

Florent Guillaume fg at nuxeo.com
Fri Dec 23 20:21:41 EST 2005


I've been debugging session problems for two days, and I feel it's time
to write down what I've observed and ask for other eyes to look at it
(Chris McDonough has been working on this too). This is all on the Zope
2.9 trunk, BTW (ZODB 3.6.0b5 and Zope 2.9's tempstorage), with Python
2.4.2.

What I observed was an unnatural number of repeated ConflictErrors (by
that, I mean "write" conflicts), followed by more and more
ReadConflictErrors as soon as you go beyond TemporaryStorage's
CONFLICT_CACHE_MAXAGE time.

To simplify debugging, I've boosted that constant and I only debug  
the write conflict errors.

The first write conflict happens when a BTree can't resolve a  
conflict. The transaction is then aborted.

At this point, what should happen (and what does happen correctly for
FileStorage) is that the connection's _flush_invalidations gets called
and resets the connection's _txn_time to None, so that the modified
oids (including the BTree's), when invalidated, set _txn_time to
their serial. Then, on the next conflict, _setstate_noncurrent
calls loadBefore with that serial.

But apparently the connection's _flush_invalidations() is never
called, so _txn_time is never bumped into the future (which in turn
means the next write conflict will try to load exactly the same
serials as before and fail again, etc.).
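
To make the expected behaviour concrete, here is a toy model of the
_txn_time handling described above. It is only a sketch: ToyConnection
is a hypothetical stand-in, not ZODB's Connection class, and it assumes
that the first invalidation seen while _txn_time is None is the one
that pins it.

```python
class ToyConnection:
    """Hypothetical stand-in for a connection; NOT ZODB's actual code."""

    def __init__(self):
        self._txn_time = None
        self._invalidated = set()

    def invalidate(self, serial, oids):
        # Assumption: only the first invalidation after a reset pins _txn_time.
        if self._txn_time is None:
            self._txn_time = serial
        self._invalidated.update(oids)

    def _flush_invalidations(self):
        # Resetting to None lets the next invalidation move _txn_time forward.
        self._invalidated.clear()
        self._txn_time = None


# Case 1: the flush is never called, so _txn_time stays at the old serial.
stuck = ToyConnection()
stuck.invalidate(10, ["btree"])
stuck.invalidate(20, ["btree"])      # later commit; serial 10 is still pinned
stuck_time = stuck._txn_time

# Case 2: the flush runs after the abort, so the next serial is picked up.
healthy = ToyConnection()
healthy.invalidate(10, ["btree"])
healthy._flush_invalidations()
healthy.invalidate(20, ["btree"])
healthy_time = healthy._txn_time
```

In the first case every retry would ask for the state before the same
old serial, which matches the repeated write conflicts; in the second
case the serial moves forward.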

This seems to happen because:

1. the connection has _synch set to True: it has registered itself as a
synchronizer, and expects its afterCompletion to be called when
(among others) the transaction is aborted, and afterCompletion
calls _flush_invalidations,

2. the synchronizer (the connection itself) has been lost from the
transaction's _synchronizers WeakSet for some reason (garbage collected,
I guess). It was there in earlier transactions, but it's not there at
the time it's needed.
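
The garbage collection guess in point 2 can be illustrated with a small
sketch. It uses the standard library's weakref.WeakSet as a stand-in
for the WeakSet implementation used by ZODB and the transaction code,
and ToySynchronizer and notify_all are hypothetical names. It shows
only the general mechanism (an object held solely by a weak set stops
receiving callbacks once collected), not that this is what actually
happens in Zope.

```python
import gc
import weakref


class ToySynchronizer:
    """Hypothetical synchronizer; records whether it was notified."""

    def __init__(self, name):
        self.name = name

    def afterCompletion(self, txn):
        pass  # a real connection would flush its invalidations here


def notify_all(synchronizers, txn):
    # Call afterCompletion on every synchronizer still in the weak set.
    notified = []
    for synch in list(synchronizers):
        synch.afterCompletion(txn)
        notified.append(synch.name)
    return sorted(notified)


synchronizers = weakref.WeakSet()
kept = ToySynchronizer("kept")
lost = ToySynchronizer("lost")
synchronizers.add(kept)
synchronizers.add(lost)

before = notify_all(synchronizers, txn=None)

# Drop the only strong reference; the weak set does not keep it alive.
del lost
gc.collect()

after = notify_all(synchronizers, txn=None)
```

Here "lost" silently disappears from the set, so its afterCompletion is
never called again, and nothing reports the omission.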

If someone can make sense of this...

Actually, I don't know why the connection (=synchronizer) could be
gone from the transaction's _synchronizers WeakSet but still be in the
DB's connection pool WeakSet. I guess that's where the problem lies.

Also, I don't know why we don't observe this for FileStorage. Maybe
something has a hard reference on it somewhere?

Florent

-- 
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   fg at nuxeo.com




