[ZODB-Dev] Race condition in basestorage commit locks?

Christian Theune ct at gocept.com
Wed Oct 5 13:44:16 EDT 2005


Hi Tim,

On Wednesday, 05.10.2005, at 13:26 -0400, Tim Peters wrote:
> I wouldn't call this "a race", because nothing here appears to be
> timing-dependent.  This is the code:

Right. I mixed up terminology. Sorry.

> raises an exception.  The code does seem to implicitly assume that neither
> of those _will_ raise an exception, and I agree the commit lock release
> belongs in the `finally` clause instead.
> 
> I'll change that, but doubt it will make a difference to you:  if either of
> those did raise an exception, I expect you would have seen a traceback.

Hmm. I don't know. In that case the lock would at least have been
released and the system would have continued. Instead, it's just plain
stuck at that moment.
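
To make the `finally` point concrete, here is a minimal sketch of the two
variants. It is not the real BaseStorage code: `SketchStorage`, the
`fail_in_abort` flag, and both `tpc_abort_*` method names are made up for
illustration, and plain `threading.Lock` objects stand in for the storage's
locks.

```python
import threading


class SketchStorage:
    """Toy stand-in for a storage; NOT the real BaseStorage (hypothetical)."""

    def __init__(self, fail_in_abort=False):
        self._lock = threading.Lock()         # guards internal state
        self._commit_lock = threading.Lock()  # held for the whole commit
        self._transaction = None
        self._fail_in_abort = fail_in_abort   # hypothetical failure switch

    def tpc_begin(self, transaction):
        self._commit_lock.acquire()
        with self._lock:
            self._transaction = transaction

    def _abort(self):
        # A subclass hook that may fail, e.g. when a backend connection dies.
        if self._fail_in_abort:
            raise RuntimeError("backend failed during abort")

    def tpc_abort_fragile(self, transaction):
        # The commit lock is released only if _abort() returns normally.
        with self._lock:
            if transaction is not self._transaction:
                return
            self._abort()
            self._transaction = None
            self._commit_lock.release()

    def tpc_abort_safe(self, transaction):
        # The release sits in `finally`, so it runs even if _abort() raises.
        with self._lock:
            if transaction is not self._transaction:
                return
            try:
                self._abort()
            finally:
                self._transaction = None
                self._commit_lock.release()


def commit_lock_stuck(abort_method_name):
    """Begin, abort with a failing _abort(), report if the lock is still held."""
    storage = SketchStorage(fail_in_abort=True)
    txn = object()
    storage.tpc_begin(txn)
    try:
        getattr(storage, abort_method_name)(txn)
    except RuntimeError:
        pass  # the exception still propagates in both variants
    return storage._commit_lock.locked()
```

Calling `commit_lock_stuck("tpc_abort_fragile")` leaves the commit lock
held, while `commit_lock_stuck("tpc_abort_safe")` does not. In both cases
the exception still reaches the caller, so a traceback would be expected
either way.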

> This code doesn't _suppress_ exceptions, right?  I'd also look at the source
> code for whatever storages you're using, to see whether/how they override
> _abort() and/or _clear_temp().  If any of those _can_ raise exceptions, then
> (a) they probably ought to be changed so that they cannot; but, (b) that
> indeed could provoke BaseStorage.tpc_abort() into leaving the commit lock
> acquired.

The storage is Ape, and its _abort() simply loops over all the
"connections" that it maintains on its own (which I suppose go to
Postgres in our case; I don't know whether Ape has anything additional
there). Ape does not check for any errors in its _abort(), though.

> > Additionally, we are using Ape running on a postgres backend, so this
> > might trigger some unusual side effects, maybe this possible race
> > condition.
> 
> It's also possible for storages (deriving from BaseStorage) to do their own
> unique things with the locks BaseStorage created.  That is, _just_ staring
> at BaseStorage doesn't tell us what the other stuff you're using may be
> doing with these locks.  For example, FileStorage mucks with the commit lock
> too, to give packing a way to block commits during the final copy.

Ape doesn't do anything to _commit_lock as far as I can tell.

> > (If someone has another suspicion where this hang might come from, I'm
> > all ears)
> 
> As above, FileStorage can block commits "for a long time" if a pack is going
> on.  That's not really a hang, but can _appear_ to be a hang until packing
> completes.  There's no evidence in your msg that packing was going on,
> though.

Yup. Definitely no pack around. We're using a non-undoable storage for
the write-part of the ZODB anyway.

> Another possibility is that some path thru the code calls tpc_begin on a
> storage but never calls tpc_finish or tpc_abort on that storage.  Then the
> storage's commit lock will remain acquired forever.  For example, did the
> following msg show up in your logs?
> 
>             LOG('ZODB', PANIC,
>                 "A storage error occurred in the last phase of a "
>                 "two-phase commit.  This shouldn\'t happen. "
>                 "The application will not be allowed to commit "
>                 "until the site/storage is reset by a restart. ",
>                 error=sys.exc_info())
> 
> Or this one?
> 
>                 LOG('ZODB', ERROR,
>                     "A storage error occured during object abort. This "
>                     "shouldn't happen. ", error=sys.exc_info())

Actually, I found those in the logs, but they were from two months ago.
Both of them were PANIC messages.
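
The scenario Tim describes, where tpc_begin is called but neither
tpc_finish nor tpc_abort ever follows, can be reproduced in miniature. This
is only a sketch: `try_begin` is a hypothetical helper, a bare
`threading.Lock` plays the commit lock, and the timeout exists only so the
demonstration returns instead of hanging the way the real system does.

```python
import threading


def try_begin(commit_lock, timeout=0.1):
    """Return True if the commit lock was obtained within `timeout` seconds."""
    return commit_lock.acquire(timeout=timeout)


commit_lock = threading.Lock()

first = try_begin(commit_lock)   # plays tpc_begin for transaction 1
second = try_begin(commit_lock)  # transaction 2: no finish/abort happened yet
commit_lock.release()            # plays the missing tpc_finish/tpc_abort
third = try_begin(commit_lock)   # transaction 3 can proceed again
```

Without the timeout, the second acquire would block forever, which matches
the "plainly stuck" symptom.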

I'll try to dig through Ape tomorrow and will try moving the release
into the `finally` block. Unfortunately I have to rely on a very patient
customer, as we can only provoke this on a live system.

I suspect some of the problems stem from the number of external systems
involved in a transaction here: at least ZODB, PostgreSQL and MySQL
within a single transaction. And Ape.

Cheers,
Christian

-- 
gocept gmbh & co. kg - schalaunische str. 6 - 06366 koethen - germany
www.gocept.com - ct at gocept.com - phone +49 3496 30 99 112 -
fax +49 3496 30 99 118 - zope and plone consulting and development