[ZODB-Dev] RelStorage: ConflictError causes deadlock without RELSTORAGE_ABORT_EARLY

Sean Upton sdupton at gmail.com
Sun Jul 20 01:22:11 CEST 2014


Folks,

I have been dealing with locking issues in RelStorage for the past
few days, and want to verify what I believe is a bug: without
RELSTORAGE_ABORT_EARLY set in the environment, a failure in tpc_vote()
can leave in place an RDBMS table lock that the ILocker adapter took
earlier (in either tpc_begin() or _prepare_tid()), and that lock never
gets released.

The PostgreSQLLocker.hold_commit_lock() will get called successfully
on one thread/transaction (that likely suffers a ConflictError
sometime after tpc_begin()), but all other threads committing will get
stuck on statement execution [1], up to the point where (in my case)
all Zope2 threads on all instances are stuck there.

While I have implemented commit-lock-timeout for PostgreSQL 9.3+ in a
fork of RelStorage [2], this does not alleviate the issue of a
ConflictError in tpc_vote() neglecting to remove the RDBMS lock
(unless tpc_abort() is called on exception by tpc_vote()).
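
For readers unfamiliar with the idea, a commit-lock timeout can be
sketched roughly as below. This is not the code from the fork in [2],
only a minimal illustration. It assumes a DB-API cursor on PostgreSQL
9.3+ (which introduced the lock_timeout setting) and a lock table
named commit_lock; the function name and the default timeout are
hypothetical.

```python
def hold_commit_lock_with_timeout(cursor, timeout_ms=30000):
    """Take the commit lock, giving up after timeout_ms milliseconds.

    Assumptions for this sketch: `cursor` is a DB-API cursor connected
    to PostgreSQL 9.3 or later, and the lock table is named commit_lock.
    """
    # With lock_timeout set, a blocked LOCK statement fails with an
    # error after the timeout instead of waiting indefinitely.
    cursor.execute("SET lock_timeout = %d" % int(timeout_ms))
    cursor.execute("LOCK TABLE commit_lock IN EXCLUSIVE MODE")
```

A timeout like this only bounds how long the other threads wait; it
does nothing to release a lock that the failed transaction still holds.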

In other words, I think (unless I misunderstand) there is a tpc_vote()
scenario in which the ILocker adapter's release_commit_lock() method
is never called [3].
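
The scenario can be modelled with a toy storage, sketched below for
Python 3. This is not RelStorage's code: ToyStorage and lock_leaks are
hypothetical names, a threading.Lock stands in for the RDBMS table
lock, and the abort_early flag only mimics what RELSTORAGE_ABORT_EARLY
is described as doing above (calling tpc_abort() when tpc_vote()
raises).

```python
import threading


class ConflictError(Exception):
    """Stand-in for ZODB's ConflictError (assumption for this sketch)."""


class ToyStorage(object):
    """Toy model of the commit-lock life cycle; not RelStorage itself."""

    def __init__(self, abort_early=False):
        # Stands in for the RDBMS table lock held via the ILocker adapter.
        self.commit_lock = threading.Lock()
        self.abort_early = abort_early

    def tpc_begin(self):
        # Non-blocking acquire so the demo reports a stuck lock
        # instead of hanging on it.
        if not self.commit_lock.acquire(False):
            raise RuntimeError("commit lock is still held")

    def tpc_vote(self, conflict=False):
        try:
            if conflict:
                raise ConflictError("simulated conflict")
        except ConflictError:
            if self.abort_early:
                # Release the lock before re-raising.
                self.tpc_abort()
            raise

    def tpc_abort(self):
        if self.commit_lock.locked():
            self.commit_lock.release()


def lock_leaks(abort_early):
    """Return True if a failed vote leaves the commit lock held."""
    storage = ToyStorage(abort_early=abort_early)
    storage.tpc_begin()
    try:
        storage.tpc_vote(conflict=True)
    except ConflictError:
        pass  # the caller never calls tpc_abort() in this scenario
    return storage.commit_lock.locked()
```

In this model, lock_leaks(False) reports the lock as still held after
the failed vote, while lock_leaks(True) reports it released.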

Right now I am working around this by setting RELSTORAGE_ABORT_EARLY
-- but I was under the impression that this is only intended for
testing purposes.

Thoughts?

Sean

[1] Call stack via SIGUSR1 to an instance: http://pastie.org/9400704

[2] https://github.com/upiq/relstorage/commit/6ac7bf31ce3491ff87f5c138c892c0c0906c12ac

[3] https://github.com/zodb/relstorage/blob/master/relstorage/storage.py#L837

