[ZODB-Dev] RelStorage now in Subversion

Dieter Maurer dieter at handshake.de
Fri Feb 1 13:12:22 EST 2008


Hello Shane,

Shane Hathaway wrote at 2008-1-31 13:45 -0700:
> ...
>No, RelStorage doesn't work like that either.  RelStorage opens a second
>database connection when it needs to store data.  The store connection
>will commit at the right time, regardless of the polling strategy.  The
>load connection is already left open between connections; I'm only
>talking about allowing the load connection to keep an idle transaction.
> I see nothing wrong with that, other than being a little surprising.

That looks very troublesome.

Unless you begin a new transaction on your load connection after
the write connection has committed,
your load connection will not see the data written over
your write connection.
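
A minimal sketch of this effect, assuming a toy in-memory database
instead of a real one. "ToyDatabase" and "ToyLoadConnection" are
hypothetical names, not part of RelStorage; the only point is that the
snapshot is fixed when the load transaction begins:

```python
class ToyDatabase:
    """Hypothetical in-memory stand-in for the relational database."""

    def __init__(self):
        self.committed = {}  # key -> last committed value

    def commit(self, changes):
        # Models the store connection committing its transaction.
        self.committed.update(changes)


class ToyLoadConnection:
    """Hypothetical load connection whose view is fixed at begin()."""

    def __init__(self, db):
        self.db = db
        self.begin()

    def begin(self):
        # Starting a new transaction takes a fresh snapshot.
        self.snapshot = dict(self.db.committed)

    def load(self, key):
        # Reads only ever see the snapshot, never later commits.
        return self.snapshot.get(key)


db = ToyDatabase()
db.commit({"o": "old state"})

load_conn = ToyLoadConnection(db)   # the idle load transaction starts here
db.commit({"o": "new state"})       # the store connection commits later

stale = load_conn.load("o")         # still the view from the old snapshot
load_conn.begin()                   # new transaction on the load connection
fresh = load_conn.load("o")         # now the committed write is visible
print(stale, "->", fresh)
```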

>>  and you read older and older data
>> which must increase serializability problems
>
>I'm not sure what you're concerned about here.  If a storage instance
>hasn't polled in a while, it should poll before loading anything.

Even if it has polled not too far in the past, it should
repoll when the storage is joined to Zope request processing
(in "Connection._setDB"):
If it does not, then it may start work with an already outdated
state -- which can have adverse effects when the request bases modifications
on this outdated state.
If everything works fine, then a "ConflictError" results later
during the commit.

This implies that the read connection must start a new transaction
at least after a "ConflictError" has occurred. Otherwise, the
"ConflictError" cannot go away.

> ....
>> (Postgres might
>> not guarantee serializability even when the so-called isolation
>> level is chosen; in this case, you may not see the problems
>> directly but nevertheless they are there).
>
>If that is true then RelStorage on PostgreSQL is already a failed
>proposition.  If PostgreSQL ever breaks consistency by exposing later
>updates to a load connection, even in the serializable isolation mode,
>ZODB will lose consistency.  However, I think that fear is unfounded.
>If PostgreSQL were a less stable database then I would be more concerned.

I do not expect that Postgres will expose later updates to the load
connection.

What I fear is described by the following scenario:

   You start a transaction on your load connection "L".
   "L" will see the world as it has been at the start of this transaction.

   Another transaction "M" modifies object "o".

   "L" reads "o", "o" is modified and committed.
   As "L" has used "o"'s state before "M"'s modification,
   the commit will try to write stale data.
   Hopefully, something lets the commit fail -- otherwise,
   we have lost a modification.

If something causes a commit failure, then the probability of such
failures increases with the outdatedness of "L"'s reads.
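
The scenario can be sketched as follows, assuming a toy storage that
records a serial per object and rejects a store based on an older
serial. "ToyStorage" and "ToyConflictError" are hypothetical names; this
is only an illustration of the check, not RelStorage's actual code:

```python
class ToyConflictError(Exception):
    """Hypothetical stand-in for a commit-time conflict error."""


class ToyStorage:
    def __init__(self):
        self.data = {}    # oid -> (serial, state)
        self.serial = 0

    def snapshot(self):
        # The world as seen by a transaction that starts now.
        return dict(self.data)

    def store(self, oid, read_serial, state):
        # Refuse the write if the object changed since it was read.
        current_serial = self.data.get(oid, (0, None))[0]
        if read_serial != current_serial:
            raise ToyConflictError(oid)
        self.serial += 1
        self.data[oid] = (self.serial, state)


storage = ToyStorage()
storage.store("o", 0, 10)                   # initial state of "o"

l_view = storage.snapshot()                 # "L" starts its transaction

m_serial, m_state = storage.data["o"]
storage.store("o", m_serial, m_state + 1)   # "M" modifies "o" and commits

l_serial, l_state = l_view["o"]             # "L" reads "o" from its snapshot
try:
    storage.store("o", l_serial, l_state + 5)   # write based on stale data
    lost_update = True
except ToyConflictError:
    lost_update = False                     # the serial check saved "M"'s change
```

Without the serial comparison in "store", the last write would silently
replace "M"'s modification.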

> ...
>RelStorage only uses the serializable isolation level for loading, not
>for storing.  A big commit lock prevents database-level conflicts while
>storing.  RelStorage performs ZODB-level conflict resolution, but only
>while the commit lock is held, so I don't yet see any opportunity for
>consistency to be broken.  (Now I imagine you'll complain the commit
>lock prevents scaling, but it uses the same design as ZEO, and that
>seems to scale fine.)

Side note:

  We currently face problems with ZEO's commit lock: we have 24 clients
  that produce about 10 transactions per second. We observe
  occasional commit contention lasting a few minutes.

  We already have found several things that contribute to this problem --
  slow operations on clients while the commit lock is held on ZEO:
  Python garbage collections, invalidation processing, stupid
  application code.
  But there are still some mysteries and we do not yet have
  a good solution.

> ....
I noticed another potential problem:

  When more than a single storage is involved, transactional
  consistency between these storages requires a true two phase
  commit.

  Only recently has Postgres started to support two phase commits ("2PC"), but
  as far as I know, Python access libraries do not yet support the
  extended API (a few days ago, there was a discussion on
  "db-sig at python.org" about a DB-API extension for two phase commit).

  Unless you use your own binding to the Postgres 2PC API, "RelStorage"
  seems safe only for single storage use.
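
  For illustration, a minimal sketch of what a two phase commit across
  several storages has to do, assuming toy participants. "ToyParticipant"
  and "two_phase_commit" are hypothetical names and do not correspond to
  any Postgres or DB-API call:

```python
class ToyParticipant:
    """Hypothetical storage taking part in a two phase commit."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise that a later commit will succeed, or refuse.
        if not self.can_commit:
            raise RuntimeError(f"{self.name} cannot prepare")
        self.state = "prepared"

    def commit(self):
        # Phase 2: must not fail once the participant is prepared.
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: every participant must vote yes before anyone commits.
    try:
        for p in participants:
            p.prepare()
    except RuntimeError:
        for p in participants:
            p.abort()
        return False
    # Phase 2: all voted yes, so all commit.
    for p in participants:
        p.commit()
    return True


ok = [ToyParticipant("a"), ToyParticipant("b")]
bad = [ToyParticipant("a"), ToyParticipant("b", can_commit=False)]
ok_result = two_phase_commit(ok)     # both storages commit
bad_result = two_phase_commit(bad)   # one refusal aborts both
```

  With a plain one phase commit, the first storage could already be
  committed when the second one fails, and the storages would disagree.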


-- 
Dieter

