[ZODB-Dev] Another interesting ZODB cache inconsistency

Fri Jan 13 17:16:39 EST 2006

Is the problem with the consistency of results served across the ZEO
clients, or with the consistency of the database itself?  It seems like
it must be the former.

In the case of an intolerable ZEO failure, I would expect to lose
execution time consistency among peers but preserve consistency of
committed state.  ZEO can't really provide consistency across the
clients anyway, since one client could be executing before a
particular transaction commits and another after it commits.  If two
web clients talk to the two different ZEO clients, they'll see
different results.  A big transaction exacerbates the problem, because
it takes longer to do everything (including the underlying commit on
the storage).

A few thoughts about the effects:

- Each client should process all of the invalidations from a
transaction or none.  If a client loses contact with the server while
invalidations are being sent, it should not process any of them. 
Maybe there's a bug in the code here?  I haven't looked at the code
lately.

- If a client is disconnected, regardless of the state it was in with
respect to this one transaction, it should revalidate its cache and
invalidate any stale data that it held as a result of the disconnect.
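
To make the two points above concrete, here is a rough Python sketch of
the behavior I would expect.  It is a toy model, not the actual ZEO
client code: the class, its methods, and the changed_since callback are
all hypothetical names made up for illustration.

```python
class ClientCache:
    """Toy model of a ZEO client cache (hypothetical, not the real ZEO API)."""

    def __init__(self):
        self.data = {}       # oid -> (tid, value)
        self.last_tid = 0    # last transaction whose invalidations were fully applied
        self._pending = {}   # tid -> set of oids received but not yet applied

    def store(self, oid, tid, value):
        self.data[oid] = (tid, value)

    def receive_invalidation(self, tid, oid):
        # Only buffer the message; nothing changes in the cache yet.
        self._pending.setdefault(tid, set()).add(oid)

    def end_of_invalidations(self, tid):
        # The server says the batch for tid is complete: apply all of it.
        for oid in self._pending.pop(tid, ()):
            self.data.pop(oid, None)
        self.last_tid = tid

    def connection_lost(self):
        # A partially received batch is thrown away: all or none.
        self._pending.clear()

    def reconnect(self, changed_since):
        # changed_since(tid) is an assumed server call returning
        # (server_last_tid, oids changed after tid).
        server_tid, oids = changed_since(self.last_tid)
        for oid in oids:
            self.data.pop(oid, None)
        self.last_tid = server_tid
```

With this shape, a client that loses the server halfway through a batch
still serves its old (consistent) view until it reconnects, and then
drops everything that changed while it was away.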

Jeremy

On 1/13/06, Dieter Maurer <dieter at handshake.de> wrote:
> We recently observed another ZODB cache inconsistency:
>
>   The commit of a huge transaction caused our ZEO server to be late
>   in responding to the HA monitoring probe. The HA monitor responded
>   with a SIGTERM targeted to the ZEO server. ZEO restarted.
>
>   The ZEO client performing the huge transaction reported an
>   error in the second phase of its commit state.
>
>   The ZODB states of other ZEO clients were inconsistent:
>   some of them had received invalidation messages and saw
>   the objects modified by the huge transaction with their new
>   values. Others had not yet received the invalidation messages
>   and treated the objects as still unchanged.
>
>
> This means that interrupting ZEO while it is sending invalidation messages
> can cause inconsistent states in the ZODB caches of its clients.
>
> I do not know what can be done about it...
>
>
> --
> Dieter