[ZODB-Dev] BTree corrupted after conflict resolution

Tim Peters tim at zope.com
Sat Mar 6 16:28:52 EST 2004


[Christian Robottom Reis]
> For the record, I was driving towards a suggestion that if it's a
> problem with evil edge-cases that are bound to bite late in the life
> of an application, OxBTrees should just forbid adding Persistent
> instances as keys (instead of raising errors because of __cmp__ being
> called during conflict resolution). This would however (and perhaps
> unfairly) hurt people that wanted to use Persistent instances as keys
> in a completely non-concurrent application, which is why I added the
> "or" part of my phrase.
>
> But what zero-concurrency application stays that way for long?

We've also seen peculiar claims here that "someone" is using persistent
objects as BTree keys already in ZODB apps.  If that's true, it would also
break their code (although it's hard for me to conceive of a sane sense in
which their code isn't already broken).

> ...
> It's still not clear to me how this works, however: let's say a BTree
> holding non-persistent instances conflicts; it will invoke its
> conflict resolution handler and attempt to sort out how to solve the
> conflict by looking at its keys and comparing them to see which order
> they should be placed in (I see in bucket_merge it's more involved
> than this, but hope it's a reasonable high-abstraction description).

Yes, and when the state of a persistent object contains non-persistent
subobjects, the latter objects are fully materialized in the states passed
to conflict resolution, so it's possible to compare those objects in the way
their type(s) intend.

> What I fail to see is why __cmp__ will be invoked (to determine key
> order) for a non-Persistent instance, but not for a Persistent
> instance.
>
> Is it because PyObject_Compare only invokes __cmp__ when the instance
> has state (IOW, isn't a ghost)?

Conflict resolution doesn't see ghosts, and __cmp__ is always invoked.  But
the *persistent* subobjects in the state of a persistent object aren't
materialized:  everywhere there's a persistent subobject, the state(s) seen
by conflict resolution contain an instance of the Python class
PersistentReference instead.  That class is defined in
ZODB/ConflictResolution.py (at least on the HEAD).  It doesn't explicitly
define a __cmp__ method today, so comparisons of PersistentReferences fall
back to the default compare-memory-address behavior used by all Python
classes that don't explicitly define comparison behavior.
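
A rough sketch of that fallback may help.  Nothing here is the real ZODB
code: StubRef is a made-up stand-in for PersistentReference, and
address_cmp() is a hypothetical helper that spells out the address-based
ordering by hand (via id()), since current Pythons no longer provide cmp()
or a default ordering for arbitrary instances.

```python
class StubRef(object):
    """Hypothetical stand-in for PersistentReference.

    Like the class described above, it defines no comparison methods,
    so equality falls back to object identity.
    """
    def __init__(self, oid):
        self.oid = oid


def address_cmp(a, b):
    """Emulate the default compare-by-address behavior (assumption:
    id() stands in for the memory address).  Returns -1, 0, or 1."""
    return (id(a) > id(b)) - (id(a) < id(b))


a = StubRef(1024)
b = StubRef(1024)   # same oid, but a distinct object

same = address_cmp(a, a)       # 0: an object always compares equal to itself
different = address_cmp(a, b)  # -1 or 1, depending on where a and b landed
```

Note that the oid plays no part in the result:  two distinct stubs carrying
the same oid still compare unequal, and the sign of the result is an
accident of allocation.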

It so happens that tryToResolveConflict() arranges that the states passed to
_p_resolveConflict preserve persistent object identity for persistent
subobjects (if a persistent subobject with oid 1024 appears 12 times across
the 3 states involved in conflict resolution, a single instance of
PersistentReference will be seen in those 12 places, and nowhere else).  So,
if p1 and p2 are persistent subobjects whose PersistentReference stubs are
seen by conflict resolution today, cmp() applied to those stubs returns 0
if and only if p1 and p2 have the same oid.  If they don't have the same oid,
cmp() doesn't return 0, but what it does return depends on the accidental
relative memory addresses of their distinct PersistentReference stubs.
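
To make the consequence concrete, here is a minimal sketch, assuming one
stub is handed out per oid.  I'm not claiming this is how
tryToResolveConflict() does it:  StubRef, StubCache, and address_cmp() are
all hypothetical names invented for illustration, and id() stands in for
the memory address.

```python
class StubRef(object):
    """Hypothetical stand-in for PersistentReference (no comparison methods)."""
    def __init__(self, oid):
        self.oid = oid


class StubCache(object):
    """Hypothetical helper: hands out exactly one StubRef per oid."""
    def __init__(self):
        self._by_oid = {}

    def get(self, oid):
        if oid not in self._by_oid:
            self._by_oid[oid] = StubRef(oid)
        return self._by_oid[oid]


def address_cmp(a, b):
    """Emulated compare-by-address; returns -1, 0, or 1."""
    return (id(a) > id(b)) - (id(a) < id(b))


cache = StubCache()

# Three states, as conflict resolution would see them, all built
# through the same cache so that identity is preserved across states.
old_state = [cache.get(1024), cache.get(2048)]
committed_state = [cache.get(1024)]
new_state = [cache.get(2048), cache.get(1024)]

same_oid = address_cmp(old_state[0], new_state[1])   # 0: same oid, same stub
diff_oid = address_cmp(old_state[0], old_state[1])   # -1 or 1, by accident
```

Under that assumption, a zero result is equivalent to "same oid", while the
sign of a non-zero result carries no meaning and can differ from one run to
the next.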

It's unclear to me how much of this behavior is intentional; it's also
unclear how much existing code relies on various details (intentionally or
not).



