[ZODB-Dev] BTree set question

Fri Feb 27 00:34:31 EST 2004

[Chris McDonough]
> ...
> I still don't quite understand why two transactions can't
> delete the same key without generating a conflict, but maybe I just
> lack imagination. ;-)

I'm pretty sure Jim's mental model was based on what CVS says "is a
conflict" when doing 3-way diffs during branch merges.  If, since the branch
point, the same line of code is deleted on both head and branch, that does
(IIRC) get tagged as a conflict, just because "the same line of code" was
changed *somehow* on both.

But if that is the model, then changing the same line of code in the same
way via, e.g., appending the same comment in both copies of the line, is also "a
conflict", and the bucket resolution code doesn't call the analogous bucket
change a conflict.  So maybe that one's just a bug wrt the original
intent -- in retrospect, we can only guess.
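
To make the CVS-like model concrete, here is a minimal sketch of a strict
3-way merge over plain Python sets.  It is not the real bucket merge code:
merge_sets and MergeConflict are hypothetical names, and treating "both sides
added the same key" as a conflict is an assumption made only to keep the model
uniform.

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same key since the common ancestor."""


def merge_sets(base, mine, theirs):
    """Strict 3-way merge of sets; `base` is the common ancestor state."""
    added_mine, removed_mine = mine - base, base - mine
    added_theirs, removed_theirs = theirs - base, base - theirs

    # CVS-like rule: a key touched on both sides is a conflict,
    # even when both sides did exactly the same thing to it.
    both_removed = removed_mine & removed_theirs
    if both_removed:
        raise MergeConflict("deleted on both sides: %r" % sorted(both_removed))
    both_added = added_mine & added_theirs  # assumption: also a conflict here
    if both_added:
        raise MergeConflict("added on both sides: %r" % sorted(both_added))

    # No overlap: apply both sets of changes to the ancestor.
    return (base | added_mine | added_theirs) - removed_mine - removed_theirs


# Disjoint changes merge cleanly.
clean = merge_sets({1, 2, 3}, {1, 2}, {1, 2, 3, 4})

# The same key deleted on both sides is flagged.
try:
    merge_sets({1, 2, 3}, {1, 2}, {1, 2})
    same_delete_conflicts = False
except MergeConflict:
    same_delete_conflicts = True
```

A more lenient policy would simply drop the both_removed check; the result of
the merge would be the same set either way, which is why the choice is a
matter of intent rather than of correctness of the merged data.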

If you want a contrived example for why deleting the same key can be
dangerous, suppose you have a set of accounts so delinquent that you're
willing to put a contract out on them, to serve as a fatal warning example
to others.  So a transaction consists of:

- Pick a member of the very-delinquent-account set.

- Delete it.

- Fire off a fax to the local mob, establishing the contract.

You never want to do that twice on the same account, because if the mob gets
two orders for a contract, they're not going to give the money for one of
them back (I've seen enough cheesy movies to know that for sure <wink>).
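
The scenario can be simulated with plain sets.  This is only a toy, under the
assumption that the fax goes out when (and only when) a transaction commits;
plan_transaction, commit_all and the account names are all made up for
illustration.

```python
def plan_transaction(snapshot):
    """Pick a member of the set, delete it, and remember whom to fax about."""
    account = min(snapshot)  # deterministic stand-in for "pick a member"
    return snapshot - {account}, account


def commit_all(base, plans, allow_same_delete):
    """Commit plans in order; return the final set and the faxes sent."""
    committed = set(base)
    faxes = []
    for new_state, account in plans:
        deleted = base - new_state
        already_gone = deleted - committed  # deleted by an earlier commit
        if already_gone and not allow_same_delete:
            continue  # conflict: this transaction is aborted, no fax
        committed -= deleted
        faxes.append(account)
    return committed, faxes


base = {"acct-17", "acct-42"}
# Two concurrent transactions start from the same snapshot.
plans = [plan_transaction(base), plan_transaction(base)]

lenient_state, lenient_faxes = commit_all(base, plans, allow_same_delete=True)
strict_state, strict_faxes = commit_all(base, plans, allow_same_delete=False)
```

Both policies leave the same set behind; only the strict one stops the second
fax from going out.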

[John Belmonte]
>> However, it would be nice if we could write ZODB apps a little
>> naively (or without having enough time to attend to all details on
>> the first pass), and as the conflict errors crop up, design
>> solutions for them.  This notion can be supported by having
>> conservative BTree merge rules.  Code that is bad for ZODB, but
>> otherwise perfectly "normal" if there wasn't concurrency, should
>> generate write conflicts if reasonably possible.

I don't know how to draw the line.  The only surprise-free strategy is to
act always and in all respects as if transactions were fully serialized
(run strictly one at a time).  There have been previous long discussions
about which "isolation levels" ZODB may or may not support; isolation levels
are a family of formal ways of specifying exactly how surprising things may get.
Sticking a name on an isolation level isn't really a help, though, except
for those immersed enough in the topic to know exactly what those names mean
(and the meanings are technical and subtle).

[Chris]
> ...
> I think the aggressiveness of the conflict resolution code can be
> traced back to the fact that, for Zope, an unresolveable
> ConflictError is extremely expensive.

That's both a great point, and a puzzle:  the BTree conflict resolution code
simply isn't notably aggressive.  That's why you're wondering, e.g., why it
gripes about transactions deleting the same key.  For Zope's purposes (and,
I expect, for *most* purposes), that's probably no problem at all.  So I'm
wondering more why it's so conservative!

> ...
> It would be nice to be able to plug in a different policy, but I'm
> pretty sure I'd just make a complete hash of things if I tried to make
> that possible.

Someday we'll help <wink>:  the contract (pre- and post-conditions) for
writing conflict-resolution methods isn't really written down anywhere
either.  It's a todo list item to document that "someday".

> And I'm glad Tim stared at it long enough to make sense out of the
> bucket_merge code;

I had to, it turned out.  This part of the comment:

    However, it's not OK for s2 and s3 to, between them, end up
    deleting all the keys.  This is a higher-level constraint, due
    to that the caller of bucket_merge() doesn't have enough info
    to unlink the resulting empty bucket from its BTree correctly.

didn't describe what the original code did.  It didn't special-case this,
and that turned out to be the cause of segfaults in a BTree stress-test I
wrote.  I spent many happy days staring at BTree conflict resolution as a
result <wink>.
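
The constraint in that comment can be sketched over plain Python sets.  This
is not the C code: merge_keys and MergeConflict are hypothetical names, s1 is
assumed here to stand for the common ancestor state, and the other conflict
rules are deliberately left out so only the "don't empty the bucket" check
remains.

```python
class MergeConflict(Exception):
    """Raised when the merged bucket would end up with no keys."""


def merge_keys(s1, s2, s3):
    """Merge the key changes of s2 and s3 relative to the ancestor s1."""
    # Keep keys neither side deleted, plus keys either side added.
    merged = (s1 & s2 & s3) | (s2 - s1) | (s3 - s1)
    # Higher-level constraint: the caller can't unlink an empty bucket
    # from its BTree, so an empty result has to be reported as a conflict.
    if s1 and not merged:
        raise MergeConflict("s2 and s3 between them deleted every key")
    return merged


# Each side deletes a different key, and together they delete everything.
try:
    merge_keys({1, 2}, {1}, {2})
    emptied_is_conflict = False
except MergeConflict:
    emptied_is_conflict = True

# Different deletions that leave something behind merge fine.
survivors = merge_keys({1, 2, 3}, {1, 2}, {2, 3})
```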

> I wouldn't want to further punish him by asking him to change it. ;-)

I don't mind changing it if there's a real reason.  But that code *is*
delicate, and at the core of so much of what Zope does, so I don't take any
changes to it lightly.

> I'm also happy to work around the current set of issues (now that I
> know them!) because I think it likely helps Zope performance in highly
> concurrent situations.

Have you tried ZODB 3.3a2 yet?  While write conflicts can kill performance
under high load, I wonder whether read conflicts don't actually cause more
pain in the world:  lots of people seem to try to ignore them (whether by
design or accident I'm not sure), and get into real semantic trouble as a
result.  MVCC is a pretty aggressive approach toward that class of problem
(you avoid read conflicts at the cost of seeing more "stale" data -- but,
again, for most apps, and Zope in particular, stale data isn't really a
problem, just so long as it's consistent with how the world really was at
*some* time in the recent past; but *some* apps can't tolerate that).
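
The snapshot idea behind MVCC can be shown with a toy versioned store.  This
is an illustration only, not how ZODB implements it: VersionedStore, Snapshot
and their methods are invented names, and transaction ids are just a counter.

```python
class VersionedStore:
    """Keeps every committed revision of every key."""

    def __init__(self):
        self._history = {}  # key -> list of (tid, value), oldest first
        self._tid = 0       # id of the most recent commit

    def commit(self, changes):
        """Record new revisions for the changed keys; return the new tid."""
        self._tid += 1
        for key, value in changes.items():
            self._history.setdefault(key, []).append((self._tid, value))
        return self._tid

    def begin(self):
        """Start a reader pinned to the current state of the world."""
        return Snapshot(self, self._tid)

    def read_as_of(self, key, tid):
        """Return the newest revision of key committed at or before tid."""
        for rev_tid, value in reversed(self._history.get(key, [])):
            if rev_tid <= tid:
                return value
        raise KeyError(key)


class Snapshot:
    """A consistent, possibly stale, view as of one transaction id."""

    def __init__(self, store, tid):
        self._store = store
        self._tid = tid

    def __getitem__(self, key):
        return self._store.read_as_of(key, self._tid)


store = VersionedStore()
store.commit({"a": 1, "b": 1})
reader = store.begin()       # reader's view is fixed here
store.commit({"a": 2})       # a later writer changes "a"

stale_a, stale_b = reader["a"], reader["b"]  # both from the same past moment
fresh_a = store.begin()["a"]                 # a new reader sees the update
```

The reader never gets a read conflict; it just keeps seeing the world as it
was when it started, which is the trade described above.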