[ZODB-Dev] Re: [Zope-dev] Conflict reduced BTrees for catalogin

Tim Peters tim@zope.com
Thu, 13 Mar 2003 23:01:02 -0500


[Dieter Maurer]
>> I am working on a CMS on top of ZODB for large amounts of
>> large SGML/XML documents.
>>
>> To speed things up, a colleague started two import processes
>> and we get incredible amounts of "database read conflict error"s
>> from cataloguing (although we already use "QueueCatalog" for
>> most indexes).
>>
>> I think the data structures used for cataloguing and indexing
>> could have a "def _p_independent(self): return 1".
>>
>> Does anybody object?
>>
>> If not, I will implement "_p_independent" BTrees and friends.

[Jeremy Hylton]
> Conflict resolution for BTrees is very subtle.  I'd like to see a
> careful analysis of what impact ignoring read conflicts will have.  It's
> definitely impossible for internal nodes, but it might be possible for
> buckets.
>
> Can you prove that ignoring read conflicts is safe?  If you can't, then
> I'd say it isn't worth the risk.

Can we prove that the current BTree conflict resolution schemes are safe?
It depends on what safety means, and I'm not sure.  A minimal definition is
that conflict resolution never creates a BTree that violates the BTree
invariants, but that's wholly implementation-centered and doesn't say much
about application-visible semantics.  I expect that ignoring bucket read
conflicts would preserve that minimal notion of safety.
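
To make the "minimal" notion of safety concrete, here is a rough sketch of
the kind of structural check it implies for a single bucket: keys strictly
increasing, and falling inside the range the parent node assigns to that
bucket.  The function name and the lo/hi parameters are made up for
illustration; this is plain Python over a list of keys, not the BTrees
implementation, and it says nothing about application-visible semantics.

```python
def bucket_is_well_formed(keys, lo=None, hi=None):
    """Return True if `keys` could be the key list of a valid bucket.

    Hypothetical helper for illustration only.  `lo` and `hi` stand for
    the separator keys in the parent node; None means "unbounded".
    """
    # Keys must be strictly increasing: no duplicates, no disorder.
    for prev, cur in zip(keys, keys[1:]):
        if not prev < cur:
            return False
    # Every key must lie in the half-open range [lo, hi).
    if keys and lo is not None and keys[0] < lo:
        return False
    if keys and hi is not None and not keys[-1] < hi:
        return False
    return True
```

A resolution scheme that is safe in this minimal sense only has to keep a
check like this passing after every merge; it may still surprise the
application.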

Current bucket resolution allows some (not all) kinds of simultaneous
mutation of a bucket too, and the notion of safety already built in is hard
to characterize succinctly ("it's like" a CVS merge that looks only at
positions of modified lines and not the contents of the lines -- e.g., it's
OK if two transactions delete different keys from a bucket, but not OK if
they delete the same key).
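
A simplified sketch of that CVS-like rule, assuming a bucket can be
modelled as a plain dict and that resolution sees three states: the common
ancestor, the state already committed by the other transaction, and the
state this transaction wants to commit.  All names here (MergeConflict,
changed_keys, merge_buckets) are invented for the example, and the real
bucket merge has more cases than this; the point is only that the merge
looks at which keys were touched, not at what was done to them.

```python
class MergeConflict(Exception):
    """Raised when both transactions touched the same key."""


_MISSING = object()  # sentinel for "key not present"


def changed_keys(old, new):
    """Keys added, deleted, or given a different value between old and new."""
    return {
        k for k in set(old) | set(new)
        if old.get(k, _MISSING) != new.get(k, _MISSING)
    }


def merge_buckets(old, committed, new):
    """Three-way merge of dict 'buckets'; a sketch, not the BTrees code.

    Only the *positions* (keys) of changes are compared, so two
    transactions deleting the same key conflict even though they agree.
    """
    touched_by_committed = changed_keys(old, committed)
    touched_by_new = changed_keys(old, new)
    overlap = touched_by_committed & touched_by_new
    if overlap:
        raise MergeConflict(sorted(overlap, key=repr))
    # Start from the committed state and replay this transaction's changes.
    merged = dict(committed)
    for key in touched_by_new:
        if key in new:
            merged[key] = new[key]
        else:
            del merged[key]
    return merged
```

With old = {"a": 1, "b": 2, "c": 3}, one transaction deleting "a" and the
other deleting "c" merges to {"b": 2}, while both deleting "a" raises
MergeConflict.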

At this point I want to talk about it in the office <wink>.