[ZODB-Dev] Potential BTrees splitting bug

Tim Peters tim at zope.com
Wed Sep 24 23:18:20 EDT 2003


[Tim]
>> ...
>> The patch may be, or contribute to, the problem.  Could you please
>> try without your patch too.

[Dieter Maurer]
> This will not be so easy ...

How so?  You said you were using Zope-2_6-branch plus your patch.  I don't
think I'm asking for more than trying Zope-2_6-branch without your patch.

> ...
> I thought I had already been clear -- but apparently I was wrong:

Details are everything here.

> The check fails in the "BTrees" consistency check, finding
> an unordered key ("key KKK larger than upper bound BBB, index III,
> OOBucket XXX at path PPP).
>
> The index ("III") is always 16, the path ("PPP") is varying but always
> points to a bucket with 17 elements.

OK, I thank you, and that's part of the full story.
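
As a point of reference, the check being described can be sketched in a few
lines.  This is a toy version only -- ZODB's real checker lives in
BTrees.check, and the function name and message format below are invented
for illustration:

```python
def check_bucket(keys, lower, upper, path):
    """Flag keys outside the bucket's [lower, upper) range.

    Toy version of the ordering check quoted above; ZODB's real
    checker lives in BTrees.check and differs in detail.
    """
    errors = []
    for i, key in enumerate(keys):
        if key >= upper:  # exclusive bound: `upper` belongs to the next bucket
            errors.append("key %s larger than upper bound %s, "
                          "index %s, path %s" % (key, upper, i, path))
        elif key < lower:
            errors.append("key %s smaller than lower bound %s, "
                          "index %s, path %s" % (key, lower, i, path))
    return errors

# The failing bucket from the report: 16 in-range keys, then 220.
bucket = list(range(190, 206)) + [220]
for err in check_bucket(bucket, 190, 206, "PPP"):
    print(err)
```

Run against the quoted failure, it flags the index-16 key 220 as past the
upper bound, matching the shape of the message described.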

>> Did the test produce any output when it failed (and if
>> so, can we see it -- the most recent versions of the tests do an
>> exhaustive dump of the BTree to stdout when they fail, which is
>> most helpful)?

> I use such a version, and it did indeed dump the tree.
>
> Unfortunately, it dumped it to "stdout" and not to "stderr".
> Therefore, it did not end up in the ("testrunner") log file.
>
> But, I looked at these dumps (about 40).

So you got the output anyway.  I think.

>   *  Each looked a bit different (which is to be expected)
>
>   *  They looked good apart from the isolated misordering
>      faithfully reported by the consistency check.
>
>      There have been some missing elements (not many),
>      almost surely caused by ZODB conflicts.

This part is ambiguous.  If you mean that you detected a pattern by eyeball
in the tree, and it appeared to your eyes that some elements in that pattern
were missing, then yes, that could be an expected result of conflicts.
There's another sense you could mean:  one of the specific problems this
test can complain about is a failure in this line:

        self.assertEqual(all_keys, list(tree.keys()))

That can fail when keys are missing for a different reason.  Unfortunately,
Zope-2_6-branch contains the least helpful version of these tests.
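
For contrast, that keys-comparison line is blunt:  it fires whenever any
key is missing, even if every surviving bucket is perfectly ordered.  A
toy sketch (a plain list stands in for the OOBTree; all names here are
made up):

```python
# Keys the test believes were successfully committed.
all_keys = list(range(20))

# A stand-in "tree" that silently lost one key -- say, to an
# unnoticed conflict; every remaining key is still in order.
tree_keys = [k for k in all_keys if k != 13]

# This is the shape of the check quoted above; it fails on any
# missing key, whether or not the tree's structure is damaged.
print(all_keys == tree_keys)  # → False
missing = sorted(set(all_keys) - set(tree_keys))
print("missing:", missing)    # → missing: [13]
```

So "missing elements" alone can't distinguish lost updates from
structural damage; the consistency check can.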

What would really be great is if you could try installing the
just-released ZODB3 3.2b3, and run the tests from that on their own, via

    test.py --all

or even better

    test.py -vvL --all . ConcurrentUpdates

and save stdout.

checkConcurrentUpdates2Storages() will get run there too, but via a
different version of the test, one that does better error reporting when
there are multiple kinds of tree damage.  It will also run other
variations of that test, and
seeing which do and don't fail eventually when run in a loop (-L) will yield
more useful clues.  If history is a guide, these will be timing dependent
and very difficult to diagnose.

> ...
> For me, it does not look like a cache bug.
> There should *never* have been a bucket with such a content.

Don't blink:  that's exactly what makes it most likely to be a cache bug.
What has happened in the past is that a bucket splits, but *somebody's*
cache holds on to a stale value for the parent BTree node of the bucket that
split, but gets correct current values for the pair of buckets resulting
from the split.  The parent node is what directs new incoming keys to the
correct bucket, and the stale parent node directs a new incoming key-- one
that *should* have gone into the "rightmost" bucket of the split bucket
pair --into the current leftmost bucket instead.  So instead of the new key
ending up "in the interior" of the rightmost bucket, it's found at the end
of the leftmost bucket instead.  That's what creates an "impossible" bucket.
The BTree code has never been at fault in such cases, and in a strong sense
the BTree code is too simple to fail in such a way (while delicate, it's
relentlessly uniform).
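
To make that concrete, here's a toy simulation of the failure mode --
plain Python lists standing in for persistent nodes, every name invented,
and none of this is ZODB code:

```python
def route(separators, children, key):
    """Toy interior-node search: pick the child bucket a key belongs in."""
    for i, sep in enumerate(separators):
        if key < sep:
            return children[i]
    return children[-1]

# One full bucket holding 190..205, about to split.
left = list(range(190, 206))
right = left[8:]           # 198..205 move to a new bucket
del left[8:]               # left keeps 190..197

fresh_parent = [198]       # post-split parent: keys >= 198 go right
stale_parent = []          # cached pre-split parent: knows only `left`

# A fresh view of the parent routes 220 correctly...
assert route(fresh_parent, [left, right], 220) is right

# ...but an insert through the stale parent lands in `left` instead.
bucket = route(stale_parent, [left], 220)
bucket.append(220)

print(left)  # 190..197 plus an impossible trailing 220
```

Note where 220 ends up:  appended past the leftmost bucket's upper bound,
exactly the "impossible" shape the consistency check complains about.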

The same kind of thing could happen if a subtle flaw in an MVCC
implementation delivered a stale value for the parent BTree node of the
bucket that split, which is why I'm especially keen to see whether the
problem goes away if that patch goes away.

> A failure looks e.g. like (from memory):

Believe me, it's much more useful to get the actual output than to get an
English summary of what you think you remember about what you thought might
have been interesting about it <wink>.

>   ....
>   PPP: OOBucket with 17 keys (bounds 190, 206 -- taken from "OOBTrees"
>        keys)
>        0: key 190, value 2
>        1: key 191, value 1
>        ...
>        15: key 205, value 1
>        16: key 220, value 2 ---> consistency violated
>
>   PPP+1: OOBucket with ... keys (bounds 206, ...)
>         0: key 206, value 2
>         1: key 207, value 1
>         ....
>         13: key 219, value 1
>         14: key 221, value 1
>         ....
>
> There have been cases where the misplaced key should have gone to
> "PPP+2".

Alas, at *this* level of detail it all looks depressingly familiar ("instead
of the new key ending up 'in the interior' of the rightmost bucket, it's
found at the end of the leftmost bucket instead").  More detail would
probably make it even more familiar.  I wonder whether running on a 2-CPU
box introduces a wrinkle we haven't seen before (I've never run the tests on
a 2-headed box -- don't have one)?

> As noted earlier: the misplaced key always has been the last (index
> 16) key in an OOBucket with 17 keys.

That's strong evidence that it's a problem with a cache holding on to a
stale parent BTree node after a vanilla bucket split (or delivering a stale
value for the parent BTree node for some other reason).

> If helpful, I can force output from stdout and stderr into a single
> log file and post (or send privately to you?) the resulting file.

That would be appreciated.  Running

    test.py -vvL --all . ConcurrentUpdates > someoutputfile 2>&1

until you get several failures would be most helpful.



