[ZODB-Dev] Corrupted zodb fun

Paul Winkler pw_lists@slinkp.com
Tue, 21 Jan 2003 13:40:25 -0800


started on zope@zope.org, Chris Withers suggested bringing
it over here...

I would be most grateful for any suggestions,
and will do my best to provide more details if any of this
is useful data for the zodb-dev crew.

This is with Zope 2.5.1 (source tarball), 
Python 2.1.3 compiled with largefile support
(passes largefile tests), 
various products including CMF HEAD from last November
(basically 1.3 with a new feature I needed).
running linux, a RedHat-provided kernel I think, linux 2.4.7-10 SMP.
Using ZEO and FileStorage.

A "routine" run of fstest.py on my development server reported
a problem.  Thinking I should be proactive and avoid future
nastiness, I cp'ed the Data.fs and ran fsrecover.py on
the copy. Result: *thousands* of errors: 
179 POSKeyErrors (about 80 unique ones IIRC),
and over 10,000 bad transaction lengths.

fsrecover "fixes" all this, but looks like it tossed away 
about 80 MB of data representing 5 days' work for 3 people. 
I want to salvage as much of that data as I possibly can.
I do *not* want to do itamar's truncation technique,
I don't think that will buy me anything that fsrecover didn't,
and AFAICT the truncation technique is only intended for
when ZOpe won't start at all and fsrecover doesn't help.

I guess it would help if I could identify exactly which
objects have the bad references to which other objects,
and, if necessary, try my hand with a hex editor on a copy. 
but I don't know how much time it would take me to
learn the internals of the zodb enough to pull that off.

I'm currently running fsrefs.py from CVS to see what
that can tell me.

At the moment I'm continuing to run our dev server
on the original "corrupted" ZODB because it seems to
be running fine - no observed problems with it.

I've downloaded the "verbose" tranalyzer script,
it bombs out with "Not enough data left". Here's (part of) the
last TID it finds, and the traceback.


TID: 349E0260EB8ED66 @ 2091722796 obs 48 len 155833 By  dpryer
"/portals/phys/front_images/manage_renameObjects"
        OID: 3d311 len 3336 [BTrees.IOBTree.IOBucket]
                '((U\x0eBTrees.IOBTreeq\x01U\x08IOBucketq\x02tq\x03Nt.((J?h\x18\
x92]q\x04(U\x08PET scanq\x05U\tPET scansq\x06U\x03CTIq\x07U\x1cpositron emission
 tomographyq\x08U\x08Medicareq\tU\x03CMSq\nU\x04HCFAq\x0bU\x07G codesq\x0cU\x08C
MS 1500q\rU\x04UB92q\x0eU\tRVU valueq\x0fU\rreimbursementq\x10eJ\xeaI\xfe\xde]q\
x11(U\x08PET scanq\x12U\tPET scansq\x13U\x03CTIq\x14U\x1cpositron em'...
        OID: 3d41d len 590 [BTrees.OOBTree.OOBucket]
                '((U\x0eBTrees.OOBTreeq\x01U\x08OOBucketq\x02tq\x03Nt.((U\x13pet
_scanner_how.jpgq\x04J\x0e\x82\xfavU\tpet_scansq\x05J\xba\x81\xfavU\tpetgrowthq\
x06J\xf1\x82\xfavU\x16pheochromocytoma_case1q\x07J\xd3\x82\xfavU\x11phy_center_0
1.gifq\x08JI\x82\xfavU\x11phy_center_02.gifq\tJM\x82\xfavU\x11phy_center_03.jpgq
\nJK\x82\xfavU\x11phy_center_04.gifq\x0bJL\x82\xfavU\x11phy_center_05.g'...
        OID: 33754 len 39 [BTrees.Length.Length]
                '((U\rBTrees.Lengthq\x01U\x06Lengthq\x02tq'
        OID: 3d3fd len 1105 [BTrees.IOBTree.IOBucket]


(snipped about 40 more OID's for BTrees stuff )

        OID: 33516 len 1886 [Products.CMFCore.PortalFolder.PortalFolder]
                '((U\x1dProducts.CMFCore.PortalFolderq\x01U\x0cPortalFolderq\x02
tq\x03Nt.}q\x04(U\x17physician_center_07.gifq\x05(U\x08\x00\x00\x00\x00\x00\x035
\x82q\x06(U\x19Products.CMFDefault.Imageq\x07U\x05Imageq\x08ttQU\x12__ac_local_r
oles__q\t}q\nU\x05paulwq\x0b]q\x0cU\x05Ownerq\rasU\x11phy_center_01.gifq\x0e(U\x
08\x00\x00\x00\x00\x00\x035\x83q\x0f(h\x07U\x05Imageq\x10ttQU\x08_objectsq\x11(}
q\x12'...
                33582, 33583, 33584, 33586
                33589, 33587, 33594, 3358a
                3358b, 3358c, 3358d, 3358e
                3358f, 33590, 33591, 33592
                33593, 33597, 33588, 33596
                33595, 33585, 33598
        OID: 33588 len 1854 [Products.CMFDefault.Image.Image]
                '((U\x19Products.CMFDefault.Imageq\x01U\x05Imageq\x02tq\x03Nt.}q
\x04(U\x12__ac_local_roles__q\x05}q\x06U\x05paulwq\x07]q\x08U\x05Ownerq\tasU\x12
_EtagSupport__etagq\nU\x0cts37312723.2q\x0bU\x0eeffective_dateq\x0cNU\x05widthq\
rKZU\x08languageq\x0eU\x00U\x10workflow_historyq\x0f(U\x08\x00\x00\x00\x00\x00\x
036jq\x10(U\x0bPersistenceq\x11U\x11PersistentMappingq\x12ttQU\x0fe'...
                3366a
        OID: 33594 len 1833 [Products.CMFDefault.Image.Image]
                '((U\x19Products.CMFDefault.Imageq\x01U\x05Imageq\x02tq\x03Nt.}q
\x04(U\x12__ac_local_roles__q\x05}q\x06U\x05paulwq\x07]q\x08U\x05Ownerq\tasU\x12
_EtagSupport__etagq\nU\rts37312726.01q\x0bU\x0eeffective_dateq\x0cNU\x05widthq\r
KZU\x08languageq\x0eU\x00U\x10workflow_historyq\x0f(U\x08\x00\x00\x00\x00\x00\x0
36vq\x10(U\x0bPersistenceq\x11U\x11PersistentMappingq\x12ttQU\x0f'...
                33676

6066 137097353297912
Traceback (most recent call last):
  File "tranalyzer_verbose.py", line 372, in ?
    main(sys.argv)
  File "tranalyzer_verbose.py", line 359, in main
    t = Transaction(f, off, refs, classes, pickles)
  File "tranalyzer_verbose.py", line 170, in __init__
    self.obs.append(Record(f, refs, classes, pickles))
  File "tranalyzer_verbose.py", line 216, in __init__
    data = f.c(dl - 8)
  File "tranalyzer_verbose.py", line 105, in c
    self.checkleft(n)
  File "tranalyzer_verbose.py", line 87, in checkleft
    raise 'Not enough data left, %d < %d' % (self.left, n)
Not enough data left, 6066 < 137097353297912


-- 

Paul Winkler
http://www.slinkp.com
Look! Up in the sky! It's PARA- PRETZEL -MAN!
(courtesy of isometric.spaceninja.com)