[ZODB-Dev] Recovering from BTree corruption

Jim Fulton jim at zope.com
Wed Sep 12 10:28:12 EDT 2007


On Sep 11, 2007, at 10:27 AM, Alan Runyan wrote:

>> And, as you said in another note, the BTree folder actually lives in
>> the resources database.
>
> Correct, the BTree is in /plone/resources/files to be exact.
>
>
>> Cross database references are inherently weak.  A reference from a
>> foreign database doesn't prevent an object from being treated as
>> garbage.  So, if the only reference to an object is from a foreign
>> database, then the object is considered garbage.  It doesn't sound
>> like this is what's affecting you.  The cross-database reference is
>> to the BTree.  It sounds like the internal references are within a
>> single database.
>
> Well.  Someone could have 'copy/pasted' a file from the content database
> into the resources/files database.  That could have been one issue.

:(

BTW, I assume you mean cut/paste aka move.

>>>   - checkbtrees.py
>>>   - fstest.py
>>
>> There's an fsrefs script that checks internal references I believe.
>
> fsrefs.py shows loads of problems in both the data.fs and the resources.fs,
> probably > 200 entries per database, e.g.:
>
> oid 0xD87110L BTrees._OOBTree.OOBucket
> last updated: 2007-09-04 14:43:37.687332, tid=0x37020D3A0CC9DCCL
> refers to invalid objects:
>         oid ('\x00\x00\x00\x00\x00\xb0+f', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xb0N\xbc', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xb0N\xbd', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xd7\xb1\xa0', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xc5\xe8:', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xc3\xc6l', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xc3\xc6m', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xcahC', None) missing: '<unknown>'
>         oid ('\x00\x00\x00\x00\x00\xaf\x07\xc1', None) missing: '<unknown>'

Interesting. I wonder if these are actually cross-database references.

> My questions are:
>
>  - I imagine if there are 'invalid' references this is considered
>    "corruption" or "inconsistency"?

I consider this inconsistency. The file structure is intact, but the  
data isn't what it should be.  Not that it matters to the end user  
what we call it.

>
>   - How do I tell if something is a reference to another database?

I don't know how to do this with fsrefs.  I'm not 100% sure that  
fsrefs recognizes cross-database references.
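
Independent of fsrefs, one way to tell the two kinds apart is to look at
the shape of the persistent reference stored in the pickle.  The sketch
below assumes the reference shapes as I understand them from the
ZODB.serialize module (a bare oid or an (oid, class) tuple for local
references, and a list starting with 'n', 'm' or 'w' for the others).
Those shapes are an assumption to check against your ZODB version, and
classify_reference is a made-up name, not an existing ZODB function.

```python
def classify_reference(ref):
    """Return ("local", oid) or ("remote", database_name, oid).

    Assumed reference shapes (to be verified against ZODB.serialize):
      oid                                      local
      (oid, class_info)                        local
      [oid]                                    local, old-style weak ref
      ['w', (oid,)]                            local weak ref
      ['w', (oid, database_name)]              remote weak ref
      ['n', (database_name, oid)]              remote
      ['m', (database_name, oid, class_info)]  remote
    """
    if isinstance(ref, (bytes, str)):
        return ("local", ref)
    if isinstance(ref, tuple):
        return ("local", ref[0])
    if len(ref) == 1:
        # Old-style weak reference: a one-element list holding the oid.
        return ("local", ref[0])
    kind, args = ref
    if kind in ("n", "m"):
        return ("remote", args[0], args[1])
    if kind == "w":
        if len(args) == 2:
            return ("remote", args[1], args[0])
        return ("local", args[0])
    raise ValueError("unrecognized reference: %r" % (ref,))
```

Under that assumption, the (oid, None) pairs in the report above would
all classify as local, which is worth confirming before blaming
cross-database references.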


>   - Having these invalid references, is this common to ZODB applications?

No.


>> Possibly, there's a backup that has data records for the missing OIDs.
>
> Going to ask hosting company to pull up backups for the past few weeks.
> But how am I going to find this, other than seeing whether the folder
> lets me iterate over the items without throwing POSKeyError?  Does that
> sound like a decent litmus test?

Well, there's also fsrefs.
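
If you do want to script the iteration test, something like the sketch
below might do.  It relies on POSKeyError being a subclass of KeyError
and on persistent objects having a _p_activate() method that forces
their state to load.  broken_keys is a made-up helper name, and the
sketch assumes the folder's own buckets can still be walked; if a bucket
record is missing, keys() itself will fail partway through.

```python
def broken_keys(folder):
    """Return the keys whose values cannot be loaded.

    'folder' is any mapping-like container (for example a BTree).
    Values that are persistent ghosts are loaded via _p_activate();
    plain values are simply fetched.
    """
    broken = []
    for key in list(folder.keys()):
        try:
            value = folder[key]
            activate = getattr(value, "_p_activate", None)
            if activate is not None:
                activate()  # loads the object's state from storage
        except KeyError:
            # POSKeyError derives from KeyError, so it is caught here.
            broken.append(key)
    return broken
```

Running this against each candidate backup gives a quick count of how
many items are unreadable, though fsrefs remains the more thorough check.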

I'll try to make some time in the next few days to look at this issue.

I'll look at fsrefs a bit more closely to:

   - make sure it understands cross-database references, and

   - make sure it reports whether missing references are local or
     remote.

I'd like to decide what to do next based on this investigation.  In  
particular, I want to be sure whether the problems you are having are  
actually due to cross-database reference issues.

I'll also look at writing a tool that might be able to recover lost  
objects from backup databases.  The idea is that a tool would scan a  
database for missing oids and save the lists to files, separating  
references to different databases.  Then there'd be another tool that  
would read such a list and a list of old database files and scan the  
files looking for oids in the list, extracting records if they are  
found.
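
To make the two-step idea concrete, here is a rough model of it in plain
Python.  It deliberately doesn't touch any ZODB API: databases and
backups are stood in for by simple in-memory structures, and the names
collect_missing and recover are invented for illustration.  A real tool
would read storage records and write the lists to files instead.

```python
from collections import defaultdict


def collect_missing(present, referenced):
    """Step 1: group missing oids by database name.

    present:    dict mapping database name -> set of oids that exist
    referenced: iterable of (database name, oid) reference targets
    Returns a dict mapping database name -> sorted list of missing oids,
    one list per database (each would be saved to its own file).
    """
    missing = defaultdict(set)
    for db_name, oid in referenced:
        if oid not in present.get(db_name, ()):
            missing[db_name].add(oid)
    return {name: sorted(oids) for name, oids in missing.items()}


def recover(missing_oids, backups):
    """Step 2: scan backups, newest first, for the wanted oids.

    backups: list of iterables of (oid, data) records in file order.
    Returns (found, still_missing), where found maps oid -> data.
    """
    wanted = set(missing_oids)
    found = {}
    for records in backups:
        hits = {}
        for oid, data in records:
            if oid in wanted:
                hits[oid] = data  # a later record in the same file wins
        found.update(hits)
        wanted.difference_update(hits)
        if not wanted:
            break
    return found, wanted
```

Scanning newest-first means each oid is taken from the most recent
backup that still has it; whether that revision is the right one to
restore is a separate question.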

I do suspect we need to do something about cross-database  
references.  My long-term plan is to:

- Add an option to file storages to skip garbage collection when  
packing.

- Add a multi-database garbage-collection protocol and tool

In the short term, it might be good to have a mechanism for limiting  
which objects can have cross-database references to them, to limit the  
chance of inadvertent cross-database references via move.  This would  
need to be fleshed out though, which takes time.  Perhaps something  
can be done at the zope or plone level in the code for moving objects  
to make sure that objects aren't moved between databases.
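
A guard in the move code could be as small as the sketch below.  It
assumes both objects are persistent, that _p_jar is the connection an
object belongs to (or None if it hasn't been stored yet), and that the
connection's db() exposes a database_name.  check_same_database is a
hypothetical helper, not something that exists in zope or plone today.

```python
def check_same_database(obj, target_container):
    """Raise ValueError if obj and target_container live in different
    databases.  Objects not yet attached to a connection are let through,
    since there is nothing to compare."""
    src = obj._p_jar
    dst = target_container._p_jar
    if src is None or dst is None:
        return
    src_name = src.db().database_name
    dst_name = dst.db().database_name
    if src_name != dst_name:
        raise ValueError(
            "refusing to move object from database %r into %r"
            % (src_name, dst_name)
        )
```

Calling this before the paste step would turn an accidental
cross-database move into an error the user sees right away.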

Jim

--
Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org




