[ZODB-Dev] ZODB Problem

Jim Fulton jim at zope.com
Fri Jan 28 09:31:30 EST 2005


Bob Horvath wrote:
> Tim Peters wrote:
> 
>> [Alexey V Paramonov]
>>> I have already solved the problem:
>>
>> Congratulations!
>>
>>> I have several backups of Data.fs, one of them (very old) was "alive".
>>> So, I started to cut pieces from the end of Data.fs and tried to load
>>> it, so after a few hours I knew which transaction was malformed.
>>
> ...
> 
>> Now you know why there's no automated way to recover from arbitrary damage.
>> Don't know whether "0032" and "0033" are decimal or hex, but it looks like
>> they're early transactions in the life of this database.  Given that
>> FileStorage grows only by appending:
>>
>>    http://zope.org/Wikis/ZODB/FileStorageBackup
>>
>> do you have any idea how bytes early in the file may have gotten damaged?
>> There are rare reports of that, but outside of flaky hardware or system
>> software no cause is evident.  It's scary when once-good data goes bad for
>> no identifiable reason.
>>
> 
> I have seen this comment made several times on this list, and while it 
> may in fact be true, it seems suspect to me. 

That's understandable.

Note that *this* particular report is especially weird, since it isn't
really database corruption. Rather, valid data got written with the wrong
object id.


> Considering how many people come out of the woodwork complaining about
> corrupted Data.fs files,

Keep in mind that this number is far smaller than the number of people
who use ZODB in production without complaint.


> if it were flaky hardware or system software, you would think
> lots of non-ZODB files and applications would have similar problems.

That's how we are usually able to verify that faulty hardware or system
software is at fault. Other files are eventually found to be affected
or there's other hard evidence.


> My gut suspicion is that these are caused by some yet to be identified
> software bug.

Of course, that's a possibility.  There have been bugs that have caused
problems.  Except in the remote past, these bugs haven't caused corruption.
Rather they cause other problems, like dangling object references. Usually,
when software is at fault, there is a widespread occurrence with consistent
symptoms that lead us to a solution, as happened recently with POSKeyErrors.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

