[Zope] CorruptTransactionError (Bad news for production site!)

Richard Taylor r.taylor@eris.dera.gov.uk
06 Jul 2000 14:48:41 +0100


Jim

Thank you for your prompt response. This is what I love about using
Open Source Software: the responses come from people who really know
what they are talking about.

Further responses in-line.



Jim Fulton <jim@digicool.com> writes:

> Richard Taylor wrote:
> > 
> > Today I had to roll back two days of transactions from my production
> > site because when I packed the database I was informed of a
> > CorruptTransactionError.
> 
> Did anything else happen previous to this? Did you run out of space
> or anything like that?
>

We had been doing some extensive development work and the ZDB had
reached about 2Gbytes, but the disk was not full. I packed the
database (down to 10M approx.) without any trouble. We then carried
on using the system for another two days and I then packed the
database again. This time I got the CorruptTransactionError. I
followed the instruction to truncate the database and successfully
recovered it. Close examination of the bobo_modification_times on the
objects left in the database showed that the error occurred at about
the time of the first pack.
 
> You should have been able to use Data.fs.old, which is a copy
> of the database before the pack to restore the data. Or was the
> error in there too?  I'd be interested in looking at the Data.fs
> file before the pack to try to figure out what might have gone wrong.
> 
Unfortunately the error occurred during (or just after) the first
pack, so the second pack overwrote Data.fs.old with the already
corrupt database. The real problem was that the corrupt transaction
did not have an immediate effect.
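
One way to guard against a later pack overwriting the only pre-pack
copy is to keep a few dated copies of Data.fs.old before each pack. A
minimal sketch, assuming there is disk space for several copies; the
function name, the backup directory and the `keep` and `stamp`
parameters are hypothetical and not part of Zope:

```python
import os
import shutil
import time


def keep_dated_copy(path, backup_dir, keep=5, stamp=None):
    """Copy `path` into `backup_dir` under a timestamped name.

    Only the `keep` most recent copies are retained. `stamp` may be
    supplied by the caller; by default the current local time is used.
    Hypothetical helper, not part of Zope or ZODB.
    """
    os.makedirs(backup_dir, exist_ok=True)
    if stamp is None:
        stamp = time.strftime("%Y%m%d-%H%M%S")
    base = os.path.basename(path)
    target = os.path.join(backup_dir, "%s.%s" % (base, stamp))
    shutil.copy2(path, target)

    # Timestamped names sort chronologically, so the oldest come first.
    copies = sorted(
        name for name in os.listdir(backup_dir)
        if name.startswith(base + ".")
    )
    excess = max(len(copies) - keep, 0)
    for name in copies[:excess]:
        os.remove(os.path.join(backup_dir, name))
    return target
```

Calling something like keep_dated_copy('var/Data.fs.old', 'var/backups')
just before each pack would have left the copy from the first pack in
place when the second pack ran.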

> (If you send me your Data.fs file, please remember to send it
> to me privately and to zip or compress it. :)
>
I would love to send you the Data.fs file, but unfortunately it
contains sensitive commercial information belonging to my company and
I would be sacked for sending it out.

I know how difficult it is to track down bugs when people will not
give you repeatable examples, but I just can't send this stuff out.

> > We are using Zope for an internal knowledge management application
> > where anyone in the organization can add objects. So I have no way of
> > knowing what was added after the fateful transaction and no way of
> > getting any of it back.
> > 
> > Bummer!
> 
> Indeed.
> 
> > I think this raises a few questions about ZDB:
> > 
> > 1) We need some tools for selectively removing bad transactions
> >    rather than just truncating Data.fs back to the last good one and
> >    losing everything that comes after it.
> 
> Zope 2.2 has just such a tool. In the ZODB directory, there is a 
> Python script, fsrecover.py which simply calls the recover function
> in the FileStorage module. This will work with any 2.x databases.
> It scans from both the beginning and the end of the file until 
> it finds a corrupted section and then removes the corrupted portion
> from the file.  The utility modifies the file in-place, so you need
> to shut-down the site, or work on a copy when you use it.
>

Fantastic! This is exactly what I was banging on about. Now, if only I
had not deleted the original corrupt Data.fs file out of disgust, I
would be able to get back my stuff (I think I need a serious talking
to).
 
> > 2) We could do with a tool that can verify the ZODB offline. This
> >    could then be run at regular intervals (maybe once an hour from
> >    cron) so that corruptions can be picked up earlier.
> 
> You could use a little Python script that did something like:
> 
>   import ZODB.FileStorage
>   file_name='../../var/Data.fs'
>   file=open(file_name, 'r+b')
>   index={}
>   vindex={}
>   tindex=[]
>   ZODB.FileStorage.read_index(
>         file, file_name, index, vindex, tindex)
> 
> This basically reads the FileStorage index as would normally
> be done during startup.
>

I shall be installing (and testing) this tonight!
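
For running the check quoted above from cron, a wrapper along the
following lines might do. It is only a sketch: check_copy and
read_index_check are hypothetical names, the read_index call is taken
as-is from the snippet above, and it is assumed that read_index
raises an exception when it meets a corrupt transaction. Because the
snippet opens the file for update, the wrapper works on a scratch copy
rather than the live Data.fs.

```python
import os
import shutil
import sys
import tempfile


def check_copy(data_fs, check):
    """Copy `data_fs` to a scratch directory and run `check` on the copy.

    Returns 0 if `check(copy_path)` returns normally and 1 if it raises,
    so the result can be used as a process exit status.
    Hypothetical helper, not part of Zope or ZODB.
    """
    scratch_dir = tempfile.mkdtemp(prefix="fscheck-")
    copy_path = os.path.join(scratch_dir, os.path.basename(data_fs))
    try:
        shutil.copy2(data_fs, copy_path)
        check(copy_path)
    except Exception as exc:
        # cron normally mails anything written to stderr to the job owner.
        sys.stderr.write("check of %s failed: %r\n" % (data_fs, exc))
        return 1
    finally:
        shutil.rmtree(scratch_dir, ignore_errors=True)
    return 0


def read_index_check(path):
    """Read the FileStorage index as in the quoted snippet.

    Assumptions: ZODB is importable, and read_index has the five-argument
    signature shown in the message above and raises on corruption.
    """
    import ZODB.FileStorage
    with open(path, "r+b") as f:
        ZODB.FileStorage.read_index(f, path, {}, {}, [])
```

A cron job could then finish with
sys.exit(check_copy('var/Data.fs', read_index_check)), so that a
non-zero exit status and a line on stderr flag the problem within the
hour rather than at the next pack.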

> > 3) Some way to find out what was added after a corrupt transaction is
> >    needed so that at least I could see who I need to ask to re-add
> >    their stuff.
> 
> The fsrecover script should avoid the need for this.
>

Agreed.
 
> Jim
> 

Richard