[ZODB-Dev] ZODB Problem

Fri Jan 28 01:36:40 EST 2005

[Tim Peters]
>> ... Don't know whether "0032" and "0033" are decimal or hex, but it
>> looks like they're early transactions in the life of this database.
>> Given that FileStorage grows only by appending:
>>
>>    http://zope.org/Wikis/ZODB/FileStorageBackup
>>
>> do you have any idea how bytes early in the file may have gotten
>> damaged? There are rare reports of that, but outside of flaky hardware
>> or system software no cause is evident.  It's scary when once-good data
>> goes bad for no identifiable reason.

[Bob Horvath]
> I have seen this comment made several times on this list, and while it
> may in fact be true, it seems suspect to me.  Considering how many people
> come out of the woodwork complaining about corrupted Data.fs files,

How many is that?  Seriously.  When I posted a call to several Zope mailing
lists several months ago begging for people with corrupt .fs files to speak
up, we got exactly two responses.  Their .fs files were indeed damaged, but
no cause was ever determined.  Note that I distinguish between file
corruption and, e.g., POSKeyErrors.  The latter are rarely instances of the
former (although they may be).

> if it were flaky hardware or system software, you would think lots of
> non-ZODB files and applications would have similar problems.

Why?  An .fs file may well be hammered on 24/7, and every bit's value is
crucial.  The latter makes it severely sensitive to all sorts of system
problems, and the former ensures that if an intermittent problem of some
sort exists, an .fs file is much more likely than most other files to suffer
from it.

The last case of .fs corruption that was definitively nailed got blamed on a
RAID controller that failed intermittently, and only under heavy load.  Once
they suspected that, they were able to prove it independent of ZODB
activity.  Of course in real life, their .fs file _was_ "at the other end"
of heavy load more often than most other files.
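
For anyone wanting to look for that kind of I/O fault without involving ZODB
at all, here is a minimal sketch of a write/read-back check.  It is not the
procedure used in the RAID case above; the function name, sizes, and round
count are all made up for illustration.  Note too that re-reads may be served
from the OS page cache rather than the disk, so a serious test would need
files larger than RAM and sustained load.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory, size_bytes=1 << 20, rounds=3, chunk=1 << 16):
    """Write random data to a temp file in `directory`, then re-read it.

    Returns the number of read-back rounds whose SHA-256 digest differed
    from the digest of the data that was written (0 means no mismatch seen).
    Hypothetical helper for illustration; not part of ZODB.
    """
    expected = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                block = os.urandom(min(chunk, remaining))
                expected.update(block)
                f.write(block)
                remaining -= len(block)
            # Push the data out of Python's and the OS's write buffers.
            f.flush()
            os.fsync(f.fileno())

        mismatches = 0
        for _ in range(rounds):
            actual = hashlib.sha256()
            with open(path, "rb") as f:
                for block in iter(lambda: f.read(chunk), b""):
                    actual.update(block)
            if actual.digest() != expected.digest():
                mismatches += 1
        return mismatches
    finally:
        os.remove(path)
```

Running this in a loop on the same volume that holds the .fs file, while the
box is under load, is one way to gather evidence that doesn't depend on ZODB.
A result of 0 proves nothing, of course; a nonzero result is very interesting.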

Jim likes telling the tale of a livid customer who pushed hard enough that
he offered to examine their damaged .fs file in detail.  This hit a snag,
because the tarball they sent him was itself corrupt.  That managed to
convince them that their file-corruption problem wasn't unique to ZODB, but
nothing before that sufficed ("everything else works fine!", etc).

In my time here, no software cause for .fs corruption has ever been
identified.  This is again in contrast to POSKeyErrors, where several
relevant software bugs (in Zope, in ZODB, and in ZEO) have been found and
fixed over my time here.

> My gut suspicion is that these are caused by some yet to be identified
> software bug.

I'm open to that, but the lack of any discernible pattern argues against it.
There's also that FileStorage's disk usage *is* very simple (read the code
-- maybe you'll find a relevant bug! seriously -- "more eyeballs" is part of
what open source is about).  And that multi-machine multi-process
multi-threaded ZEO stress tests I've run full-bore for days have provoked
any number of software bugs, but never a case of .fs corruption.  And that
Zope Corp never sees .fs corruption in its own deployments.  And that, as it
says at

    http://zope.org/Wikis/ZODB/FileStorageBackup

most people with corruption problems who move to a different box stop seeing
the problems (in fact, if anyone who tried that still saw corruption
problems, I haven't heard about it -- "all" is my direct knowledge, and
"most" was just weasel-wording to leave other possibilities open).
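
Since FileStorage grows only by appending, one cheap way to notice damage to
old bytes is to record a digest of the file as it stands now and later check
that the same leading bytes still hash to the same value.  What follows is
only a sketch under assumptions: the helper names are hypothetical, and it
assumes the file isn't packed (packing rewrites the file) and that no commit
is in progress when either measurement is taken.

```python
import hashlib
import os


def prefix_digest(path, length, chunk=1 << 16):
    """Return the SHA-256 hex digest of the first `length` bytes of `path`."""
    h = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            block = f.read(min(chunk, remaining))
            if not block:
                break  # file is shorter than `length`
            h.update(block)
            remaining -= len(block)
    return h.hexdigest()


def snapshot(path):
    """Record (size, digest) for the file as it stands right now."""
    size = os.path.getsize(path)
    return size, prefix_digest(path, size)


def prefix_unchanged(path, snap):
    """True if the bytes covered by an earlier snapshot are still identical.

    A file that has shrunk, or whose old bytes hash differently, fails.
    """
    size, digest = snap
    if os.path.getsize(path) < size:
        return False
    return prefix_digest(path, size) == digest
```

Taking a snapshot at each backup and checking the previous one first would at
least narrow down *when* early bytes changed, which is more than anyone has
had to go on so far.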

I worked for hardware manufacturers for 15 years, and tracked down about six
miserable HW bugs in that time (I wrote compilers for a living, and these
came in as "your compiler is buggy" reports).  These were in CPUs and FPUs,
not I/O, but HW is HW.  When it goes wrong, it can be nearly impossible to
reproduce, requiring the simultaneous occurrence of several highly unlikely
events.  .fs corruption "feels" more like that to me than like most software
bugs; the only other thing in my experience close to it is tiny thread-race
holes.  But even the latter seem much easier to provoke than .fs corruption
(which I've never been able to provoke).

> Has anyone ever tried to compile data on what sorts of machines or
> operating systems corrupt files seem to occur on?

Not that I know of, and reports of corrupt files (again I'm not talking
about POSKeyErrors) are so rare that I think it would be hard to get a
significant sample (as mentioned before, the entire universe we were able to
find out about last time we tried consisted of two).  All cases I'm aware of
were reported against Linux, but I doubt that means anything.

> If it is flaky hardware, you would think you would have more than one
> person at the same hosting outfit with similar problems.

See above.

> If it is system software, then you would see some correlation on
> Windows, vs. Linux, vs. FreeBSD, etc.

Not enough cases known.  Buggy Python C extension modules can be at fault
too (ditto any code running in the same process), which is one reason we
recommend running ZEO even if the ZEO server is on the same physical box as
the ZEO client (i.e., to isolate the process mucking with the .fs file as
much as possible from application code).