[ZODB-Dev] Persistent ZEO Cache corruption?

Sidnei da Silva sidnei at enfoldsystems.com
Thu Jan 12 11:04:24 EST 2006


On Thu, Jan 12, 2006 at 10:17:54AM -0500, Tim Peters wrote:
| [Sidnei da Silva]
| >> Every now and then I face a corruption of the persistent zeo cache, but
| >> this is the first time I get this variant.
| 
| What other variants do you see?

Can't remember right now; it was quite some time ago and involved
making changes to one zeo client while the other one was down using
'zopectl debug'. I've seen it about 6 times in different environments,
so it should be reproducible.

| >> The cause is very likely to be a forced shutdown of the box this zope
| >> instance was running on, but I thought it would be nice to report the
| >> issue.
| 
| Yes it is!  Thank you.  It would be better to open a bug report ;-).

Sure will.

| >> Here's the traceback::
| >>
| >> File "/home/sidnei/src/zope/28five/lib/python/ZEO/ClientStorage.py", line 314, in __init__
| >>   self._cache.open()
| >> File "/home/sidnei/src/zope/28five/lib/python/ZEO/cache.py", line 112, in open
| >>   self.fc.scan(self.install)
| >> File "/home/sidnei/src/zope/28five/lib/python/ZEO/cache.py", line 835, in scan
| >>   install(self.f, ent)
| >> File "/home/sidnei/src/zope/28five/lib/python/ZEO/cache.py", line 121, in install
| >>   o = Object.fromFile(f, ent.key, skip_data=True)
| >> File "/home/sidnei/src/zope/28five/lib/python/ZEO/cache.py", line 630, in fromFile
| >>   raise ValueError("corrupted record, oid")
| >> ValueError: corrupted record, oid
| >>
| >> I have a copy of the zeo cache file if anyone is interested.
| 
| Attaching a compressed copy to the bug report would be best (if it's too big
| for that, or it's proprietary, let me know how to get it and I'll put it on
| an internal ZC machine).  Can't tell in advance whether that will reveal
| something useful, though (see below).

Don't think there's anything sensitive in there, maybe my blog
password in the worst case *wink*. Here are the files (zeo1-1.zec is
probably the one you're after):

http://awkly.org/files/zeo-cache.tar.bz2

| > It seems as though persistent caches haven't been a very successful
| > feature. Perhaps we should abandon them.
| 
| They do seem to be implicated in more than their share of problems, both
| before and after MVCC.
| 
| The post-MVCC ZEO persistent cache _intends_ to call flush() after each file
| change.  If it's missing one of those, and depending on what "forced
| shutdown" means exactly, that could be a systematic cause for corruption.
| It doesn't call fsync() unless it's explicitly closed cleanly, but it's
| unclear what good fsync() actually does across platforms when flush() is
| called routinely and the power stays on.

Oh, I really meant to say "accidental shutdown". I wasn't around when
the box restarted, but it looks like it was a power failure.
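
For anyone following along, the flush() vs. fsync() distinction discussed
above can be sketched in a few lines of Python. This is only an
illustration and not the ZEO cache code: write_record, demo and the file
name are made-up names, and how much fsync() really guarantees still
depends on the platform and the disk.

```python
import os
import tempfile


def write_record(f, data, durable=False):
    """Append data to an open binary file (hypothetical helper, not ZEO's).

    flush() hands Python's userspace buffer to the operating system, so
    the data survives a crash of this process. os.fsync() additionally
    asks the OS to push its own buffers to the storage device, which is
    what matters if the power goes out.
    """
    f.write(data)
    f.flush()
    if durable:
        os.fsync(f.fileno())


def demo():
    # Assumed scratch location; a real cache file would live elsewhere.
    path = os.path.join(tempfile.mkdtemp(), "cache-sketch.bin")
    with open(path, "wb") as f:
        write_record(f, b"record-1")                # flush only
        write_record(f, b"record-2", durable=True)  # flush + fsync
    with open(path, "rb") as f:
        return f.read()
```

With flush() alone, a record can still be sitting in OS buffers when the
power drops, which would fit a partially written record after a power
failure.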

-- 
Sidnei da Silva
Enfold Systems, LLC.
http://enfoldsystems.com

