[ZODB-Dev] FileStorage modifiedInVersion KeyError -- how to correct and/or debug?

Joseph Wayne Norton norton@alum.mit.edu
Wed, 21 Nov 2001 17:12:06 +0900


Jeremy -

I finally was able to spend some time to investigate this issue.

I ran the fstest.py tool on my data.fs and it is reports the error
output listed below.  The application that is using this data.fs is
working properly and stable (to the best of my knowledge) but
typically while modifying content we receive an error on zeo server
side such as follows:

------
2001-11-21T07:33:23 ERROR(200) ZEO Server error

Traceback (innermost last):
  File
/opt/arseed/tfs-lib/zope/zope-2.4.3/lib/python/ZEO/StorageServer.py,
line 269, in message_input
  File
/opt/arseed/tfs-app/ios04-zeo01-2.3/lib/python/ZODB/FileStorage.py,
line 622, in modifiedInVersion
    (Object: /opt/arseed/tfs-app/ios04-zeo-2.3/var/data/Data.fs)
KeyError: 

and lately on the zeo clients (my guess is a client cacheing problem)
side we are seeing the following error:

------
2001-11-21T08:04:23 ERROR(200) ZODB Couldn't load state for
'\x00\x00\x00\x00\x00\x07\xcew'
Traceback (innermost last):
  File
/opt/arseed/tfs-lib/zope/zope-2.4.3/lib/python/ZODB/Connection.py,
line 519, in setstate
AttributeError: 'None' object has no attribute 'load'


Any recommendations on how to 1) correct the trouble with the data.fs
file and 2) debug these messages on the zeo clients?

We are currently using python 2.1.1, zope 2.4.3, and zeo 1.0b5 on the
solaris platform.

thanks,

- joe




------
OUTPUT of ftest.py:

         4: transaction tid 0x033c4973b26c1500 #0 
      2552: transaction tid 0x033c4973b2b82d4c #1 
      2819: transaction tid 0x033c4974bb8a2b22 #2 
      3050: transaction tid 0x033c4979978dadd5 #3 
      8278: transaction tid 0x033c4979982678cc #4 
     13583: transaction tid 0x033c4979b990de99 #5 
     51055: transaction tid 0x033c4979c1e04d22 #6 
     56330: transaction tid 0x033c4979c441545d #7 
    202524: transaction tid 0x033c4979cd9fd05d #8 
    207785: transaction tid 0x033c4979ced1ac33 #9 
    232035: transaction tid 0x033c4979df918a91 #10 
    309651: transaction tid 0x033c497a03961300 #11 
    441681: transaction tid 0x033c497a0dc85abb #12 
    446962: transaction tid 0x033c497a0e70a1dd #13 
    452203: transaction tid 0x033c497a17e6fd88 #14 
    457520: transaction tid 0x033c497a22d694cc #15 
    462750: transaction tid 0x033c497a2a2997aa #16 
    484929: transaction tid 0x033c497a350a6eaa #17 
    508305: transaction tid 0x033c497a3d9edbf7 #18 
    513566: transaction tid 0x033c497a4fea7e91 #19 
    575490: transaction tid 0x033c497a57dc9477 #20 
    581075: transaction tid 0x033c497a5d612d88 #21 
    626582: transaction tid 0x033c497af0d28577 #22 
    626822: transaction tid 0x033c49802d22daf7 #23 
    632080: transaction tid 0x033c4983ae760baa #24 
    685471: transaction tid 0x033c4983b1bf85bb #25 
    685863: transaction tid 0x033c4984d5df0644 #26 
  20890344: transaction tid 0x033c4985ec7355ee #27 
  20995508: transaction tid 0x033c4986a3000980 #28 
  21003296: transaction tid 0x033c4986a3e553f7 #29 
  21007296: transaction tid 0x033c49874d3009f7 #30 
  21015203: transaction tid 0x033c4990f89b1c4c #31 
  21015425: transaction tid 0x033c49953648fbaa #32 
  21023364: transaction tid 0x033c4995ba0b7077 #33 
  21032374: transaction tid 0x033c49963022d488 #34 
  21040310: transaction tid 0x033c4997a1dbf9d5 #35 
  21048241: transaction tid 0x033c49e6f4eabaaa #36 
  21050780: transaction tid 0x033c4bc1e2db9be6 #37 
  21051069: transaction tid 0x033c4dde8f73bcd5 #38 
  21052063: transaction tid 0x033c4de094481add #39 
  21053445: transaction tid 0x033c4dea2ab40b80 #40 
  21054790: transaction tid 0x033c4dea722bfaa2 #41 
21056445 object serialno 0x033c4a6b5a2cb5f7 does not matchtransaction id 0x033c4e10d1692c11

My guess is there are more of these invalid transactions, but not
positive.



At Wed, 10 Oct 2001 14:27:34 -0400 (EDT),
Jeremy Hylton wrote:
> 
> There's a new tool that was added to StandaloneZODB recently that does
> some sanity checking of the Data.fs file.  It checks all the
> transaction meta-data and reports when things are obviously amiss.
> It's not perfect, because the data itself might be corrutped in ways
> that don't affect the meta-data, but it should catch real errors.
> 
> You can get it from cvs.zope.org in StandaloneZODB/Tools/fstest.py.
> I've included a copy at the end of the message.  If it doesn't report
> any problems for the Data.fs, I don't think you need to worry about
> the index.
> 
> Jeremy
> 
> #! /usr/bin/env python
> """Simple consistency checker for FileStorage.
> 
> usage: fstest.py [-v] data.fs
> 
> The fstest tool will scan all the data in a FileStorage and report an
> error if it finds any corrupt transaction data.  The tool will print a
> message when the first error is detected an exit.
> 
> The tool accepts one or more -v arguments.  If a single -v is used, it
> will print a line of text for each transaction record it encounters.
> If two -v arguments are used, it will also print a line of text for
> each object.  The objects for a transaction will be printed before the
> transaction itself.
> 
> Note: It does not check the consistency of the object pickles.  It is
> possible for the damage to occur only in the part of the file that
> stores object pickles.  Those errors will go undetected.
> """
> 
> # The implementation is based closely on the read_index() function in
> # ZODB.FileStorage.  If anything about the FileStorage layout changes,
> # this file will need to be udpated.
> 
> import string
> import struct
> import sys
> 
> class FormatError(ValueError):
>     """There is a problem with the format of the FileStorage."""
> 
> class Status:
>     checkpoint = 'c'
>     undone = 'u'
> 
> packed_version = 'FS21'
> 
> TREC_HDR_LEN = 23
> DREC_HDR_LEN = 42
> 
> VERBOSE = 0
> 
> def hexify(s):
>     """Format an 8-bite string as hex"""
>     l = []
>     for c in s:
>         h = hex(ord(c))
>         if h[:2] == '0x':
>             h = h[2:]
>         if len(h) == 1:
>             l.append("0")
>         l.append(h)
>     return "0x" + string.join(l, '')
> 
> def chatter(msg, level=1):
>     if VERBOSE >= level:
>         sys.stdout.write(msg)
> 
> def U64(v):
>     """Unpack an 8-byte string as a 64-bit long"""
>     h, l = struct.unpack(">II", v)
>     if h:
>         return (h << 32) + l
>     else:
>         return l
> 
> def check(path):
>     file = open(path, 'r')
> 
>     file.seek(0, 2)
>     file_size = file.tell()
>     if file_size == 0:
>         raise FormatError("empty file")
>     file.seek(0)
>     if file.read(4) != packed_version:
>         raise FormatError("invalid file header")
> 
>     pos = 4L
>     tid = '\000' * 8 # lowest possible tid to start
>     i = 0
>     while pos:
>         _pos = pos
>         pos, tid = check_trec(path, file, pos, tid, file_size)
>         if tid is not None:
>             chatter("%10d: transaction tid %s #%d \n" %
>                     (_pos, hexify(tid), i))
>             i = i + 1
> 
> 
> def check_trec(path, file, pos, ltid, file_size):
>     """Read an individual transaction record from file.
> 
>     Returns the pos of the next transaction and the transaction id.
>     It also leaves the file pointer set to pos.  The path argument is
>     used for generating error messages.
>     """
>     
>     h = file.read(TREC_HDR_LEN)
>     if not h:
>         return None, None
>     if len(h) != TREC_HDR_LEN:
>         raise FormatError("%s truncated at %s" % (path, pos))
> 
>     tid, stl, status, ul, dl, el = struct.unpack(">8s8scHHH", h)
>     if el < 0:
>         el = t32 - el
>     tmeta_len = TREC_HDR_LEN + ul + dl + el
> 
>     if tid <= ltid:
>         raise FormatError("%s time-stamp reduction at %s" % (path, pos))
>     ltid = tid
> 
>     tl = U64(stl) # transaction record length - 8
>     if pos + tl + 8 > file_size:
>         raise FormatError("%s truncated possibly because of"
>                           " damaged records at %s" % (path, pos))
>     if status == Status.checkpoint:
>         raise FormatError("%s checkpoint flag was not cleared at %s" 
>                           % (path, pos))
>     if status not in ' up':
>         raise FormatError("%s has invalid status '%s' at %s" %
>                           (path, status, pos))
> 
>     if tmeta_len > tl:
>         raise FormatError("%s has an invalid transaction header"
>                           " at %s" % (path, pos))
> 
>     tpos = pos
>     tend = tpos + tl
> 
>     if status != Status.undone:
>         pos = tpos + tmeta_len
>         file.read(ul + dl + el) # skip transaction metadata
> 
>         i = 0
>         while pos < tend:
>             _pos = pos
>             pos, oid = check_drec(path, file, pos, tpos, tid)
>             if pos > tend:
>                 raise FormatError("%s has data records that extend beyond"
>                                   " the transaction record; end at %s" %
>                                   (path, pos))
>             chatter("%10d: object oid %s #%d\n" % (_pos, hexify(oid), i),
>                     level=2)
>             i = i + 1
> 
>     file.seek(tend)
>     rtl = file.read(8)
>     if rtl != stl:
>         raise FormatError("%s has inconsistent transaction length"
>                           " for undone transaction at %s" % (path, pos))
>     pos = tend + 8
>     return pos, tid
> 
> def check_drec(path, file, pos, tpos, tid):
>     """Check a data record for the current transaction record"""
> 
>     h = file.read(DREC_HDR_LEN)
>     if len(h) != DREC_HDR_LEN:
>         raise FormatError("%s truncated at %s" % (path, pos))
>     oid, serial, _prev, _tloc, vlen, _plen = (
>         struct.unpack(">8s8s8s8sH8s", h))
>     prev = U64(_prev)
>     tloc = U64(_tloc)
>     plen = U64(_plen)
>     dlen = DREC_HDR_LEN + (plen or 8)
> 
>     if serial != tid:
>         raise FormatError("%s object serialno %s does not match"
>                           "transaction id %s" % (pos, hexify(serial),
>                                                  hexify(tid)))
> 
>     if vlen:
>         dlen = dlen + 16 + vlen
>         file.seek(8, 1)
>         pv = U64(file.read(8))
>         file.seek(vlen, 1) # skip the version data
> 
>     if tloc != tpos:
>         raise FormatError("%s data record exceeds transaction record"
>                           "at %s" % (path, pos))
> 
>     pos = pos + dlen
>     # XXX is the following code necessary?
>     if plen:
>         file.seek(plen, 1)
>     else:
>         file.seek(8, 1)
>         # XXX _loadBack() ?
> 
>     return pos, oid
> 
> def usage():
>     print __doc__
>     sys.exit(-1)
> 
> if __name__ == "__main__":
>     import getopt
> 
>     try:
>         opts, args = getopt.getopt(sys.argv[1:], 'v')
>         if len(args) != 1:
>             raise ValueError, "expected one argument"
>         for k, v in opts:
>             if k == '-v':
>                 VERBOSE = VERBOSE + 1
>     except (getopt.error, ValueError):
>         usage()
> 
>     try:
>         check(args[0])
>     except FormatError, msg:
>         print msg
>         sys.exit(-1)
> 
>     chatter("no errors detected")
>