[ZODB-Dev] zeopack error

Marius Gedminas marius at gedmin.as
Wed Feb 8 23:25:48 UTC 2012


On Wed, Feb 08, 2012 at 01:24:55PM +0100, Kaweh Kazemi wrote:
> Recap: last week I examined problems I had packing our 4GB users
> storage. With Martijn's help I was able to fix zeo's exception output
> and write out the first broken pickle that throws an exception.
...
> You can download the broken pickle from here:
> http://www.reversepanda.com/download/brokenpickle
> 
> If someone has more experience in parsing and understanding pickles in
> regards to ZODB3, any help would be appreciated.

I don't have much experience here, but I love a puzzle.

    >>> import pickletools
    >>> f = open('brokenpickle', 'rb')

A ZODB record consists of two pickles: the first stores the class of
the object, the second stores its state.

    >>> pickletools.dis(f)
        0: c    GLOBAL     'rp.odb.containers EntityMapping'
       33: q    BINPUT     1
       35: .    STOP
    highest protocol among opcodes = 1

    >>> pickletools.dis(f)
       36: }    EMPTY_DICT
       37: q    BINPUT     2
       39: U    SHORT_BINSTRING 'data'
       45: q    BINPUT     3
       47: }    EMPTY_DICT
       48: q    BINPUT     4
       50: (    MARK
       51: ]        EMPTY_LIST
       52: q        BINPUT     5
       54: (        MARK
       55: U            SHORT_BINSTRING 'm'
       58: (            MARK
       59: U                SHORT_BINSTRING 'game'
       65: q                BINPUT     6
       67: U                SHORT_BINSTRING '\x00\x00\x00\x00\x00\x00\tT'
       77: q                BINPUT     7
       79: c                GLOBAL     'game.objects.item Tool'
      103: q                BINPUT     8
      105: t                TUPLE      (MARK at 58)
      106: q            BINPUT     9
      108: e            APPENDS    (MARK at 54)
      109: Q        BINPERSID
      110: K        BININT1    1
      112: ]        EMPTY_LIST
      113: q        BINPUT     10
      115: (        MARK
      116: U            SHORT_BINSTRING 'm'
      119: (            MARK
      120: h                BINGET     6
      122: U                SHORT_BINSTRING '\x00\x00\x00\x00\x00\x00\x12\x03'
      132: q                BINPUT     11
      134: c                GLOBAL     'game.objects.item EnergyPack'
      164: q                BINPUT     12
      166: t                TUPLE      (MARK at 119)
      167: q            BINPUT     13
      169: e            APPENDS    (MARK at 115)
      170: Q        BINPERSID
      171: K        BININT1    1
      173: u        SETITEMS   (MARK at 50)
      174: s    SETITEM
      175: .    STOP
    highest protocol among opcodes = 1

No secret calls to instantiate 'os.system' with 'rm -rf' as an argument,
so I feel safe to try and unpickle it ;-)
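
The same check can be done without reading the disassembly by eye. A
minimal sketch, written for current Python rather than the 2.x session
above: it walks every pickle in a record with pickletools.genops and
collects the GLOBAL targets, without unpickling anything. The helper name
list_globals is made up for illustration, the first pickle is rebuilt byte
for byte from the disassembly above, and the second is a simplified
stand-in for the real state.

```python
import io
import pickle
import pickletools

# First pickle of the record, rebuilt from the disassembly above (36 bytes).
CLASS_PICKLE = b'crp.odb.containers\nEntityMapping\nq\x01.'
# Assumption: a simplified stand-in for the state pickle, not the real state.
STATE_PICKLE = pickle.dumps({'data': {}}, 1)


def list_globals(record):
    """Return the 'module name' argument of every GLOBAL opcode in record.

    record may hold several pickles back to back. Nothing is unpickled,
    so no module is imported and nothing stored in the pickle is called.
    """
    stream = io.BytesIO(record)
    found = []
    while stream.tell() < len(record):
        # genops stops after each STOP opcode, which leaves the stream
        # positioned at the start of the next pickle.
        for opcode, arg, pos in pickletools.genops(stream):
            if opcode.name == 'GLOBAL':
                found.append(arg)
    return found


print(list_globals(CLASS_PICKLE + STATE_PICKLE))
# ['rp.odb.containers EntityMapping']
```

This only looks at GLOBAL; other opcodes can also name a class (INST, or
STACK_GLOBAL in protocol 4), so treat it as a first look rather than a
security check.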

    >>> import sys, pickle, pprint
    >>> sys.modules['rp.odb.containers'] = sys.modules['__main__'] # hack
    >>> sys.modules['rp.odb'] = sys.modules['__main__'] # hack
    >>> sys.modules['rp'] = sys.modules['__main__'] # hack
    >>> class EntityMapping(object): pass
    ...
    >>> f.seek(0)
    >>> pickle.load(f)
    <class '__main__.EntityMapping'>

(This is a good place to call f.tell() and remember the position -- 36
in this case -- so you can f.seek(36) each time you retry loading the
second pickle.)
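
A self-contained sketch of that load-then-remember-the-offset step, written
for current Python. Instead of aliasing __main__ it registers throwaway
stub modules; register_stub and the empty state pickle are illustrative
assumptions, and the class pickle is rebuilt from the disassembly above.

```python
import io
import pickle
import sys
import types

# First pickle of the record, rebuilt from the disassembly above.
CLASS_PICKLE = b'crp.odb.containers\nEntityMapping\nq\x01.'
# Assumption: a simplified stand-in for the state pickle, not the real state.
STATE_PICKLE = pickle.dumps({'data': {}}, 1)


def register_stub(dotted_name, *class_names):
    """Register empty stand-in modules so pickle can resolve dotted_name."""
    parts = dotted_name.split('.')
    for i in range(1, len(parts) + 1):
        name = '.'.join(parts[:i])
        sys.modules.setdefault(name, types.ModuleType(name))
    module = sys.modules[dotted_name]
    for class_name in class_names:
        # Create an empty placeholder class with the expected name.
        setattr(module, class_name, type(class_name, (object,), {}))
    return module


register_stub('rp.odb.containers', 'EntityMapping')

f = io.BytesIO(CLASS_PICKLE + STATE_PICKLE)
klass = pickle.load(f)     # first pickle: the class
state_offset = f.tell()    # where the second pickle starts
state = pickle.load(f)     # second pickle: the state

f.seek(state_offset)       # rewind to retry only the state pickle
assert pickle.load(f) == state
print(klass.__name__, state_offset, state)
```

Stub modules keep the fake classes out of __main__, which makes it easier
to clean up sys.modules afterwards.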

    >>> sys.modules['game.objects.item'] = sys.modules['__main__'] # hack
    >>> sys.modules['game.objects'] = sys.modules['__main__'] # hack
    >>> sys.modules['game'] = sys.modules['__main__'] # hack
    >>> class Tool(object): pass
    ... 
    >>> class EnergyPack(object): pass
    ... 
    >>> unp = pickle.Unpickler(f)
    >>> unp.persistent_load = lambda oid: '<persistent reference %r>' % oid
    >>> pprint.pprint(unp.load())
    {'data': {"<persistent reference ['m', ('game', '\\x00\\x00\\x00\\x00\\x00\\x00\\tT', <class '__main__.Tool'>)]>": 1,
              "<persistent reference ['m', ('game', '\\x00\\x00\\x00\\x00\\x00\\x00\\x12\\x03', <class '__main__.EnergyPack'>)]>": 1}}

Those look like cross-database references to me: the 'm' marker is
followed by a (database name, oid, class) tuple.

The original error (aaaugh Mutt makes it hard for me to look upthread
while I'm writing a response) was something about non-hashable lists?
It looks like some piece of code is trying to use persistent references
as dict keys, which can't possibly work in all cases: references in the
list form, like the two above, are not hashable.
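
To illustrate why that can fail (a guess at the failure mode, not the
actual ZODB code path): references that are plain oids or tuples hash
fine, but list-form references such as the cross-database ones in this
pickle do not. can_be_dict_key is a made-up helper, and the string 'Tool'
stands in for the real class object.

```python
OID = b'\x00\x00\x00\x00\x00\x00\tT'


def can_be_dict_key(ref):
    """Return True if ref is hashable, i.e. usable as a dict key."""
    try:
        hash(ref)
    except TypeError:  # e.g. "unhashable type: 'list'"
        return False
    return True


simple_ref = OID                             # plain oid
tuple_ref = (OID, 'Tool')                    # (oid, class)
cross_db_ref = ['m', ('game', OID, 'Tool')]  # list form, as in the broken pickle

for ref in (simple_ref, tuple_ref, cross_db_ref):
    print(type(ref).__name__, can_be_dict_key(ref))
```

Code that needs to key on such references would have to convert the list
form to something hashable first, for example a tuple.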

See ZODB.serialize.ObjectReader._persistent_load for the canonical
parser of the various possible formats.

ZODB.ConflictResolution.PersistentReference.__init__ is much clearer,
though perhaps a tiny bit less canonical.
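
For orientation, here is a rough sketch of the reference shapes those two
places deal with, based on a reading of the format notes in
ZODB.serialize. It is not ZODB code: describe_reference is a hypothetical
helper, the list of shapes is an assumption that should be checked against
the ZODB source you actually run, and 'Tool' stands in for the class
object.

```python
def describe_reference(ref):
    """Return (kind, database_name, oid) for a persistent reference.

    Assumed shapes (check against your ZODB version):
      oid                              simple reference
      (oid, klass)                     persistent object reference
      [oid]                            legacy weak reference
      ['w', (oid,)]                    weak reference
      ['w', (oid, database_name)]      weak cross-database reference
      ['n', (database_name, oid)]      cross-database simple reference
      ['m', (database_name, oid, klass)]  cross-database object reference
    """
    if isinstance(ref, (bytes, str)):
        return ('simple', None, ref)
    if isinstance(ref, tuple):
        oid, _klass = ref
        return ('persistent', None, oid)
    if isinstance(ref, list):
        if len(ref) == 1:
            return ('weak (legacy)', None, ref[0])
        ref_type, args = ref
        if ref_type == 'm':
            database_name, oid, _klass = args
            return ('cross-database persistent', database_name, oid)
        if ref_type == 'n':
            database_name, oid = args
            return ('cross-database simple', database_name, oid)
        if ref_type == 'w':
            if len(args) == 1:
                return ('weak', None, args[0])
            oid, database_name = args
            return ('weak', database_name, oid)
    raise ValueError('unrecognised persistent reference: %r' % (ref,))


# The first reference from the broken pickle, with a placeholder class.
ref = ['m', ('game', b'\x00\x00\x00\x00\x00\x00\tT', 'Tool')]
print(describe_reference(ref))
```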

> During my checks I realized that running the pack in a Python 2.7
> environment (using the same ZODB version - 3.10.3) works fine, the
> pack reduces our 4GB storage to 1GB. But our production server uses
> Python 2.6 (same ZODB3.10.3) which yields the problem (though the test
> had been done on OS X 10.7.3 - 64bit, and the production server is
> Debian Squeeze 32bit).

I've no idea why running the same ZODB version on Python 2.7 instead of
2.6 would make this error go away.

Incidentally, since you use cross-database references, please make sure
they continue to work after you pack your storage.  I've lost data that
way (the ZODB garbage collector doesn't see references that exist in
other storages, and can assume objects are garbage when it shouldn't).
Packing with GC disabled ought to be safe.

Marius Gedminas
-- 
The world is really obsessing over the UI preferences of the person who gave us
git?
        -- Matthew Garrett