[ZODB-Dev] Tracking down causes for the recent POSKeyError problems we get ...
Joachim Werner
joe@iuveno-net.de
Sun, 12 Jan 2003 13:28:19 +0100
Hi!
A while ago I posted about the frequent POSKeyErrors we are getting. Before
blaming FileStorage I'd like to track down possible causes at the
application level.
For that I'd need some hints:
Running fstest.py returns errors like this (I modified it to not stop at
the first error it encounters):
606435173 object serialno 0x0348d1012f1eebc4 does not match transaction id 0x0348d101f2f4aff7
622613103 object serialno 0x0347852012dcb4a2 does not match transaction id 0x0348f77b1bf2f866
622613454 object serialno 0x034800c732d74de6 does not match transaction id 0x0348f77b1bf2f866
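To look for a pattern in time, the error lines can be parsed and the two ids decoded into dates. The following is only a sketch in present-day Python: the helper names are made up for illustration, and it assumes that serial numbers and transaction ids use the usual ZODB timestamp packing (minutes since 1900 in the upper four bytes, a fraction of a minute in the lower four, in UTC).

```python
import re

# One fstest.py error as quoted above; mail line wrapping is tolerated.
ERROR_RE = re.compile(
    r"(\d+) object serialno 0x([0-9a-f]{16}) "
    r"does not match transaction id 0x([0-9a-f]{16})"
)

def parse_errors(text):
    """Return (file_position, serial_hex, tid_hex) for every error found."""
    joined = " ".join(text.split())  # collapse newlines and extra spaces
    return [(int(pos), serial, tid)
            for pos, serial, tid in ERROR_RE.findall(joined)]

def decode_timestamp(hex_id):
    """Decode an 8-byte id into (year, month, day, hour, minute, seconds).

    Assumption: the id follows the ZODB TimeStamp layout described above.
    """
    value = int(hex_id, 16)
    high, low = value >> 32, value & 0xFFFFFFFF
    rest, minute = divmod(high, 60)
    rest, hour = divmod(rest, 24)
    rest, day0 = divmod(rest, 31)
    year0, month0 = divmod(rest, 12)
    seconds = low * 60.0 / 2**32
    return (1900 + year0, month0 + 1, day0 + 1, hour, minute, seconds)

SAMPLE = """
606435173 object serialno 0x0348d1012f1eebc4 does not match transaction
id 0x0348d101f2f4aff7
622613103 object serialno 0x0347852012dcb4a2 does not match transaction
id 0x0348f77b1bf2f866
622613454 object serialno 0x034800c732d74de6 does not match transaction
id 0x0348f77b1bf2f866
"""

for pos, serial, tid in parse_errors(SAMPLE):
    print(pos, decode_timestamp(serial)[:5], decode_timestamp(tid)[:5])
```

Under that assumption, the serial and the transaction id of the first error share their upper four bytes, so they fall within the same minute, while the other two errors belong to one and the same transaction.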
I haven't found the time yet to dive into ZODB internals. What I'd need
to know is how I can get Zope (or some Python script instead) to return
the actual object involved (e.g. formatted as an XML export). I want to
be able to see if the errors are related to a certain meta_type, type of
transaction (e.g. copy&paste, object creation, ...) or time.
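One possible first step towards the object itself is to read the data record header at the reported position and take the oid from it. This is a sketch under two assumptions that would need checking against the FileStorage source: that the leading number in each error line is the file offset of the data record, and that the record starts with a 42-byte header (oid, serial, previous position, transaction position, version length, pickle length). The function name and the demo bytes are invented for illustration.

```python
import struct
from io import BytesIO

# Assumed data record header layout, big-endian, 42 bytes in total:
# oid (8), serial (8), prev (8), tloc (8), vlen (2), plen (8).
DATA_HDR = struct.Struct(">8s8sQQHQ")

def read_data_header(f, pos):
    """Read the (assumed) data record header found at byte offset pos."""
    f.seek(pos)
    raw = f.read(DATA_HDR.size)
    if len(raw) != DATA_HDR.size:
        raise ValueError("truncated data record header at %d" % pos)
    oid, serial, prev, tloc, vlen, plen = DATA_HDR.unpack(raw)
    return {
        "oid": oid.hex(),        # object id, usable for a later lookup
        "serial": serial.hex(),  # id of the transaction that wrote it
        "prev": prev,            # offset of the previous record for this oid
        "tloc": tloc,            # offset of the enclosing transaction record
        "vlen": vlen,            # length of the version string, 0 if none
        "plen": plen,            # length of the pickled object state
    }

# Synthetic demo data, not taken from a real Data.fs.
fake_header = DATA_HDR.pack(
    bytes.fromhex("0000000000000abc"),
    bytes.fromhex("0348d1012f1eebc4"),
    10, 4, 0, 123,
)
fake_file = BytesIO(b"\0" * 100 + fake_header + b"x" * 123)
print(read_data_header(fake_file, 100))
```

On a real file this would be called as read_data_header(open("Data.fs", "rb"), 606435173). The oid it returns could then be used to look the object up through ZODB and inspect its class or meta_type.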
I have not verified this yet, but I think the errors fstest.py returns
are related to the problems we frequently get: it seems that certain
objects get lost when the database is packed. They are still referenced
somewhere, but their records have been removed. This then causes a
POSKeyError as soon as Zope tries to use the object.
I'd appreciate any kind of hints that help me understand what is going on:
- How does the packing algorithm find out if an object can be removed?
- How can errors like the above (serialno doesn't match transaction id)
occur?
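As a mental model for the first question: packing is, in general terms, a garbage collection that keeps what can be reached from the root object and drops the rest. The toy below is not FileStorage's actual code. It uses a plain dict (oid mapped to the oids it references) to illustrate reachability and what a dangling reference, the situation behind a POSKeyError, looks like.

```python
def reachable(records, root):
    """Return the set of oids reachable from root by following references."""
    seen = set()
    stack = [root]
    while stack:
        oid = stack.pop()
        if oid in seen or oid not in records:
            continue  # already visited, or the record is missing
        seen.add(oid)
        stack.extend(records[oid])
    return seen

def pack(records, root):
    """Keep only the records reachable from root (toy version of a pack)."""
    keep = reachable(records, root)
    return {oid: refs for oid, refs in records.items() if oid in keep}

def dangling(records):
    """Return referenced oids that have no record: POSKeyError candidates."""
    return sorted({ref for refs in records.values()
                   for ref in refs if ref not in records})

# Hypothetical database: "c" is garbage, "b" is still used by "a".
db = {"root": ["a"], "a": ["b"], "b": [], "c": ["b"]}

packed = pack(db, "root")
print(sorted(packed))    # "c" is gone, everything else survives
print(dangling(packed))  # a correct pack leaves no dangling references

# If the record for "b" were lost even though "a" still points at it:
broken = {oid: refs for oid, refs in packed.items() if oid != "b"}
print(dangling(broken))
```

A check along the lines of dangling() over the real database would show which referencing objects are affected, which might help to narrow down the meta_type or operation involved.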
I am really scared because I haven't had a single case of ZODB data
corruption for years, and now we are getting them on a weekly, sometimes
daily, basis ...
Cheers
Joachim