[ZODB-Dev] RelStorage and PosKey errors - is this a risky hotfix?
Anton Stonor
anton at headnet.dk
Thu Jan 27 06:01:14 EST 2011
Hi Shane,
Thanks for pursuing this.
> I have lots of other ideas now, but I don't know which to pursue. I need a
> lot more information. It would be helpful if you sent me your database to
> analyze. Some possible causes:
>
> - Have you looked for filesystem-level corruption yet? I asked this before
> and I am waiting for an answer.
>
>
Yep, I've checked file system consistency and MySQL consistency, and no
errors were reported.
> - Although there is a pack lock, that lock unfortunately gets released
> automatically if MySQL disconnects prematurely. Therefore, it is possible
> to force RelStorage to run multiple pack operations in parallel, which would
> have unpredictable effects. Is there any possibility that you accidentally
> ran multiple pack operations in parallel? For example, maybe you have a
> cron job, or you were setting up a cron job at the time, and you started a
> pack while the cron job was running. (Normally, any attempt to start
> parallel pack operations will just generate an error, but if MySQL
> disconnects in just the right way, you'll get a mess.)
>
>
That's not unlikely! I've actually seen traces of packing invoked TTW,
although the cron job uses zodbpack. I will try to figure out whether the
POSKeyErrors started to surface right after that.
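
To make sure I understand the failure mode described above, here is a toy
model in plain Python. It is only a sketch: it does not touch RelStorage or
MySQL, and LockServer, get_lock, disconnect and simulate are names made up
for the illustration. The one assumption is that the pack lock is tied to
the database connection, so a dropped connection releases it.

```python
# Toy model of a connection-scoped lock. All names are hypothetical;
# this is not RelStorage code and it does not talk to MySQL.

class LockServer:
    def __init__(self):
        self._owners = {}  # lock name -> id of the connection holding it

    def get_lock(self, conn_id, name):
        """Return True if conn_id holds the lock after this call."""
        owner = self._owners.setdefault(name, conn_id)
        return owner == conn_id

    def disconnect(self, conn_id):
        """A dropped connection silently releases every lock it held."""
        held = [n for n, o in self._owners.items() if o == conn_id]
        for name in held:
            del self._owners[name]


def simulate():
    server = LockServer()
    running = set()

    # Pack A (for example, started TTW) takes the lock and begins packing.
    if server.get_lock("conn-A", "pack"):
        running.add("A")

    # Normal case: pack B (for example, zodbpack from cron) is refused.
    refused_first = not server.get_lock("conn-B", "pack")

    # conn-A drops prematurely; in this model pack A carries on regardless.
    server.disconnect("conn-A")

    # Pack B tries again and now gets the lock: two packs run in parallel.
    if server.get_lock("conn-B", "pack"):
        running.add("B")

    return refused_first, sorted(running)
```

Calling simulate() shows the second pack being refused while the first
connection is alive, and both packs ending up in the running set once that
connection has dropped.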
> - Every SQL database has nasty surprises. Oracle, for example, has a nice
> "read only" mode, but it turns out that mode works differently in RAC
> environments, leading to silent corruption. As a result, we never use that
> feature of Oracle anymore. Maybe MySQL has some nasty surprises I haven't
> yet discovered; maybe the MySQL-specific "delete using" statement doesn't
> work as expected.
>
That could also be the case. In fact, we have also seen MySQL locking up
longer than expected, but that's another story.
> - Applications can accidentally cause POSKeyErrors in a variety of ways.
> For example, persistent objects cached globally can cause POSKeyErrors.
> Maybe Plone 4 or some add-on uses ZODB incorrectly.
>
I was not aware of that.
The next step here would probably be to inspect the log files further, grab a
copy of the database from before the POSKeyErrors started to appear, and see
if it is possible to recreate the incident.
Again, thanks.
Anton