[Zope-dev] Re: Conflict errors on BDBMinimal storage

Chris McDonough chrism at plope.com
Tue May 18 21:23:29 EDT 2004


On Tue, 2004-05-18 at 16:36, Chris McDonough wrote:
> In my testing, I'm able to see that naive sessioning applications (ones
> that hit the session on every request, whether they really need to or
> not) seem to fall down due to the "abort after 3 conflicts in a row"
> hardcoded policy under extremely high load (~ 40-60 req/sec or so). 
> This will happen regardless of what code I put in the sessioning
> machinery; attempting to prevent it there is a lost cause: it gets a bit
> like trying to prove the Heisenberg Uncertainty Principle or trying to
> predict the future.

Just as a followup: when I observed this, it was because I had all of my
extra debugging code turned on.  True to the Uncertainty Principle, the
execution of the debugging code caused a raft of extra conflicts. 
Turning off the debugging code leads to a much rosier scenario.  In
cases where it was raising conflicts to the end user, it now operates
without doing so.  I still need to do some really hardcore testing of
the thing to see exactly where it buckles, but I can no longer provoke
it trivially.

> As a result, I am coming to believe that along with the "errors as part
> of main" transaction patch for 2.7.1, I should also make the
> retry-on-conflict-error policy pluggable for those who really
> desperately need to slow their Zope systems to a conflict-induced
> crawl.  They can bump it up to 100 for all I care and if it works for
> them, great.  The more selective people can wait around for a different
> (non-ZODB-based) sessioning implementation or change their code to not
> pound the snot out of the sessioning machinery unnecessarily.
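
As a rough illustration of what a pluggable retry-on-conflict policy
could look like, here is a minimal sketch.  It is not Zope's publisher
code: ConflictError below is a local stand-in for the ZODB exception,
and run_with_retries, max_conflicts and make_flaky are hypothetical
names invented for the example.

```python
class ConflictError(Exception):
    """Local stand-in for ZODB's ConflictError so the sketch runs alone."""


def run_with_retries(request, max_conflicts=3):
    """Call request() again after each conflict.

    Gives up and re-raises once max_conflicts conflicts have happened
    in a row (3 mirrors the hardcoded policy described above).
    """
    conflicts = 0
    while True:
        try:
            return request()
        except ConflictError:
            conflicts += 1
            if conflicts >= max_conflicts:
                raise  # surface the conflict to the end user


def make_flaky(conflicts_before_success):
    """Build a fake request that conflicts a fixed number of times."""
    state = {"left": conflicts_before_success}

    def request():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConflictError("simulated write conflict")
        return "ok"

    return request
```

With the default of 3, a request that conflicts twice still succeeds on
its third attempt, while one that conflicts three times in a row is
aborted.  Raising max_conflicts trades that failure for more retried
work per request.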

Michael Dunstan also had another good idea.  At least one knob exists in
the source code to increase/decrease the "resolution" of a part of the
sessioning/transience code.  Basically this knob allows you to choose a
tradeoff between the frequency of a class of writes done by the session
code and the "accuracy" of the session "finalization".

Setting the knob higher will decrease the aggregate number of database
writes that would need to be done over any given period of time in order
to keep sessions "current", which might lead to fewer conflicts.  On the
flip side, it would mean that the session might not be "finalized" on
time (sessions are "finalized" when they are garbage collected; sessions
are only allowed to be garbage collected every "n" seconds where n is
the knob's setting).  This would typically be meaningful only to people
who use the "Script to call at session deletion time" feature of
sessions; that script might be called much later than you expect if
you set this value high enough.  There's currently no guarantee that it
will be called exactly when a session expires even with the default
value, but setting the resolution higher will decrease that probability
even further.
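
To make the tradeoff concrete, here is a toy model of the resolution
knob.  It is not the code in Transience.py; it simply assumes that
housekeeping work happens at most once per "slice" of n seconds, and
timeslice and count_slices are hypothetical helpers.

```python
def timeslice(timestamp, period):
    """Round a timestamp (in seconds) down to the start of its slice."""
    whole = int(timestamp)
    return whole - whole % period


def count_slices(timestamps, period):
    """Count the distinct slices touched by a series of request times."""
    return len({timeslice(t, period) for t in timestamps})


# One request per second for ten minutes.
request_times = range(600)

slices_at_20 = count_slices(request_times, 20)    # 30 slices
slices_at_120 = count_slices(request_times, 120)  # 5 slices
```

Under that assumption the same ten minutes of traffic touches 30 slices
with a 20-second period but only 5 with a 120-second period, so there
are fewer per-slice writes to conflict on.  The cost is that anything
tied to a slice boundary, such as finalization, can lag by up to one
full period.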

Since that feature isn't used very much (at least I don't think it is),
I think this is an ok tradeoff for a lot of people.  I'd need to make
sure that raising it actually reduces conflicts before exposing it, but
assuming it did, it would be put into the config file for 2.7.1 as
something like "session-resolution-seconds" or somesuch.

For those of you with conflict problems now, this knob is the PERIOD
variable in Transience.py... you might be able to get away with setting
it higher now to see if that helps anything, especially those people
trying to use sessions+ZEO.  It's preset to 20, which means 20 seconds. 
Try 120 instead, or something around there.  Just don't set it to a
number higher than the timeout value for the transient object container
(TOC), or all sorts of insanity will happen.  That timeout is in
minutes, so remember to multiply it by 60 before comparing.  You'll need
to restart Zope and recreate your transient object container after
changing PERIOD: the change effectively invalidates all data previously
in the container, and using old data left over from running it under the
default PERIOD value might make it go a little insane too.
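
The units are easy to get wrong, so here is a small sanity check for the
rule above.  check_period is a hypothetical helper, not part of Zope,
and the 20-minute timeout in the example is only an assumed value; use
whatever your own container is configured with.

```python
def check_period(period_seconds, timeout_minutes):
    """Return period_seconds if it does not exceed the container timeout.

    PERIOD is expressed in seconds and the container timeout in minutes,
    so the timeout is converted to seconds before comparing.
    """
    timeout_seconds = timeout_minutes * 60
    if period_seconds > timeout_seconds:
        raise ValueError(
            "PERIOD of %d seconds exceeds the container timeout of %d seconds"
            % (period_seconds, timeout_seconds)
        )
    return period_seconds


# Assumed 20-minute timeout, i.e. 1200 seconds.
safe_period = check_period(120, timeout_minutes=20)
```

A PERIOD of 120 passes against a 20-minute timeout, while something like
1300 would be rejected because it is longer than the 1200-second
timeout.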

> As far as I'm concerned, the new Transience stuff is looking really good
> for the common case.  I haven't been able to reproduce any of the
> corruption issues that happened with the older implementation.

This is still true (it's been almost an hour since I posted it, I think
that's a new record!)

- c