[Zope-dev] Session Errors

John Eikenberry jae-zdev@kavi.com
Fri, 14 Mar 2003 17:28:40 -0800


Sorry for the length of this one... but I'm trying to braindump to give you
as much info about the problem as possible. 

To be sure it doesn't get lost in my ramblings below, there is probably an
important piece of information I haven't mentioned yet: these errors seem
to coincide with the session data timeout setting [1]. I don't get the
errors at all until the timeout has been reached or has passed.

[1] The timeout setting I'm referring to is denoted by the label: "Data
object timeout value in minutes" on the /temp_folder/session_data object.


Chris McDonough wrote:

> OK, thanks John.  Let's try one more thing... currently the mounted
> database used to store the session data uses a connection that ignores
> read conflicts.  This is known to be bad because the machinery which
> deals with keeping the sessioning index data will also ignore read
> conflicts, which may create inconsistencies between two data structures
> (BTrees) that need to be kept in sync.

I tried this and it seemed to help some. I haven't yet seen the get() error
we've been discussing, but the load() error just occurred (line 94 in
TemporaryStorage - this was error #1 in my original email). The traceback
is a bit different from the one in my original email, though, as the
LowConflictConnection isn't being used. Here's the new traceback:

Error Type: KeyError
Error Value: [non-ascii chars]

Traceback (innermost last):

    * Module ZPublisher.Publish, line 98, in publish
    * Module ZPublisher.mapply, line 88, in mapply
    * Module ZPublisher.Publish, line 39, in call_object
    * Module Products.DotOrg.Pages.KPage, line 110, in testSession
    * Module Products.DotOrg.Utils.Spawn, line 42, in launchProcess
    * Module Products.DotOrg.Utils.Spawn, line 73, in storeArgs
    * Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
    * Module Products.Transience.Transience, line 175, in new_or_existing
    * Module Products.Transience.Transience, line 797, in get
    * Module Products.Transience.Transience, line 546, in _getCurrentBucket
    * Module ZODB.Connection, line 509, in setstate
    * Module Products.TemporaryFolder.TemporaryStorage, line 94, in load


> Here's a patch to lib/python/Products/TemporaryFolder/TemporaryFolder.py
> that reenables read conflict generation on the database.
> 
> Index: TemporaryFolder.py
> ===================================================================
> RCS file:
> /cvs-repository/Zope/lib/python/Products/TemporaryFolder/TemporaryFolder.py,v
> retrieving revision 1.7
> diff -r1.7 TemporaryFolder.py
> 72c72
> <         db.klass = LowConflictConnection
> ---
> >         #db.klass = LowConflictConnection
> 
> You may see many more conflicts with this running.  But maybe the data
> structures will not become desynchronized.

You weren't kidding about the increase in conflict errors.
 
> Another problem, still unexplained, experienced by Andrew Athan, is that
> if a reference is made to a session data object from within the standard
> error message, somehow things get screwy under high load.  If you're
> doing the same, please let me know.

Before this started happening, a hasSessionData check was being called
during standard error publishing, but we removed it early this week once
the errors appeared.

---

It might help you to better understand what might be causing the problem if
you know where we're using sessions and how we can force this problem to
occur. Not sure if this will be of much help, but thought it couldn't
hurt.

We use sessions primarily as a sort of authenticated-user marker. The
session just stores their username and a state field, which get used in
non-authenticated sections of our site to detect that the user has logged
into the site (we can then raise an Unauthorized error to get the basic
auth info for that user). These calls happen on our basic Content class
(subclassed from DTMLMethod) in its __call__() method. We use sessions in a
couple of other places for small things, but this one sees the most use.
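
Something along these lines, purely as an illustrative sketch and not our
actual code: Unauthorized below is a local stand-in for Zope's exception,
the session is a plain mapping, and the function name, the 'username' and
'state' keys and the 'logged_in' value are all hypothetical.

```python
class Unauthorized(Exception):
    """Local stand-in for Zope's Unauthorized exception (assumption)."""


def require_basic_auth_if_logged_in(session, authenticated_user):
    """Raise Unauthorized when the session marks the visitor as logged in
    but the current request carries no basic auth credentials.

    `session` is any mapping, or None if no session data exists.
    `authenticated_user` is None for an anonymous request.
    The 'username' and 'state' keys are hypothetical names.
    """
    if session is None or authenticated_user is not None:
        return  # no marker to check, or the request is already authenticated
    if session.get("username") and session.get("state") == "logged_in":
        # Raising here prompts the browser to resend its basic auth info.
        raise Unauthorized(session["username"])
```

The point is only that every anonymous page view touches the session data,
which is why this code path sees the most session traffic.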

I've figured out how to force these errors to happen to some extent. I've
written a method that starts up a thread, which uses Client.call to call
another method, which then basically just loops endlessly calling
hasSessionData and getSessionData, incrementing a number in the session
data and sleeping for N seconds between loops. One of these
guys will run forever without a problem.

Once you start a second thread, ReadConflictErrors start getting raised.
Which thread gets the conflict and which one keeps working seems variable
(probably just a timing thing). If I start enough of these threads I can
cause the session errors to happen, but only once the session timeout is
reached.
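
For reference, the shape of that stress loop is roughly as follows. This is
a self-contained sketch, not the real test: FakeSessionDataManager is a
hypothetical in-memory stand-in that only mimics hasSessionData and
getSessionData, the loop is bounded rather than endless, and the threads
call the stand-in directly instead of going through Client.call.

```python
import threading
import time


class FakeSessionDataManager:
    """Hypothetical in-memory stand-in for a session data manager.

    It only mimics the two calls used by the loop; it has no timeout
    and never raises conflict errors.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._data = None

    def hasSessionData(self):
        with self._lock:
            return self._data is not None

    def getSessionData(self):
        # Create the session data on first access, then reuse it.
        with self._lock:
            if self._data is None:
                self._data = {}
            return self._data


def hammer(sdm, loops, delay, lock):
    """Check for session data, fetch it, bump a counter, then sleep."""
    for _ in range(loops):
        sdm.hasSessionData()
        session = sdm.getSessionData()
        with lock:  # keep the read-modify-write atomic in this stand-in
            session["count"] = session.get("count", 0) + 1
        time.sleep(delay)


def run_stress(n_threads=3, loops=5, delay=0.01):
    """Start n_threads workers against one shared session; return the count."""
    sdm = FakeSessionDataManager()
    lock = threading.Lock()
    threads = [
        threading.Thread(target=hammer, args=(sdm, loops, delay, lock))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sdm.getSessionData()["count"]
```

The stand-in will not reproduce the ReadConflictErrors or the KeyError; it
only shows the access pattern (several workers repeatedly reading and
writing the same session data object) that triggers them against the real
session machinery.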

Note that to help speed up getting the errors I either set the session
timeout to 1 minute via a _setTimeout() call or even manually tweak the
appropriate session data manager's attributes (_timeout_secs, _period and
_timeout_slices) to very small values (i.e. a few seconds).


-- 

John Eikenberry [jae@kavi.com]
______________________________________________________________
"A society that will trade a little liberty for a little order
 will deserve neither and lose both."
                                          --B. Franklin