[ZODB-Dev] Weird errors with Zope 2.7.7

Tim Peters tim.one at comcast.net
Mon Oct 17 18:27:19 EDT 2005


[Chris Withers]
>>> We've been seeing these errors for a while now, but had assumed it was
>>> due to an old Zope version. We've been running 2.7.7 for a week or two
>>> now and we're still seeing these intermittently:

...

>>> AttributeError: ClientCache instance has no attribute '_get'

[Tim Peters]
>> At any particular time(s)?  For example, do these (either or both) tend
>> to occur when a ZEO client is starting up?  Shutting down?  In the
>> middle of their lives?  No correlation?

[Chris]
> Both at random times while the servers are running...

Then-- sorry --I don't have a clue.  Like I said, I don't see any code
capable of removing a ClientCache's `_get` attribute after the cache has
been opened, and I don't see a way for a ZEO client cache to work at all
without its _get attribute (_get is used on every load and every
invalidate).  If this only occurred at client startup time, then it might
get pinned on a startup race.  But if it occurs at random times while the
servers are running, I have no theory.

...

>> A ClientCache doesn't have a `_get` attribute when it's first
>> constructed; for whatever reason, it doesn't get one until
>> ClientCache.open() is called. I don't see any code capable of removing
>> the `_get` attribute from a ClientCache after its open() is called, so
>> looks like a safe bet its open() has never been called.

> Now that _is_ weird, especially given the other error below...

I doubt it's related to your

    Shouldn't load state for 0x660bab when the connection is closed

problems except by coincidence.

>> It looks like open() is "supposed to be" called sometime during initial
>> cache verification.  That code is complicated.

> Hmmm, Tim says the code is complicated, I guess I should be prepared to
> wipe my brains off the wall should I try and read it?

Up to you, but you should know I gave up <wink>.  If you do, you'll
eventually figure out that there's a wide-open path in ClientStorage.py that
never calls:

    self._cache.open()

I got hopeful at that point, but turns out it's a red herring.  If you look
at ClientCache.py's ClientCache.verify(), you'll see that it unconditionally
calls self.open() (which is the same thing).  So there's no way to call
ClientCache.verify() without open()'ing the client cache as a side effect.

>> It would help to have more of the log context, especially the INFO-level
>> log messages produced by this ZEO client during startup:  this could be
>> a ZEO startup race, but there are so many possible paths thru that code
>> that more clues are needed.

> *goes hunting* (will include at end of mail)

The log shows that your client takes the "full cache verification" path.  We
know this because it says, e.g.,

    2005-10-11T17:57:58 INFO(0) ZCS:13033 Verifying cache

and "Verifying cache" is produced from only one place in ZODB.  That's this
block in ClientStorage.verify_cache:

        log2(INFO, "Verifying cache")
        # setup tempfile to hold zeoVerify results
        self._tfile = tempfile.TemporaryFile(suffix=".inv")
        self._pickler = cPickle.Pickler(self._tfile, 1)
        self._pickler.fast = 1 # Don't use the memo

        self._cache.verify(server.zeoVerify)
        self._pending_server = server
        server.endZeoVerify()
        return "full verification"


Note that this calls self._cache.verify(), and that's exactly the verify()
method I talked about just above (the one that opens the cache immediately,
giving the cache its _get attribute).

I suppose it's _conceivable_ that there's some obscure race allowing a load
or invalidation attempt to sneak in here, before self._cache.verify() is
called.  But if so, that's necessarily part of ZEO client startup, you say
it's not correlated with startup, and your logs don't show anything unusual
between "Verifying cache" and "endVerify finishing".

>>> ERROR(200) ZODB Shouldn't load state for 0x6c33b4 when the connection
>>> is closed
>>
>>> ...which is a bit more distrubing.

>> That one comes out of Connection.setstate(), and appears to mean what
>> it says:  the app is trying to load an object from this Connection but
>> the Connection has been closed.

> Right, could this be related to the _get error?

I don't think so.

> (ie: if _get only gets created when the connection is opened, and it's
> not there, then maybe the connection (which connection?!) has never
> been opened and so barfs with the "shouldn't load state" error when
> something tries to load an object from it?)

setstate() doesn't _create_ an object, it just adds state to a ghost object
that already exists.  A ghost first comes into being when some other
persistent object is loaded, which contains the ghost object.  The newly
created ghost object is then associated with the Connection used to load the
containing persistent object.  It remains a ghost until something references
one of its attributes, and only then does setstate() get called on it.

IOW, for a ghost to exist and be associated with a Connection at all, it
must be the case that some other persistent object was previously loaded
successfully from that same Connection.  That in turn is impossible if the
Connection has never been opened.

As before, this "Shouldn't load state ..." error is almost certainly due to
a logic error in some product you're using, or in Zope.  Take this message
as meaning exactly what it says:  something is trying to work with a
persistent object after the Connection it came from was closed.  That's an
app problem, not a ZODB problem.  Have you tracked down what kind of object
(say) oid 0x6c33b4 is?  That may give a good clue.

...
> (pps: kinda interested what those conflict errors are, read or write? how
> do I find out?)

ZODB raises ReadConflictError specifically for a read conflict.  It raises
ConflictError for a write conflict.  Whatever code in Zope produces the log
messages apparently suppresses the distinction.  This is easy to do, since
ReadConflictError is a subclass of ConflictError; I _guess_ that code just
catches the base ConflictError to capture both.  Poking around in Zope might
help (maybe there's an easy way to change it to preserve the distinction).




More information about the ZODB-Dev mailing list