[ZODB-Dev] Re: BTrees strangeness (was [Zope-dev] Zope 2.X BIG Session problems - blocker - our site dies - need help of experience Zope developer, please)

Chris McDonough chrism at plope.com
Wed Mar 3 21:54:03 EST 2004


On Wed, 2004-03-03 at 21:06, Tim Peters wrote:

> > Given that the code you're talking about should raise a
> > ReadConflictError in _gc (I am not using a ZODB with MVCC)
> 
> Out of context, it wasn't evident that the code was used in conjunction with
> ZODB at all

It is.  "self" is always a first-class persistent object called out of a
database at the time when it is being accessed by multiple simultaneous
threads.  There are no globals being mutated by this code, and
definitely none that are BTrees.  I wish there were, then I could just
fix it by adding mutexes and get on with life.  I still may be able to
do that, but I wouldn't understand why it was necessary. ;-)

>  -- BTree objects can be used fine without any persistence in
> play, in which case sharing a BTree across threads without locking can be
> deadly.  For example, this program *appears* to do something very similar,
> and should die quickly with a KeyError in thread 1 or thread 2 (it will vary
> across runs):

Thanks for the demo, very instructive, but I think not applicable here.

> > when two threads try to simultaneously access that data structure
> > where one attempts to obtain the results of "keys()" and the
> > other is attempting to delete a key, why would I need to protect that
> > access with a mutex?
> 
> In the program above, the threads access a single BTree object in memory, so
> any change made by one thread is instantly visible to all others.  I don't
> understand the intended context your code runs in well enough to guess
> whether that's what's happening to you.  I can say threads *appear* to be
> accessing your OOBTree self._data via a shared-across-threads self object,
> but there's really no way for me to guess whether that's true.  If somehow
> the distinct threads all obtain their own working copy of "self" via loading
> it from a database, then they would get their own distinct (in memory)
> working copy of self._data too.  The program above shows that this doesn't
> *necessarily* happen by magic, though, even if the BTree is stored in a
> database -- it depends on the code *using* the code we're staring at.

In Alex's case (the failing case), the ultimate consumer of this code is
the Zope web publishing code, which does indeed pull the root persistent
object and thus all of its subobjects (the "self" in my code being one
of them) out of a database on a per-thread basis as opposed to operating
on a shared global root.

> Printing
> 
>     "%s %s" % (thread.get_ident(), id(self._data))
> 
> across threads would answer that question for sure: if distinct threads
> print the same number for id(self._data), then a single BTree object is
> getting shared across threads, and the code can't work as intended without
> locking.

Here is a snippet of log output from code that exercises this from
multiple threads, printing what you suggested at the top of _gc:

thread.get_ident(): 81926 ; id(self._data): 1086726400
thread.get_ident(): 65541 ; id(self._data): 1086703784
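
For anyone who wants to reproduce the check outside Zope, here is a minimal
sketch of the same diagnostic. It makes several assumptions: it uses Python 3's
threading.get_ident() in place of thread.get_ident(), a plain dict stands in
for the OOBTree, and the Container class and the collect_ids and is_shared
helpers are hypothetical names invented for illustration. copy.deepcopy only
imitates "each thread loads its own working copy"; it is not how ZODB
connections actually load objects.

```python
import copy
import threading


class Container:
    """Hypothetical stand-in for the persistent "self"; a dict replaces the OOBTree."""

    def __init__(self):
        self._data = {}


def collect_ids(get_object, n_threads=2):
    """Have each thread record (thread ident, id of the _data object it sees)."""
    records = []
    keep_alive = []  # hold references so ids cannot be reused while we compare
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        obj = get_object()
        with lock:
            keep_alive.append(obj)
            records.append((threading.get_ident(), id(obj._data)))
        barrier.wait()  # keep every thread alive at once so idents stay distinct

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return records


def is_shared(records):
    """True if at least two threads reported the same id for _data."""
    return len({data_id for _, data_id in records}) < len(records)


shared = Container()

# Case 1: every thread is handed the very same in-memory object.
shared_records = collect_ids(lambda: shared)

# Case 2: every thread gets its own working copy (imitated with deepcopy).
private_records = collect_ids(lambda: copy.deepcopy(shared))

for label, records in (("shared", shared_records), ("private", private_records)):
    for ident, data_id in records:
        print("%s: thread ident: %s ; id(_data): %s" % (label, ident, data_id))
    print("%s: same object seen by several threads? %s" % (label, is_shared(records)))
```

In the shared case every thread reports the same id, which is the situation
that would need locking. In the per-thread-copy case the ids differ, which is
the pattern in the two log lines above.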

> > There is no code in Zope AFAIK that employs a mutex for simultaneous
> > access to persistent data; if this is true,
> 
> Ya, but Zope scares me too <0.9 wink>.

It's pretty scary.  That said, it's a living.  You of all people should
know that by now!  ;-)

> > I need to admit to not knowing the rules about knowing when a mutex is
> > required, and it will throw my world into a brief tailspin. ;-)
> 
> You understand Zope better than I do, so you're much closer to the solution:
> just explain all the relevant hidden details of Zope, and I'll give you a
> hundred ways that will work <wink>.

Right.  Well, given the above output, is that enough to convince you
that I probably shouldn't need a mutex here?

- C





