[ZODB-Dev] Re: BTrees strangeness (was [Zope-dev] Zope 2.X BIG Session problems - blocker - our site dies - need help of experience Zope developer, please)

Thu Mar 4 16:49:39 EST 2004

On Thu, 2004-03-04 at 09:50, Casey Duncan wrote:
> It's voodoo only in the sense that it prevents you from relying on
> conflicts to bail you out. This is cleanup code, correct? If somehow,
> somebody else managed to concurrently delete a key and commit without
> provoking a read conflict in either loop, then it's still safe.

I totally agree with your reasoning for this particular application, and
as I said, I'm very tempted, but in general, I think the solution is
still voodoo.  It would be great if it worked, but I'd just like to
understand why it's necessary so I don't make the same "mistake" again.

If this symptom is caused by two threads simultaneously deleting the
same key, I think it's important to understand how thread one's deletion
manages to influence the database state implied by the second thread's
connection.  I'd expect one of two things to happen in the second
thread: either a ReadConflictError is raised (non-MVCC behavior), or the
application just goes merrily along its way without raising a KeyError
or a ReadConflictError at all (MVCC behavior).

From the dawn of ZODB, at least in Zope code, we've relied on the fact
that we can write code meant to operate against persistent objects
concurrently via multiple threads with the assumption that the database
state implied by the connection used by one thread guarantees complete
isolation from concurrent access via connections used by other threads
(whether that isolation is implemented by read conflicts or by MVCC). 
FWIW, at one point, ZODB didn't even have ReadConflictErrors; they were
a bolt-on quite late in the game (2001?) that explicitly attempted to
allow us to continue making the isolation assumption after it was proven
that app data consistency could be affected when a connection managed to
read "dirty" data out of the database.

If there is a special case for simultaneous deletions of a BTree key which
results in bleed-through of database state between connections (causing
at least one thread to do a "dirty read", which would indeed explain the
failure case), that's fine, and I will be happy to work around it, I
just want to understand it and get it documented (at least in my own
head) before slapping an exception case in there.
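
To make the two possibilities concrete, here is a toy sketch of the
difference between an isolated connection and one that suffers
bleed-through.  Nothing in it is ZODB API: ToyDatabase,
IsolatedConnection, LeakyConnection and run() are names I made up for
illustration, and the "commit" is just a dict copy.

```python
class ToyDatabase:
    """Committed state shared by all connections (a stand-in, not ZODB)."""
    def __init__(self, data):
        self.data = dict(data)


class IsolatedConnection:
    """Works on a private snapshot taken when the connection is opened."""
    def __init__(self, db):
        self.db = db
        self.view = dict(db.data)  # later commits by others are invisible

    def has(self, key):
        return key in self.view

    def delete(self, key):
        del self.view[key]

    def commit(self):
        self.db.data = dict(self.view)


class LeakyConnection:
    """Reads and writes straight through to committed state: no isolation."""
    def __init__(self, db):
        self.db = db

    def has(self, key):
        return key in self.db.data

    def delete(self, key):
        del self.db.data[key]

    def commit(self):
        pass  # nothing to do, changes were never private


def run(conn_class):
    """Interleave two 'threads' that both clean up the same key."""
    db = ToyDatabase({"session-1": "data"})
    one, two = conn_class(db), conn_class(db)
    seen_by_two = two.has("session-1")  # thread two sees the key...
    one.delete("session-1")             # ...thread one deletes it
    one.commit()                        # ...and commits
    try:
        if seen_by_two:
            two.delete("session-1")     # thread two now deletes it too
    except KeyError:
        return "KeyError"
    return "ok"
```

With these definitions, run(IsolatedConnection) returns "ok" and
run(LeakyConnection) returns "KeyError".  The toy does not model
commit-time conflict detection at all; it only shows that a KeyError in
the second thread requires the second thread to see the first thread's
deletion.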

But personally, I'm hoping that the symptom is somehow the fault of the
TemporaryStorage and that I can remain blissfully unaware of this
special case when writing future code.

> The other day I was thinking about other approaches to this gc problem
> and a possible solution based on recent changes to the ZODB occurred to
> me:
> 
> You are basically reimplementing a kind of pack operation here, except
> it is an application pack not utilizing the underlying ZODB pack
> mechanism. But what if you could use that instead? An add() method was
> recently added to the ZODB connection interface which allows you to add
> unreferenced objects to the database. What if transient objects were
> just unreferenced persistent objects stored in the database? Their "key"
> would simply be their oid. So a session key could be the oid of the
> transient session object. Whenever a session was accessed it would be
> marked changed in the database and its mtime would be updated.

That is a *really* cool idea.
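
As I read it, the scheme boils down to something like the toy below.
This is only a sketch under my own assumptions: ToySessionStore, its
methods and the injected clock are made-up names, a plain dict stands in
for the storage, and no real ZODB connection is involved.

```python
import itertools


class ToySessionStore:
    """Sessions live directly in the 'storage', keyed by their oid."""

    def __init__(self, clock):
        self._clock = clock                  # callable returning the current time
        self._next_oid = itertools.count(1)  # stand-in for oid allocation
        self._objects = {}                   # oid -> [mtime, session data]

    def new_session(self):
        oid = next(self._next_oid)
        self._objects[oid] = [self._clock(), {}]
        return oid  # the oid doubles as the session key

    def get(self, oid):
        record = self._objects[oid]
        record[0] = self._clock()  # every access marks the session changed
        return record[1]

    def mtime(self, oid):
        return self._objects[oid][0]


# Usage with a fake clock so the behavior is deterministic.
now = [100]
store = ToySessionStore(lambda: now[0])
key = store.new_session()      # created at time 100
now[0] = 250
store.get(key)["user"] = "chris"
# store.mtime(key) is now 250, because the access refreshed it
```

Note that there is no separate index of sessions anywhere; the only way
to find one is to already hold its key.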

> Periodically the transient storage would be packed to the desired
> timeout value, cleaning out any transient objects that had not been
> accessed recently enough to keep around. This pack could be done
> "in-band" because AFAIK the storage can prevent concurrent packing. It
> could also be done out-of-band. 

Right now, most (all?) ZODB pack implementations use mark and sweep,
starting at the root object, recursively unpickling each object
reachable from the root and finding out which other objects are
referenced by that object.  All objects that aren't reachable are
expunged.

That goes something like this (bad pseudocode):

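def pack(self):
   stack = [ROOT_OB_OID]
   reachable = set([ROOT_OB_OID])   # the root itself must survive

   # mark: walk the reference graph starting at the root
   while stack:
      oid = stack.pop()
      pickle = self._pickles[oid]
      for ref in FIND_REFERENCES_FROM(pickle):
         if ref not in reachable:   # skip seen oids so cycles terminate
            reachable.add(ref)
            stack.append(ref)

   # sweep: copy the keys first, since we delete while iterating
   for oid in list(self._pickles.keys()):
      if oid not in reachable:
         del self._pickles[oid]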

This isn't what we'd want to do in the sessioning case because it
doesn't take into account mod time, just plain referenceability from
other objects.  Instead, we'd want to do something like:

def gc(self, cutoff_time, object_type=TransientObject):
   # copy the keys first, since we delete while iterating
   for oid in list(self._pickles.keys()):
      pickle = self._pickles[oid]
      if not IS_A_PICKLE_OF_THIS_KIND_OF_OBJECT(object_type, pickle):
         continue
      modtime = GET_MOD_TIME_OF(pickle)
      if modtime < cutoff_time:
         del self._pickles[oid]

This leaves all of the subobjects of the transient object in the
storage; they'd need to be removed by a normal pack.
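
Here is a runnable toy version of that gc, mostly to show the leftover
subobjects.  It is a sketch only: ToyStorage and its record layout
(kind, mod time, referenced oids) are assumptions of mine, standing in
for real pickles and for whatever a real storage would do to find an
object's type and modification time.

```python
class ToyStorage:
    """In-memory stand-in for a storage; not a real ZODB storage."""

    def __init__(self):
        # oid -> (kind, mod_time, referenced_oids)
        self._records = {}

    def store(self, oid, kind, mod_time, refs=()):
        self._records[oid] = (kind, mod_time, tuple(refs))

    def gc(self, cutoff_time, kind="TransientObject"):
        """Delete records of `kind` last modified before cutoff_time."""
        removed = []
        # Copy the keys first, since we delete while iterating.
        for oid in list(self._records):
            rec_kind, mod_time, _refs = self._records[oid]
            if rec_kind == kind and mod_time < cutoff_time:
                del self._records[oid]
                removed.append(oid)
        return removed

    def oids(self):
        return sorted(self._records)


storage = ToyStorage()
storage.store("s1", "TransientObject", mod_time=100, refs=["s1-data"])
storage.store("s1-data", "PersistentMapping", mod_time=100)
storage.store("s2", "TransientObject", mod_time=500, refs=["s2-data"])
storage.store("s2-data", "PersistentMapping", mod_time=500)

removed = storage.gc(cutoff_time=300)
# removed is ["s1"]; "s1-data" is still in the storage as an orphan
```

The stale session "s1" goes away, but its subobject "s1-data" stays
behind because it is neither a TransientObject nor examined via its
parent.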

> What I'm unsure about is whether pack would keep recent revisions to
> unreferenced objects,

It won't.  A pack will now destroy anything unreferenced.

> I'm thinking it wouldn't. Perhaps the transient
> storage could implement pack slightly differently so that it kept even
> unreferenced objects that were modified recently. In fact this might be
> useful as an optional feature for all storages, I dunno.

Right.  I wonder if we could sneak that "gc" code ("remove all objects
starting at this root that have a lesser bobobase_mod_time than x and
that are of object type y") into stock storage code.  I suspect it
wouldn't be very popular.  Maybe a more generalized callback mechanism
could be created that "plugged in" to the packing API.
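
A callback of that sort might look roughly like the sketch below.  To be
clear, this is a hypothetical API: no storage I know of takes a
keep_unreachable argument, and pack(), modified_since() and the
{oid: (mod_time, refs)} record layout are all invented for the example.

```python
def pack(records, root_oid, keep_unreachable=None):
    """Mark and sweep over {oid: (mod_time, refs)}; returns deleted oids.

    keep_unreachable(oid, mod_time) is a hypothetical hook: returning
    True spares an object that the mark phase could not reach.
    """
    # mark: collect everything reachable from the root
    reachable = {root_oid}
    stack = [root_oid]
    while stack:
        _mod_time, refs = records[stack.pop()]
        for ref in refs:
            if ref not in reachable:
                reachable.add(ref)
                stack.append(ref)

    # sweep: ask the callback before destroying an unreachable object
    deleted = []
    for oid in list(records):
        if oid in reachable:
            continue
        mod_time, _refs = records[oid]
        if keep_unreachable is not None and keep_unreachable(oid, mod_time):
            continue
        del records[oid]
        deleted.append(oid)
    return sorted(deleted)


def modified_since(cutoff_time):
    """Build a callback that spares recently modified objects."""
    return lambda oid, mod_time: mod_time >= cutoff_time


records = {
    "root": (0, ["app"]),
    "app": (10, []),
    "old-session": (100, []),
    "new-session": (500, []),
}
deleted = pack(records, "root", keep_unreachable=modified_since(300))
# deleted is ["old-session"]; "new-session" survives the pack
```

One gap in this sketch: it does not mark the subobjects of a spared
object, so a fuller version would have to treat each spared object as an
extra root during the mark phase.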

Other issues:  When would the gc code be invoked?  Is it safe to invoke
the gc code from app code?  This is always a bitch.  How do we prevent a
*real* pack from hosing our sessions?

> This would eliminate having to keep track of transient objects in a
> separate persistent data structure, which would also eliminate that
> conflict hot-spot.

Yup.

> Persistent weak refs were also recently added to ZODB. These could be
> used to refer to transient objects from non-transient ones without
> affecting their lifetime.

Don't think we'd need that if we just relied on modtime and object type.

- C