[ZODB-Dev] Re: BTrees strangeness (was [Zope-dev] Zope 2.X BIG Session problems - blocker - our site dies - need help of experience Zope developer, please)

Casey Duncan casey at zope.com
Thu Mar 4 09:50:17 EST 2004


On Wed, 03 Mar 2004 22:36:47 -0500
Chris McDonough <chrism at plope.com> wrote:

> On Wed, 2004-03-03 at 22:20, Casey Duncan wrote:
> > > >         for key in list(self._data.keys(None, max_ts)):
> > > >             assert(key <= max_ts)
> > > >             STRICT and _assert(self._data.has_key(key))
> > > >             for v in self._data[key].values():
> > > >                 to_notify.append(v)
> > > >             del self._data[key]
> > 
> > Maybe you could use items() and two loops instead;
> > 
> > to_rm = []
> > for key, val in self._data.items(None, max_ts):
> >     for v in val.values():
> >         to_notify.append(v)
> >     to_rm.append(key)
> > for key in to_rm:
> >     try:
> >         del self._data[key]
> >     except KeyError:
> >         pass # Somebody else deleted it first
> > 
> > I don't think that could raise a KeyError...
> 
> Well, the real bit of magic there is the "try.. except KeyError: pass"
> stanza.  Believe me, I'm tempted to stick that in, but this is the
> kind of voodoo that got me in to a lot of trouble in the older version
> of this code (there was reams upon reams of voodoo in the old code),
> so I'd really rather just figure out why the code is failing in the
> first place.  I'd just rather not mask the problem until I understand
> the cause.  That may never happen, of course, but a man can dream.

It's voodoo only in the sense that it prevents you from relying on
conflicts to bail you out. This is cleanup code, correct? If somehow,
somebody else managed to concurrently delete a key and commit without
provoking a read conflict in either loop, then it's still safe.
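
To make that concrete, here is a minimal sketch of the two-loop cleanup. A
plain dict stands in for the BTree (an assumption, so the range lookup
self._data.items(None, max_ts) becomes a sorted scan), and the function
names are made up for illustration. The "del data[10]" line simulates
somebody else removing a key between the two loops.

```python
def collect_expired(data, max_ts, to_notify):
    """First loop: gather values to notify and the keys to remove."""
    to_rm = []
    for key in sorted(data):
        if key > max_ts:
            break
        to_notify.extend(data[key].values())
        to_rm.append(key)
    return to_rm


def remove_keys(data, keys):
    """Second loop: delete each key, tolerating keys that are already gone."""
    removed = 0
    for key in keys:
        try:
            del data[key]
            removed += 1
        except KeyError:
            pass  # Somebody else deleted it first
    return removed


# Hypothetical data: timeslice -> {session id: session value}
data = {10: {"a": "s1"}, 20: {"b": "s2"}, 30: {"c": "s3"}}
notified = []
to_rm = collect_expired(data, 20, notified)
del data[10]  # simulate a concurrent deleter that slipped in
removed = remove_keys(data, to_rm)
```

The second loop removes only what is still there, so the end state is the
same whether or not anybody else got to a key first.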

The other day I was thinking about other approaches to this gc problem
and a possible solution based on recent changes to the ZODB occurred to
me:

You are basically reimplementing a kind of pack operation here, except
it is an application pack not utilizing the underlying ZODB pack
mechanism. But what if you could use that instead? An add() method was
recently added to the ZODB connection interface which allows you to add
unreferenced objects to the database. What if transient objects were
just unreferenced persistent objects stored in the database? Their "key"
would simply be their oid. So a session key could be the oid of the
transient session object. Whenever a session was accessed it would be
marked changed in the database and its mtime would be updated.

Periodically the transient storage would be packed to the desired
timeout value, cleaning out any transient objects that had not been
accessed recently enough to keep around. This pack could be done
"in-band" because AFAIK the storage can prevent concurrent packing. It
could also be done out-of-band. 

What I'm unsure about is whether pack would keep recent revisions of
unreferenced objects; I'm thinking it wouldn't. Perhaps the transient
storage could implement pack slightly differently so that it kept even
unreferenced objects that were modified recently. In fact this might be
useful as an optional feature for all storages, I dunno.
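
A toy model of the idea, purely in memory, might look like the sketch
below. Nothing here is ZODB API: ToyTransientStore, access() and the
explicit "now" arguments are all hypothetical, and pack() implements the
modified behaviour suggested above (keep unreferenced objects that were
modified recently, drop the rest).

```python
import itertools


class ToyTransientStore:
    """In-memory stand-in for a storage holding unreferenced objects."""

    def __init__(self):
        self._next_oid = itertools.count(1)
        self._records = {}  # oid -> (mtime, state)

    def add(self, state, now):
        """Store an object that nothing refers to; its oid is its key."""
        oid = next(self._next_oid)
        self._records[oid] = (now, state)
        return oid

    def access(self, oid, now):
        """Look up a session and refresh its mtime ("mark it changed")."""
        _, state = self._records[oid]  # KeyError if already packed away
        self._records[oid] = (now, state)
        return state

    def pack(self, now, timeout):
        """Drop every object not modified within the last `timeout`."""
        cutoff = now - timeout
        stale = [oid for oid, (mtime, _) in self._records.items()
                 if mtime < cutoff]
        for oid in stale:
            del self._records[oid]
        return stale


store = ToyTransientStore()
a = store.add({"user": "alice"}, now=0)
b = store.add({"user": "bob"}, now=0)
store.access(b, now=50)                   # bob's session is touched later
packed = store.pack(now=100, timeout=60)  # cutoff is 40: only alice expires
```

Note that there is no separate index of sessions anywhere in the sketch;
the timestamps on the records themselves drive the cleanup.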

This would eliminate having to keep track of transient objects in a
separate persistent data structure, which would also eliminate that
conflict hot-spot.

Persistent weak refs were also recently added to ZODB. These could be
used to refer to transient objects from non-transient ones without
affecting their lifetime.
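
As an analogy only, the sketch below uses the standard library's in-memory
weakref module, not ZODB's persistent weak references, to show the
behaviour I have in mind. The Session and Profile classes are made up for
illustration: the long-lived object can find the session while it exists,
but does not keep it alive.

```python
import gc
import weakref


class Session:
    """Stands in for a transient session object."""

    def __init__(self, user):
        self.user = user


class Profile:
    """Stands in for a non-transient object that points at a session."""

    def __init__(self, session):
        # A weak reference does not extend the session's lifetime.
        self.session_ref = weakref.ref(session)

    def current_session(self):
        # Returns the session, or None once it has been collected.
        return self.session_ref()


session = Session("alice")
profile = Profile(session)
alive = profile.current_session() is session

del session   # drop the only strong reference
gc.collect()  # make collection explicit for non-refcounting interpreters
gone = profile.current_session() is None
```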

Thoughts?

-Casey


