[Zope] Is data.fs corrupted?

Chris McDonough chrism@digicool.com
Thu, 9 Nov 2000 10:26:32 -0500


> Could I just butt in here and try and clarify what I've understood from
> this thread so far?  I'm feeling a bit dumb about it but I'd really like
> to sort it out.
>
> 1) This problem is definitely a programming error

I believe that, yes.

> 2) It arises when a product has a component which isn't 'registered' as
> a zope object, so it's not cleaned up when the product is deleted
>
> Is that right?  What would 'registered as a zope object' mean in this
> situation?  Are there any other circumstances under which you end up
> with orphaned instances?  *how can it be fixed?*

Zope objects basically need to inherit from OFS.SimpleItem (or an analogue)
for them to be "web manageable".  If a non-builtin Python object inherits
only from the "Persistent" class, it can be saved into the ZODB, but it will
not show up in the web interface unless you define certain methods on it to
make it act like a "SimpleItem", and unless you add it to some internal Zope
data structures that populate its ZODB container (such as the
ObjectManager's "_objects" list).  This is generally done for you by Zope
and you need not think about it.
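
As a rough illustration of the registration idea, here is a toy model in
plain Python.  It is not Zope's actual ObjectManager code; the class name
ToyObjectManager and its methods are made up for the example, and only the
"_objects" name is borrowed from the description above.

```python
class ToyObjectManager:
    """Toy stand-in for a Zope container; NOT the real ObjectManager."""

    def __init__(self):
        # Registry of the children the management interface would list.
        self._objects = ()

    def add_registered(self, name, obj):
        # Store the child as an attribute AND record it in the registry.
        setattr(self, name, obj)
        self._objects = self._objects + ({"id": name},)

    def visible_ids(self):
        # Only registered children are listed.
        return [entry["id"] for entry in self._objects]


folder = ToyObjectManager()
folder.add_registered("index_html", object())

# A plain attribute assignment stores the object on the container
# but never touches the registry.
folder.helper = object()

print(folder.visible_ids())        # ['index_html']
print(hasattr(folder, "helper"))   # True
```

The "helper" attribute is stored with its container but is invisible to
anything that only consults the registry, which is the situation the
unregistered instances in this thread are in.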

But problems may happen when an application stores instances of
non-web-manageable, non-builtin classes in the ZODB as attributes of normal
Zope objects.  If these instances aren't cleaned up when the Python code
that created them is deleted, they're "orphaned".  Once they're orphaned,
some application code may attempt to load one of them (for whatever reason)
in order to make use of it, and because the ZODB can't find the class of
the instance, it complains by raising an exception.  Zope's supposed to
catch this and do the right thing, but evidently it sometimes doesn't.
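
The failure mode can be reproduced outside Zope with the standard pickle
module, since the ZODB stores object state as pickles that name a class by
module and class name.  This is only a sketch under that assumption: the
module name "orphan_demo_product" and the Helper class are hypothetical
stand-ins for a Product's code, and real ZODB loading goes through more
machinery than this.

```python
import pickle
import sys
import types

# Hypothetical module standing in for a Product's code.
MODULE_NAME = "orphan_demo_product"
module = types.ModuleType(MODULE_NAME)
exec("class Helper:\n    pass\n", module.__dict__)
sys.modules[MODULE_NAME] = module

# Save an instance; the pickle records the module and class name,
# not the class definition itself.
saved = pickle.dumps(module.Helper())

# "Delete the product": the class can no longer be imported.
del sys.modules[MODULE_NAME]

try:
    pickle.loads(saved)
    outcome = "loaded"
except ImportError as exc:
    # Unpickling has to import the class before it can rebuild the
    # instance, so a missing module surfaces as an import error.
    outcome = "failed: " + type(exc).__name__

print(outcome)
```

The saved bytes are perfectly intact here; what is missing is the code
needed to turn them back into an object.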

You can "fix it" by restoring whatever class the instance depends on and
subsequently using the debugger to delete the instance from the ZODB.  But
I've had cases where it was just easier to truncate the FileStorage I was
working on than to figure out exactly what was going on.  This is bad.

> What I really don't understand is why seemingly unrelated products fail
> as a result of the orphan instances.

Neither do I completely.  This may be bad exception handling in Zope itself.
When a Product is initialized, sometimes you'll see the failure of one
Product initialization break others.  Usually, it's the case that these
Products *are* related in some way.  It'd be interesting to investigate some
of these failures where the failure to load an object seems to break other
"unrelated" objects.  I'd imagine we'd find an unobvious relation between
the two objects.

> Is it because new products are
> trying to get an oid which happens to be used by the orphan?

No... the ZODB manages the distribution of oids, Products never need worry
about them.

> If so, I don't think it's really so wrong to call this ZODB corruption,
> although I take your point about the ZODB being stable.  In your perl
> analogy, it's as if you are using two tables to create a parent/child,
> one-to-many relationship.  You then deleted a parent row but forgot to
> delete the corresponding child rows.  Your data is now corrupted.
> True to your analogy, this arises in sloppy programming,
> not in Oracle or whatever.  However, when you create the tables in
> Oracle, you can place constraints on the tables to enforce the integrity
> of your data, to protect against this kind of programming.  In a similar
> way, I think it's too easy to end up with corrupted data in the ZODB.
> Or perhaps I'm just particularly unlucky in the products I choose to
> use?

I think it's safe to say that you probably need to pay more attention when
writing ZODB-based apps because you don't have things like constraints to
prevent you from shooting yourself in the feet with both barrels.  :-)  That
said, it'd probably be more accurate to call the result of this family of
errors "Zope corruption" than "ZODB corruption".  Again, I don't disagree
that this distinction is really superpedantic on some level.  It would
appear that we need to work on Zope exception handling as well as tools to
detect and delete orphaned instances.  But we probably do not need to work
on "making ZODB more stable" (don't kill the messenger! :-)
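
To give a flavor of what a detection tool might do, here is a minimal
sketch that inspects a pickle for the classes it names without loading it,
then reports which of their modules can't be imported.  It is not an
existing Zope utility: the function names and the "orphan_scan_demo" module
are invented for the example, it works on a single standalone pickle rather
than on a FileStorage, and it only understands GLOBAL opcodes (which is why
the demo record is written with pickle protocol 2).

```python
import importlib
import pickle
import pickletools
import sys
import types


def referenced_modules(data):
    """Return module names named by GLOBAL opcodes, without unpickling."""
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module classname".
            module_name, _class_name = arg.split(" ", 1)
            found.add(module_name)
    return found


def missing_modules(data):
    """Return the referenced modules that cannot be imported."""
    missing = []
    for name in sorted(referenced_modules(data)):
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Build a record whose class then disappears (hypothetical names).
MODULE_NAME = "orphan_scan_demo"
module = types.ModuleType(MODULE_NAME)
exec("class Helper:\n    pass\n", module.__dict__)
sys.modules[MODULE_NAME] = module
record = pickle.dumps(module.Helper(), protocol=2)  # protocol 2 emits GLOBAL
del sys.modules[MODULE_NAME]

print(missing_modules(record))  # ['orphan_scan_demo']
```

A real tool would additionally have to walk every record in the storage
and work out which container still refers to each one before anything
could safely be deleted.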

> cheers,
>
> seb
>
> >
> > I'd argue that the ZODB is a very independent component of Zope, and it
> > shouldn't be blamed for this.  Devil's advocate question:  if your Perl
> > application failed because it couldn't find a record in an Oracle
> > database, would you immediately chalk this up to database corruption
> > and Oracle?
> >
> > We do clearly need to work on Zope tools to make it easier to find and
> > clear "orphaned" instances in the ZODB.  We should also try to weed out
> > the programming errors which cause interdependencies of seemingly
> > unrelated components of the same Zope instance that cause failures like
> > this.  Transactions are only tangentially related to this issue (I'm
> > not sure how the "Added globals" transaction 'referred to' the
> > "Installed product DemoPortal" transaction in your example, BTW).
> >
> > BTW, I'm being sort of pedantic because when people hear "ZODB" and
> > "corruption" in the same sentence, they tend to get scared and think of
> > ZODB as "unstable" which is really not the case... most purported "ZODB
> > corruption" issues are caused by programming errors in Products.  This
> > has been the case for all but one (the >2G pointer bug) of the ones
> > that I've personally seen.
> >
> > ----- Original Message -----
> > From: "Bill Welch" <bill@carbonecho.com>
> > To: "Chris McDonough" <chrism@digicool.com>
> > Cc: <zope@zope.org>
> > Sent: Wednesday, November 08, 2000 2:59 PM
> > Subject: Re: [Zope] Is data.fs corrupted?
> >
> >
> > > In my case, I couldn't import DemoPortal.zexp or Wizard.zexp from PTK
> > > because oid 1377 was already in use. The pickle dump that followed
> > > contained references to ZDiscussions, which I had deleted some time
> > > before.
> > >
> > > After deleting the offending products and their directories,
> > > restarting zope, and packing, I ran tranalyzer -r on my problem
> > > Data.fs. I found that the problem oid was in this transaction:
> > >
> > > "/Control_Panel/Products/manage_importObject
> > >
> > > import into /var/lib/zope/var/Data.fs from
> > > /var/lib/zope/import/ZDiscussions.zexp"
> > >
> > > referred to by this transaction:
> > >
> > > "Installed product DemoPortal"
> > >
> > > in turn, referred to by this transaction:
> > >
> > > "Added Globals"
> > >
> > > I think ZODB corruption when I see a record in one product refer to
> > > a record in an independent product and when the transaction of a
> > > deleted product doesn't go away.
> > >
> > > On Wed, 8 Nov 2000, Chris McDonough wrote:
> > >
> > > > > I think it is ZODB corruption.
> > > >
> > > > This is very unlikely.  What makes you think this?
> > >
> > >
> > >
> >
> >
> > _______________________________________________
> > Zope maillist  -  Zope@zope.org
> > http://lists.zope.org/mailman/listinfo/zope
> > **   No cross posts or HTML encoding!  **
> > (Related lists -
> >  http://lists.zope.org/mailman/listinfo/zope-announce
> >  http://lists.zope.org/mailman/listinfo/zope-dev )
> >