[Zope3-dev] Weird error within __cmp__ method of KeyReferenceToPersistent

Dominik Huber dominik.huber at perse.ch
Wed Aug 2 06:03:17 EDT 2006


Hi Gary

Thanks for your inputs. I was also offline this days...

Gary Poster wrote:
[...]
> Well, first of all, I suspect your situation, based on what I've seen 
> so far, looks something like this:
>
> - code creates obj P1
> - code puts P1 in intids, with connection.
> - code does something incorrect (either your code or code that you are 
> relying on), probably involving a savepoint because of your discussion 
> below, that keeps keyreference to P1 in intids BTree, but rolls back 
> addition of P1 to connection.  Right now we have the bad data (a key 
> reference to an object without a connection in the BTree), but we 
> don't know about it yet.
> - Now code creates obj P2.
> - code tries to put P2 in intids.  If the BTree logic happens to make 
> it compare against the bad keyreference to P1, you get the __cmp__ 
> error.  This is a red herring for debugging: the problem has already 
> occurred (see above).
Yup.

Few more details:
There is a import handler which creates p1 , p2 to pn and sets those 
object to a certain folder. After each addition of a certain p an 
initializer subscriber listen to IObjectAddedEvent is invoked too. The 
error occurs sometimes during this initialization. In case of an error 
the creation and initialization of the error-prone p is rolled back by 
an Importer (compare the following code):

class Importer(object):
    def __call__(self, importhandelr)
          [...]
           for d in import_reader:
                if not d[first].startswith('#'):
                    savepoint = transaction.savepoint()
                    try:
                        data = dict([(key, value for key, value in 
d.items()])
                        importhandler(data) # <- call importhandler 
which creates the p
                        transaction.commit()

                    except Exception, e:
                        savepoint.rollback() # clean up
                        self.log.error(e)
>
> I'm pretty sure your BTree is already bad by the time you get to the 
> failing __cmp__.
Yup
> The only way I know to fix it at this late stage in the story is 
> drastic: you'll need to create a new BTree, iterate over the old one 
> and add all the contents except for the broken keyreferences, and 
> replace the old intid BTree with the new one.
>
> I don't think that's appropriate error catching for the BTree code.  I 
> think if you get the error you describe, you have a serious problem in 
> "your" code (i.e., code that you've written or code that you're 
> relying on in your stack, like Zope).  How--and when--to handle it is 
> application policy, not something that should be in the base intids, I 
> think.
Yes, but the whole problem seems not so heavy:

1. If I restart the server no wrong intid entries are there.
2. If I patch the __cmp__ method of KeyReferenceToPersistent the 
proposed way it only runs right after a wrong p creation into the 
except. After another successful transaction commit I don't have any bad 
intids entries

IMO that points that somehow the persistent data are ok. That we only 
might have an 'old' volatile reference to a roled back intid somewhere, 
that causes the error ... My problem is that I cannot the error 
situation by a test case where intids are invoked. Probably it's a 
problem of the order of invoked subscribers listen to the object added 
event.

>> The error occurs after a savepoint.rollback. I can reproduce the 
>> error within the basic error story (-> see testcase). It happens if 
>> we compare a commited persistent object with a rollbacked persistent 
>> object.
>>
>> The current implementation hurts the demeter's law. A fresh instance 
>> does not support the db attribute, even though the situation showed 
>> within the basic error story
>>
>>    >>> from persistent import Persistent
>>    >>> p = Persistent()
>>    >>> p._p_jar.db().database_name
>>    Traceback (most recent call last):
>>    ...
>>    AttributeError: 'NoneType' object has no attribute 'db'
>
> I remember knowing Demeter's law. ;-)  I should look it up, but I've 
> already spent more time than I should have this morning.  The above 
> behavior is expected.  The connection.add(obj) code in the persistent 
> keyreference code gives the obj a _p_jar.
>
>> IMO the error should still be handled by KeyReferenceToPersistent 
>> __cmp__ method.
>
> This is too late.  Anything at this stage in the game is going to be a 
> dangerous cover-up of an already serious problem.
Demeter says only long.dotted.name.calls() on references of an object 
are leading to pretty instable code.
We have this problem in the current implementation as shown in the test 
case. If we use rollback's we can produce an object state of a 
persistent that does not allow to lookup p._p_jar.db().database_name. 
That means the current implementation relies on an implicit interface 
assumption too.

   >>> transaction.commit()
   >>> savepoint = transaction.savepoint()
   >>> p2 = P()
   >>> p2.__parent__ = top
   >>> keyref2 = KeyReferenceToPersistent(p2)
   >>> cmp(keyref1, keyref2)
   -1
   >>> savepoint.rollback()
   >>> cmp(keyref1, keyref2)
   Traceback (most recent call last):
   ...
   AttributeError: 'NoneType' object has no attribute 'db'

IMO it is a pretty self-punishment-strategy to use a __cmp__ method to 
recognize corrupt intids. Your aspect of corrupt intids is *very* 
important, but removing those after an error-prone transaction would be 
much comfortable with a working __cmp__ function. So I still propose the 
following fix...

class KeyReferenceToPersistent(object):
   [...]
   def __cmp__(self, other):
       if self.key_type_id == other.key_type_id:
           try:
               return cmp(
                   (self.object._p_jar.db().database_name,  
self.object._p_oid),
                   (other.object._p_jar.db().database_name, 
other.object._p_oid),
                   )
           except AttributeError:
               # a transaction might be rolled back
               self_db = getattr(self.object._p_jar, 'db', '')
               if self_db:
                   self_db = self_db().database_name
               other_db = getattr(other.object._p_jar, 'db', '')
               if other_db:
                   other_db = other_db().database_name

               return cmp(
                   (self_db,  self.object._p_oid),
                   (other_db, other.object._p_oid),
                   )

Regards,
Dominik


More information about the Zope3-dev mailing list