[Zope-ZEO] ZODB database corruption under multiple connections

John D. Heintz jheintz@isogen.com
Thu, 18 Jan 2001 16:44:53 -0600


Jim,

Why wouldn't the transaction do any necessary sync()ing for us when it 
detects a conflict?  It would seem that the transaction object would be 
in a good place to catch the ConflictError, call sync(), and re-raise 
the exception.
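
The idea could be sketched roughly as follows. This is only an illustration under stated assumptions: ConflictError, FakeConnection, SyncingTransaction and the do_commit callable are stand-ins invented for the sketch, not the real ZODB classes or the way the transaction machinery is actually built.

```python
# Stand-ins for illustration only; none of these are the real ZODB classes.
class ConflictError(Exception):
    """Stand-in for ZODB.POSException.ConflictError."""


class FakeConnection:
    """Hypothetical connection that only counts how often sync() ran."""

    def __init__(self):
        self.synced = 0

    def sync(self):
        self.synced += 1


class SyncingTransaction:
    """Hypothetical transaction that syncs its connections on a conflict."""

    def __init__(self, connections, do_commit):
        self.connections = connections
        self.do_commit = do_commit  # assumed callable that performs the commit

    def commit(self):
        try:
            self.do_commit()
        except ConflictError:
            # Bring every participating connection up to date ...
            for conn in self.connections:
                conn.sync()
            # ... and re-raise, so the caller still sees the conflict.
            raise
```

Note that a wrapper like this only sees conflicts raised from inside commit(); a conflict raised anywhere else in application code would bypass it.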

John

Jim Fulton wrote:

> Lennard van der Feltz wrote:
> 
>> The script shown below illustrates an issue where it appears that
>> transactions don't rollback completely when a ConflictError occurs in one of
>> two connections.
> 
> 
> This is a correct analysis.  I've checked fixes for this into the Zope
> public CVS:
> 
>   http://www.zope.org/Resources/CVS
> 
> The file Connection.py was affected.
> 
> You can get these changes there.  The changes will be included in Zope 2.3
> (although they didn't get into Zope 2.3 beta 1).  I'm sure they'll make it
> into Andrew's distribution before long.
> 
> When objects are added to the ZODB, meta-data are set on them.
> This meta-data was not cleared when a transaction was rolled back.
> This made it look to subsequent transactions like the objects had
> been added and were up to date.
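
A toy model of the effect described above, as a sketch only. ToyObject, ToyConnection, the oid attribute and the clear_metadata flag are all invented for illustration and do not reflect the actual code in Connection.py.

```python
class ToyObject:
    """Hypothetical persistent object; oid stands in for the meta-data."""

    def __init__(self):
        self.oid = None  # None means "never added to the database"


class ToyConnection:
    """Hypothetical connection with just enough state to show the bug."""

    def __init__(self):
        self.next_oid = 0
        self.pending = []  # objects registered in the current transaction
        self.stored = {}   # objects that actually reached the storage

    def add(self, obj):
        # Objects that already carry meta-data look saved and are skipped.
        if obj.oid is None:
            obj.oid = self.next_oid
            self.next_oid += 1
            self.pending.append(obj)

    def commit(self):
        for obj in self.pending:
            self.stored[obj.oid] = obj
        self.pending = []

    def abort(self, clear_metadata):
        # The fix corresponds to clear_metadata=True.
        if clear_metadata:
            for obj in self.pending:
                obj.oid = None
        self.pending = []


def survives_retry(clear_metadata):
    """Add an object, roll back, retry, and report whether it was stored."""
    conn = ToyConnection()
    obj = ToyObject()
    conn.add(obj)
    conn.abort(clear_metadata)
    conn.add(obj)  # the retried transaction
    conn.commit()
    return obj.oid in conn.stored
```

In this model, survives_retry(False) leaves the object out of the storage even though it looks saved, which is the "dangling pointer" situation; survives_retry(True) stores it on the retry.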
> 
> Note that this error should not affect Zope applications, as 
> any new objects created in a web request are generally recreated
> when a transaction is retried.
> 
> 
>> As a result the database is corrupted,
> 
> 
> FWIW, I don't consider this corruption. The database is
> "inconsistent", with "dangling pointers" for some objects.
> The database storage integrity is not affected. While I'll
> admit that this interpretation of the term "corrupt" may be
> arbitrary, it's a useful distinction for ZODB.  For example, 
> database corruption is typically addressed by some sort of 
> recovery procedure, but there is no recovery procedure that would
> fix this problem.
> 
> (snip)
> 
>> You'll see in the script that I found a crude hack that will hide the
>> problem,
> 
> 
> It actually addressed the problem directly by resetting some meta-data 
> indicating (slightly indirectly) that the object is still in need of
> saving.
> 
> (snip)
> 
> 
>> I would appreciate any insights into this.
> 
> 
> Hopefully, the explanation above helps.
> 
> 
>> Is it a bug that should be
>> submitted as such,
> 
> 
> I would say "yes" if I hadn't already checked in a fix. :)
> 
> 
>> or is there another explanation? Where should I look for
>> the root cause? Let me know if you are interested in seeing the other
>> scripts.
> 
> 
> Thanks for this report. The script you provided was very helpful in 
> chasing this down.
> 
> I do have some comments on the script below:
>  
> 
>> Thanks,
> 
> 
> Thank you.
>  
> 
> 
>> ##--------------------------------------------------------------------------
>> ##
>> #                   Import standard library modules
>> import sys, os, threading, time
>> sys.path.append(os.curdir) # append current directory to the PYTHONPATH
>> ##--------------------------------------------------------------------------
>> ##
>> #                   Import non-standard library modules
>> import ZODB
>> from ZODB import FileStorage, PersistentList
> 
> 
> PersistentList would have caused me pain if you had used it. 
> 
> I wish Andrew would not include this in his distribution in the ZODB
> package if I don't include it in mine.
> 
> I would be reluctantly willing to include something like
> PersistentList in ZODB if I had some evidence that people actually
> needed it. Persistent sequences seem to be like persistent mappings
> and people sort of expect them to be there for symmetry, but I've
> never found a use for them and never seen anyone use them, or at least
> keep using them after initial experiments.
> 
> 
> 
>> import Persistence, BTree, PersistentMapping
>> ##--------------------------------------------------------------------------
>> ##
>> #                           Program code
>> 
>> class PItem(Persistence.Persistent):
>>     pass
>> class PDict(PersistentMapping.PersistentMapping):
>>     pass
>> 
>> def getQdb(fn):
>>     fs = FileStorage.FileStorage(fn)
>>     db = ZODB.DB(fs,pool_size=7, cache_size=400)
>>     return db
>> 
>> def pack(db):
>>     try:
>>         db.pack(time.time())
>>     except:
>>         pass
>> 
>> class TestFixture:
>> 
>>     def setUp(self):
>>         self.fn = 'TData.fs'
>>         self.rrange = 100
>>         self.db = getQdb(self.fn)
>>         self.conn = self.db.open()
>> 
>>     def tearDown(self):
>>         get_transaction().commit()
>>         self.conn.close()
>>         self.conn = None
>>         pack(self.db)
>>         self.db = None
>> 
>>     def SimpleZODBLoad(self):
>>         # create container object and add to root
>>         # The problem occurs with BTree but not with a PersistentMapping
>>         q = BTree.BTree()
>>         r = self.conn.root()
>>         r['Q'] = q
>>         # write loop
>>         for i in range(self.rrange):
>>             key = 'A'+str(i)
>>             item = PItem()
>>             item.data = "this is data"
>>             q[key]=item
>>             print "\r",i,
>>         # print database size
>>         qs = len(q)
>>         print "\nSimpleZODBLoad Object Count:", qs
>> 
>>     def W_R_D_TwoConnections(self):
>>         r = self.conn.root()
>>         q = r['Q']
>>         gconn = self.db.open() # open second connection
>>         # write, read, and delete loop
>>         for i in range(self.rrange):
>>             print "\r",i,
>>             wrkey = 'B'+str(i)
>>             # make Persistent data item
>>             item = PItem()
>>             item.data = "this is data"
>>             # commit and redo transaction loop
>>             while 1:
>>                 try:
>>                     q[wrkey]=item
>>                     get_transaction().commit()
>>                 except ZODB.POSException.ConflictError:
>>                     # uncommenting the following line will compensate for
>>                     # but what other effects does it have?
> 
> 
> You should add:
> 
>                       self.conn.sync()
> 
> here. Why? Because you can get a conflict error before the transaction commit.
> You could get a ConflictError in the assignment above. If this happens, 
> the database connection will not be automatically synchronized, because, 
> conceivably, the application might want to take some other action. Without
> syncing the connection, you could get an infinite loop.  I did when I
> ran your script (see below ;).
> 
> When you get a conflict error, you should either close and reopen your
> database connection, which will synchronize, or you should explicitly
> synchronize.
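
The retry-with-sync pattern described above might look like this in isolation. It is a sketch with stand-ins: ConflictError and ToyConnection are invented here and are not the ZODB classes, and the max_attempts bound is an added assumption, not something from the original script.

```python
class ConflictError(Exception):
    """Stand-in for ZODB.POSException.ConflictError."""


class ToyConnection:
    """Hypothetical connection whose view stays stale until sync()."""

    def __init__(self):
        self.stale = True

    def sync(self):
        self.stale = False

    def write(self, store, key, value):
        # A stale connection conflicts on every write, before any commit.
        if self.stale:
            raise ConflictError(key)
        store[key] = value


def write_with_retry(conn, store, key, value, max_attempts=5):
    """Retry a write, syncing the connection after each conflict.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            conn.write(store, key, value)
        except ConflictError:
            conn.sync()  # without this, every retry would conflict again
            continue
        else:
            return attempt
    raise RuntimeError("gave up after %d attempts" % max_attempts)
```

Bounding the loop is a design choice in this sketch: if the sync() call were missing, the bound would turn the endless retry into a visible error.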
> 
> (Note that in the version of the software you have, the assignment wouldn't
> raise a conflict error, because conflicts were only checked during commits.
> Earlier, I checked for conflicts when reading state from the database. I
> incorrectly removed these checks because they could lead to conflict errors
> on read transactions, which seems silly.  Unfortunately, the check was necessary
> to avoid reading inconsistent data, and the read checks have recently been added back.)
> 
> 
>>                     continue
>>                 else:
>>                     break
>>             # read and delete an object every fifth time through the loop
>>             # using the second connection
>>             if not i % 5:
>>                 gq = gconn.root()['Q']
>>                 while 1:
>>                     try:
>>                         rdkey = gq.keys()[0]
>>                         item = gq[rdkey]
>>                         del gq[rdkey]
>>                         get_transaction().commit()
>>                     except ZODB.POSException.ConflictError:
> 
> 
>                           gconn.sync()
> 
> ditto.
> 
> 
>>                         continue
>>                     else:
>>                         break
>>                 # make sure I can access the retrieved item
>>                 d = item.data
>>         # close second connection
>>         gconn.close()
>>         # print database size
>>         qs = len(q)
>>         print "\nW_R_D_TwoConnections Object Count:", qs
>> 
>>     def ReadAndDelete(self):
>>         r = self.conn.root()
>>         q = r['Q']
>>         # print database size
>>         qs = len(q)
>>         print "\nReadAndDelete Starting Object Count:", qs
>>         # the following is rather convoluted but necessary to be able to
>>         # iterate over all the elements in a BTree
>>         items = list(q.items())
>>         while items:
>>             item = items.pop(0)
>>             del q[item[0]]
>>             get_transaction().commit()
>>             # make sure I can access the retrieved item
>>             d = item[1].data
>>             print "\r",len(q),
>>         # print database size
>>         qs = len(q)
>>         print "\nReadAndDelete Ending Object Count:", qs
>> 
>> ##--------------------------------------------------------------------------
>> ##
>> #                       main program entry point
>> def main():
>>     tf = TestFixture()
>>     # do a simple batch load of objects
>>     tf.setUp()
>>     tf.SimpleZODBLoad()
>>     tf.tearDown()
>>     # write and read (and delete) objects over two connections
>>     tf.setUp()
>>     tf.W_R_D_TwoConnections()
>>     tf.tearDown()
>>     # read and delete all the objects from the database
>>     tf.setUp()
>>     tf.ReadAndDelete()
>>     tf.tearDown()
>> 
>> if __name__ == "__main__":
>>     main()
> 
> 
> 
> Jim
> 
> --
> Jim Fulton           mailto:jim@digicool.com   Python Powered!        
> Technical Director   (888) 344-4332            http://www.python.org  
> Digital Creations    http://www.digicool.com   http://www.zope.org
> 



-- 
. . . . . . . . . . . . . . . . . . . . . . . .

John D. Heintz | Senior Engineer

1016 La Posada Dr. | Suite 240 | Austin TX 78752
T 512.633.1198 | jheintz@isogen.com

w w w . d a t a c h a n n e l . c o m