[Zope-dev] RE: [ZODB-Dev] ZODB bug fix plans

Tim Peters tim@zope.com
Fri, 18 Apr 2003 14:16:22 -0400


[Jeremy Hylton]
> I've gone through the CVS logs to look for bug fixes that ought to be
> applied to the ZODB 3.1 and Zope 2.6 release branches.  There have been
> a fairly large number of bug fixes since The 3.1.1/2.6.1 release.
>
> The most important change is probably the atomic invalidation patches.
> This should address problems we've seen in the field with corrupted
> BTrees.  The problem is caused by invalidations from one transaction
> being partially applied in another database connection.
>
> I'm including a big long list of checkins that I plan to backport.  I
> may discard a few backports as I review the patches more carefully.  If
> there's anything that sounds dodgy to you, please let me know.

I backported the BTree and Bucket fixes, and will say "DONE" after the
relevant ones below:

> ----------------------------
> 2003/04/16 16:01:10 tim_one BTrees/BucketTemplate.c 1.53
> bucket_set():  Backported Guido's fix to squash nuisance warnings from
> gcc.

DONE

> ----------------------------
> 2003/04/11 19:38:18 tim_one BTrees/BucketTemplate.c 1.52
> Finished backporting bugfixes for _bucket_set().
> ----------------------------

DONE

> 2003/04/11 19:17:07 tim_one BTrees/BucketTemplate.c 1.51
> _bucket_set():  This could leave a mapping bucket in a variety of insane
> states when the value passed in was of the wrong type (for example,
> doing
>     b[obj] = 3.7
> when b is an OIBTree).  This manifested as a refcount leak in the test
> suite, but could have been much worse (most likely in real life is that
> a seemingly arbitrary existing key would "go missing").

DONE

> ----------------------------
> 2003/04/11 16:09:58 tim_one BTrees/BTreeTemplate.c 1.74
> _BTree_set():  We leaked a reference to the first key of the second
> bucket when deleting the first child of a BTree node with more than one
> child.  This caused >600 int objects to leak in the OI and OO flavors
> of testRemoveSucceeds.

DONE

> ----------------------------
> 2003/04/10 23:36:55 tim_one BTrees/BTreeTemplate.c 1.73
> BTree_deleteNextBucket():  This failed to decref the temp result from
> BTree_lastBucket().  In unusual cases, this could cause a chain of
> buckets
> to leak (the DegenerateBTree tests appeared to be the only ones that
> provoked this, and there it leaked 285 IISet buckets).  Other uses of
> BTree_lastBucket() appear to be refcount-correct.

DONE

> ----------------------------
> 2003/04/09 22:31:51 jeremy ZEO/ClientStorage.py 1.91
> Don't continue with notifyConnected() if the storage is closed.
>
> There's a race between closing the storage and shutting down the
> ConnectThread.  If we return here, the ConnectThread does no harm.
> ----------------------------
> 2003/04/09 21:26:52 jeremy ZEO/tests/ConnectionTests.py 1.21
> 2003/04/09 21:26:52 jeremy ZEO/ClientStorage.py 1.90
> Fix ZEO to work correctly with atomic invalidations.
> ----------------------------
> 2003/04/08 19:25:52 bwarsaw BDBStorage/BDBFullStorage.py 1.70
> _doabort(): Watch out for a missing pickle revision, which may be the
> case if we're aborting e.g. a transactional undo.
> ----------------------------
> 2003/04/08 18:48:22 jeremy ZODB/lock_file.py 1.9
> 2003/04/08 18:48:22 jeremy ZODB/ZApplication.py 1.13
> 2003/04/08 18:48:22 jeremy ZODB/TmpStore.py 1.10
> Cleanup __del__.
>
> You never need an __del__ to close a file.  A file closes itself when
> it is deallocated.
>
> Don't give an object a magic __del__ attribute.  It won't work with
> new-style classes, and it's obscure anyway.
> ----------------------------
> 2003/04/08 15:55:45 jeremy ZODB/tests/testZODB.py 1.10
> 2003/04/08 15:55:44 jeremy ZODB/cPickleCache.c 1.81
> 2003/04/08 15:55:44 jeremy ZODB/DB.py 1.48
> 2003/04/08 15:55:44 jeremy ZODB/Connection.py 1.88
> Backport atomic invalidations code from Zope3.
>
> The DB's invalidate() method takes a set of oids corresponding to all
> the changes from a data manager for one transaction.  All the objects
> are invalidated at once.
>
> Add a few tests in testZODB of the new code.  The tests just cover
> corner cases, because I can't think of a sensible way to test the
> atomicity.  When it has failed in the past, it's been caused by
> nearly-impossible to reproduce data races.
>
> This fix needs to be backported to Zope 2.6, but only after assessing
> how significant an impact the API change will have.
> ----------------------------
> 2003/04/07 22:26:04 jeremy ZODB/tests/testCache.py 1.13
> Sync tests with current cache impl.
> ----------------------------
> 2003/04/07 21:43:14 jeremy BTrees/BucketTemplate.c 1.50
> Add explanatory comment.
> ----------------------------
> 2003/04/04 21:41:54 jeremy BTrees/_fsBTree.c 1.6
> Shouldn't user strcmp() to compare binary strings.
>
> XXX Not sure where this macro is used or what kind of failures this
> bug caused.
> ----------------------------
> 2003/04/02 16:50:49 jeremy ZODB/cPickleCache.c 1.80
> 2003/04/02 16:50:49 jeremy ZODB/cPersistence.c 1.68
> Restore cache behavior for unreferenced by recently used objects.
>
> The implementation is to make the reference in the cache "ring" a new
> reference rather than a borrowed reference.  It is the intent of the
> cache to keep N recently used objects in memory, regardless of whether
> they are currently referenced.  The goal is to avoid the cost of
> recreating the objects, based on the assumption that they are likely
> to be used again soon.
> ----------------------------
> 2003/04/01 19:23:25 jeremy ZODB/tests/testCache.py 1.12
> Minimal fixes to make these tests work again.
>
> The cache behaves differently now; it doesn't keep objects alive
> artificially.  We should write some new tests that verify behavior
> with objects that are kept alive by external references.
> ----------------------------
> 2003/04/01 18:49:43 jeremy ZODB/cPickleCache.c 1.79
> I didn't mean to check in a call to fprintf().
> ----------------------------
> 2003/04/01 18:44:25 jeremy ZODB/cPickleCache.c 1.78
> Fix one more (last?) refcount problem with the cache.
> ----------------------------
> 2003/04/01 18:06:04 jeremy ZODB/tests/testCache.py 1.11
> Temporarily disable some tests that depended on cache implementation
> details.
> ----------------------------
> 2003/04/01 17:33:23 jeremy ZODB/cPersistence.c 1.67
> Remove Py_INCREF and Py_DECREF for objects in doubly linked list.
>
> The cache is intended to keep weak references to objects.  An object
> should be deallocated when the only references to it are in the
> cache.  If an object is unghostified and added to the list of live
> objects, the list should have a borrowed reference.  Otherwise, the
> cache will keep the object alive after all other references have
> disappeared.
> ----------------------------
> 2003/04/01 16:17:50 jeremy ZODB/cPickleCache.c 1.77
> Remove object_from_oid() which appeared to be mis-used about half the
> time.  Instead use std PyDict_GetItem() and incref only when
> necessary.
> ----------------------------
> 2003/04/01 16:08:14 jeremy ZODB/cPersistence.c 1.66
> Don't decref cache in Per_dealloc().
>
> The cache decrefed it in percachedel.
> ----------------------------
> 2003/04/01 16:02:18 bwarsaw BDBStorage/BDBFullStorage.py 1.69
> _doabort(): Instead of try/except, only zap the currentVersions entry
> when vid <> ZERO.  Also, rename a local variable.
>
> _dotxnundo(): Fix insertion of information into the currentVersions
> table.  Key should be vid and value should be revid.
>
> _collect_objs(): It's possible the vid,revid pair has already been
> deleted so catch DBNotFoundErrors.
>
> These seem to fix the last of the BDBFull test failures.
> ----------------------------
> 2003/04/01 16:01:54 jeremy ZODB/cPickleCache.c 1.76
> Only subtract one from _Py_RefTotal.
>
> PyDict_DelItem() will undo the other INCREF.
> ----------------------------
> 2003/04/01 15:51:01 jeremy ZODB/cPickleCache.c 1.75
> Reindent PyMethodDef.
> ----------------------------
> 2003/04/01 15:32:28 jeremy ZODB/cPickleCache.c 1.74
> Two small refcount improvements when deallocating a persistent object.
>
> Decref its reference to the cache.
> In a debug build, account for the two bogus additions to _Py_RefTotal.
>
> Also, reindent initcPickleCache().
> ----------------------------
> 2003/03/17 18:52:50 jeremy ZODB/FileStorage.py 1.128
> Fix one NameError and add grumbly comment about try/except that masked
> it.
> ----------------------------
> 2003/03/17 18:24:57 jeremy ZODB/DemoStorage.py 1.17
> Fix undoLog() to process arguments correctly.
> Fix edge case in pack() -- backpointer to object created in version.


> ----------------------------
> 2003/03/16 21:42:29 tim_one BTrees/BTreeItemsTemplate.c 1.19
> nextBTreeItems():  This was copying the value (from the next (key,
> value)
> pair) from the bucket into the set-iteration struct twice.  I don't
> believe this had any visible effect, it was simply pointless and wasted
> a little time (discovered by eyeball).

DONE

> ----------------------------
> 2003/03/07 00:11:10 jeremy ZODB/Transaction.py 1.48
> Make the code follow the comment.
>
> If an error occurs during tpc_finish, save the original exception just
> in case something goes wrong while cleaning up the message.
> ----------------------------
> 2003/03/04 21:00:23 jeremy ZODB/Connection.py 1.87
> Add isReadOnly() method to Connection.
> ----------------------------
> 2003/03/04 19:56:48 jeremy ZEO/ClientStorage.py 1.89
> 2003/03/04 19:56:46 jeremy ZEO/tests/ConnectionTests.py 1.20
> Fix isReadOnly() to return True for a read-only fallback connection.
> ----------------------------
> 2003/03/04 19:56:28 jeremy ZEO/tests/forker.py 1.34
> Increase the length of time we wait for a server process to start.
>
> Increase by iterating for longer, rather than making the delay
> longer.  This helps the tests go fast on faster machines.
> ----------------------------
> 2003/03/04 18:33:52 jeremy Tools/fsrefs.py 1.6
> Be more robust about unpacking errors.
> Print repr of description instead of str.
> ----------------------------
> 2003/03/03 18:02:39 gvanrossum ZEO/zrpc/client.py 1.25
> Fix a bug in _create_wrappers() according to a suggestion by Barry
> Pederson.  The 'wrap' local variable was reused in a way that
> overwrote the value intended to set the return dictionary.
> ----------------------------
> 2003/02/28 23:41:25 bwarsaw ZEO/tests/zeoserver.py 1.13
> Use 'localhost' instead of '' for the host part for Windows happiness.


> ----------------------------
> 2003/02/28 20:37:59 tim_one ZODB/winlock.c 1.9
> Backporting from Zope3/ZODB4:  exposed the Win32 UnlockFile function,
> which is used by Barry's new LockFile class.  The effect of leaving
> regions of a file locked when closing the file, or when exiting the
> process, is undefined on Windows, and the new scheme restricts itself
> to operations with defined semantics.

I'd rather not bother with this one, or the rest.  While the effects are
formally undefined, the problem this was trying to address was (in
hindsight) almost certainly caused instead by new (at the time) Berkeley
tests failing to close files.  It's possible that Barry's nice LockFile
reworks fixed other problems (on Windows) too, though -- Barry?


> ----------------------------
> 2003/02/28 19:51:21 bwarsaw BDBStorage/BerkeleyBase.py 1.41
> 2003/02/28 19:50:59 bwarsaw ZODB/FileStorage.py 1.127
> Backported from ZODB4, use the lock_file.LockFile class to
> encapsulation the storage lock file.
> ----------------------------
> 2003/02/28 19:50:06 bwarsaw ZODB/lock_file.py 1.8
> Backport lock file changes from ZODB4.  Specifically, implement an
> unlock_file() call which awaits Tim's backport of
> winlock.UnlockFile().  This is no-op on *nix.
>
> Also, implement a better API for dealing with lock files.  Use the
> LockFile class for clean acquisition and release of locks, and
> unlinking of lock files.
> ----------------------------
> 2003/02/28 19:48:18 bwarsaw BDBStorage/tests/test_storage_api.py 1.31
> FullRecoveryTest.tearDown(): Be sure to close the destination storage
> cleanly.  Backported from ZODB4.