[ZODB-Dev] Re: Connection pool makes no sense

Tim Peters tim at zope.com
Thu Dec 29 11:28:59 EST 2005


...

[Florent Guillaume]
>> The self.all.remove(c) in _ConnectionPool attempts to destroy the
>> connection. If something else has a reference to it once it's closed,
>> then that's a bug, and it shouldn't. It should only keep a weak
>> reference to it at most.

[Pitcher at gw.tander.ru]
> But it's nonsense!

Please try to remain calm here.  It's not nonsense, but if you're screaming
too loudly you won't be able to hear :-)

> If weakref exists then some other object has ref to the obj!

Or there are no strong references to `obj`, but `obj` is part of cyclic
garbage so _continues to exist_ until a round of Python's cyclic garbage
collection runs.

> And weakValueDictionary is cleaned up automatically when the
> last strong ref disappears.

That's a necessary precondition, but isn't necessarily sufficient.  When the
last strong reference to a value in a WeakValueDictionary goes away, if that
value is part of cyclic garbage then the WeakValueDictionary does not change
until Python's cyclic gc runs.
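
A minimal sketch of that behavior, not taken from ZODB: `Plain`, `Cyclic`
and `registry` are made-up names, and the code is written for a current
CPython 3 rather than the Python 2 used elsewhere in this thread.  Automatic
gc is switched off for the duration so that only refcounting is at work
until gc.collect() is called explicitly.

```python
import gc
import weakref


class Plain:
    """Hypothetical value that is not part of any reference cycle."""


class Cyclic:
    """Hypothetical value that refers to itself, forming a cycle."""

    def __init__(self):
        self.me = self


gc.disable()  # keep automatic cyclic gc from running mid-experiment
try:
    registry = weakref.WeakValueDictionary()

    plain = Plain()
    registry["plain"] = plain
    del plain  # last strong reference: refcounting frees it right away
    plain_gone_at_once = "plain" not in registry

    cyc = Cyclic()
    registry["cyclic"] = cyc
    del cyc  # last *external* strong reference, but the cycle remains
    cyclic_still_listed = "cyclic" in registry

    gc.collect()  # cyclic gc reclaims the trash cycle
    cyclic_gone_after_gc = "cyclic" not in registry
finally:
    gc.enable()

print(plain_gone_at_once, cyclic_still_listed, cyclic_gone_after_gc)
```

The entry for the cyclic value stays visible in the dictionary until the
explicit collection, which is the case being described here.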

> Destroying obj with this logic is absurd:

I covered that before, so won't repeat it.  You misunderstood the intent of
this code.

...
>     del self.data[id(obj)]         <== there is no use to delete obj by
> deleting weakref... we just deleting weakref from the weakValueDictionary!

Yes, it's just deleting the weakref -- and that's all it's trying to do, and
there are good reasons to delete the weakref here (but they are not the
reasons you thought were at work here).

> Try this: 1. add this method to Connection class definition
>
> def __del__(self):
>     print 'Destruction...'
>
> then do this:

You're _really_ going to confuse yourself now ;-)  Because Connections are
always involved in reference cycles, adding a __del__ method to Connection
guarantees that Python's garbage collection will _never_ reclaim a
Connection (at least not until you explicitly break the reference cycles).

> >>> import sys
> >>> sys.path.append('/opt/Zope/lib/python')
> >>> from ZODB import Connection
> >>> c = Connection.Connection()
> >>> del(c)
> >>> c = Connection.Connection()
> >>> del(c._cache)

You're breaking a reference cycle "by hand" here, so that it becomes
_possible_ for gc to clean up the Connection.  But the only reason that was
necessary is because you added a __del__ method to begin with.

> >>> del(c)
> Destruction...
> >>>
>
> See? You can NOT delete object because _cache keeps reference to it...
> and connection remains forever!!!

That's because you added a __del__ method; it's not how Connection normally
works.  I'll give other code below illustrating this.
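
The session quoted above can be imitated without ZODB.  This is only a
sketch under stated assumptions: `Conn` and `Cache` are hypothetical
stand-ins for Connection and its _cache, the code targets CPython 3, and
automatic gc is disabled so that nothing but refcounting can reclaim
anything.

```python
import gc

events = []  # records finalizer calls instead of printing them


class Cache:
    """Hypothetical stand-in for a Connection's _cache."""

    def __init__(self, owner):
        self.owner = owner  # back-reference that closes the cycle


class Conn:
    """Hypothetical stand-in for Connection, with the added __del__."""

    def __init__(self):
        self._cache = Cache(self)

    def __del__(self):
        events.append("Destruction...")


gc.disable()  # only refcounting is active from here on
try:
    c = Conn()
    del c  # the Conn <-> Cache cycle keeps the object alive
    after_plain_del = list(events)

    c = Conn()
    del c._cache  # break the cycle "by hand"
    del c  # refcount drops to zero, so __del__ runs immediately
    after_broken_cycle = list(events)
finally:
    gc.enable()
# Note: Python 3.4+ would also finalize the first Conn on a later gc pass;
# the Python 2 interpreter discussed in this thread would not.

print(after_plain_del, after_broken_cycle)
```

The first `del c` triggers nothing, and the second one runs the finalizer
only because the cycle was broken first.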

> Its cache has RDB connection objects and they are not closed. Connection
> becomes inaccessible and unobtainable through the connection pool.

In your code above, `c` was never in a connection pool.  You're supposed to
get a Connection by calling DB.open(), not by instantiating Connection()
yourself (and I sure hope you're not instantiating Connection() directly in
your app!).

> That's what I wanted to say. It's definitely a BUG.

Sorry, there's no evidence of a ZODB bug here yet.

Consider this code instead.  It opens 10 Connections in the intended way
(via DB.open()), and creates a weakref with a callback to each so that we
can tell when they're reclaimed.  It then closes all the Connections, and
destroys all its strong references to them:

"""
import weakref
import gc

import ZODB
import ZODB.FileStorage

class Wrap:
    def __init__(self, i):
        self.i = i

    def __call__(self, *args):
        print "Connection #%d went away." % self.i

N = 10
st = ZODB.FileStorage.FileStorage('blah.fs')

db = ZODB.DB(st)
cns = [db.open() for i in xrange(N)]
wrs = [weakref.ref(cn, Wrap(i)) for i, cn in enumerate(cns)]
print "closing connections"
for cn in cns:
    cn.close()
print "del'ing cns"
del cns  # destroy all our hard references
print "invoking gc"
gc.collect()
print "done"
"""

This is the output:

    closing connections
    del'ing cns
    invoking gc
    Connection #0 went away.
    Connection #1 went away.
    Connection #2 went away.
    done

Note that "nothing happens" before Python's cyclic gc runs.  That's because
Connections are in reference cycles, and refcounting cannot reclaim objects
in trash cycles.  Because I used weakref callbacks instead of __del__
methods, cyclic gc _can_ reclaim Connections in trash cycles.

When the 10 Connections got closed, internally _ConnectionPool added them,
one at a time, to its .available queue.  When #7 was closed, the pool grew
to 8 objects, one more than the pool size of 7 in effect here, so it forgot
everything it knew about the first Connection (#0) in its queue.  "Nothing
happens" then, though, because nothing _can_ happen before cyclic gc runs.
When #8 was closed, #1 got removed from .available, and when #9 was closed,
#2 got removed from .available.

When gc.collect() runs, those 3 Connections (#0, #1, and #2) are all
reclaimed.  The other 7 Connections (#3-#9) are still alive, sitting in the
.available queue waiting to be reused.
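
The bookkeeping just described can be modeled in a few lines.  This is a toy
model and not ZODB's _ConnectionPool: `TinyPool`, `FakeConnection` and
`release` are hypothetical names, a pool size of 7 is assumed to match the
run above, and the code targets CPython 3 with automatic gc disabled until
the explicit gc.collect().

```python
import gc
import weakref
from collections import deque


class FakeConnection:
    """Hypothetical stand-in for a Connection; always in a cycle."""

    def __init__(self, number):
        self.number = number
        self.me = self  # self-reference, so refcounting alone can't free it


class TinyPool:
    """Hypothetical pool keeping at most pool_size available objects."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.available = deque()

    def release(self, conn):
        # Queue the closed connection, then forget the oldest ones
        # while the queue exceeds pool_size.
        self.available.append(conn)
        while len(self.available) > self.pool_size:
            self.available.popleft()


gone = []  # numbers of connections whose weakref callback has fired

gc.disable()
try:
    pool = TinyPool(pool_size=7)
    conns = [FakeConnection(i) for i in range(10)]
    refs = [weakref.ref(c, lambda ref, i=c.number: gone.append(i))
            for c in conns]
    for c in conns:
        pool.release(c)  # plays the role of cn.close()
    del conns, c  # destroy all our strong references
    before_gc = list(gone)
    gc.collect()
    after_gc = sorted(gone)
    still_pooled = [cn.number for cn in pool.available]
finally:
    gc.enable()

print(before_gc, after_gc, still_pooled)
```

In this model nothing is reported before the collection, #0 through #2 go
away during it, and #3 through #9 remain queued in `available`.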


