[ZODB-Dev] Re: Connection pool makes no sense

Thu Dec 29 15:03:42 EST 2005

[Tim]
>> ...
>> Or there are no strong references to `obj`, but `obj` is part of cyclic
>> garbage so _continues to exist_ until a round of Python's cyclic garbage
>> collection runs.

[Dieter Maurer]
> And this is *VERY* likely as any persistent object in the cache has a
> (strong, I believe) reference to the connection which in turn references
> any of these objects indirectly via the cache.

I'm not sure I follow:  it's not just "very likely" that Connections end up
in cycles, it's certain that they do.  The small test code I posted later
should make that abundantly clear.  They end up in cycles even if they're
never used:  call DB.open(), and the Connection it returns is already in a
cycle (at least because a Connection and its cache each hold a strong
reference to the other).
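
To make the cycle concrete, here is a minimal sketch using hypothetical
stand-in classes (FakeConnection and FakeCache are made up for illustration;
they are not ZODB's classes).  It only shows the general point:  once two
objects hold strong references to each other, dropping the last outside
reference doesn't free them, and they stick around until cyclic gc runs.

```python
import gc
import weakref


class FakeCache:
    """Hypothetical stand-in for a connection's object cache."""

    def __init__(self, connection):
        self.connection = connection  # strong ref: cache -> connection


class FakeConnection:
    """Hypothetical stand-in for a Connection; not ZODB's class."""

    def __init__(self):
        self.cache = FakeCache(self)  # strong ref: connection -> cache


def survives_until_gc():
    """Return (alive_before_gc, alive_after_gc) for an unreferenced cycle."""
    gc.disable()  # keep automatic collection from running mid-demo
    try:
        conn = FakeConnection()
        probe = weakref.ref(conn)  # lets us observe conn without owning it
        del conn  # last outside reference is gone, but the cycle remains
        alive_before = probe() is not None
        gc.collect()  # one round of cyclic gc reclaims the pair
        alive_after = probe() is not None
    finally:
        gc.enable()
    return alive_before, alive_after
```

Under CPython, I'd expect the pair to be alive before the collect() call and
gone after it.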

> In my view, closed connections not put back into the pool

That never happens:  when an open Connection is closed, it always goes back
into the pool.  If that would cause the configured pool_size to be exceeded,
then other, older closed Connections are removed from the pool "to make
room".  It's an abuse of the system for apps even to get into that state:
that's why ZODB logs warnings if pool_size is ever exceeded, and logs at
critical level if it's exceeded "a lot".  Connections should be viewed as a
limited resource.
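
The policy just described can be sketched roughly as follows.  This is a toy
model, not ZODB's code:  PoolSketch, put_back and available are names
invented for the sketch, and the logging side is left out entirely.

```python
from collections import deque


class PoolSketch:
    """Hypothetical model of the pool policy described above."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.available = deque()  # closed connections, oldest first

    def put_back(self, conn):
        """Return a just-closed connection to the pool.

        The connection always goes in.  If that pushes the pool past
        pool_size, the oldest closed connections are dropped to make
        room, and returned here so the caller can see what was evicted.
        """
        self.available.append(conn)
        evicted = []
        while len(self.available) > self.pool_size:
            evicted.append(self.available.popleft())
        return evicted
```

With pool_size=2, putting back "c1", "c2" and then "c3" evicts "c1" and
leaves "c2" and "c3" available.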

> should be explicitly cleaned e.g. their cache cleared or at least
> minimized.

The code that removes older Connections from the pool doesn't do that now;
it could, but there's no apparent reason to complicate it that I can see.  

> If for some reason, the garbage collector does not release the
> cache/cache content cycles, then the number of connections would grow
> unboundedly, which is much worse than unbounded growth of the "all"
> attribute.

There's a big difference, though:  application code alone _could_ provoke
unbounded growth of .all without the current defensive coding -- that
doesn't require hypothesizing Python gc bugs for which there's no evidence.
If an application is seeing unbounded growth in the number of Connections,
it's a Python gc bug, a ZODB bug, or an application bug.
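
As an illustration of how application code alone could do that, here is a
sketch that assumes the defensive coding amounts to tracking connections
through weak references (that's my assumption for this example; Conn,
open_and_forget and compare are made-up names).  The app opens connections
and forgets them without closing; a registry holding strong references grows
without bound, while a weak one doesn't.

```python
import gc
import weakref


class Conn:
    """Hypothetical stand-in for a Connection."""


def open_and_forget(register, n):
    """Simulate an app that opens n connections and never closes them."""
    for _ in range(n):
        register(Conn())  # the app keeps no reference of its own


def compare(n=1000):
    """Return (size of strong registry, size of weak registry)."""
    strong_all = []               # keeps every forgotten connection alive
    weak_all = weakref.WeakSet()  # entries vanish with their referents
    open_and_forget(strong_all.append, n)
    open_and_forget(weak_all.add, n)
    gc.collect()  # not strictly needed here (no cycles), but harmless
    return len(strong_all), len(weak_all)
```

The strong registry ends up with n entries; the weak one should end up
empty.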

While cyclic gc may still seem novel to Zope2 users, it's been in Python for
over five years, and bug reports against it have been very rare -- most apps
stopped worrying about cycles years ago, and Zope3 has cycles just about
everywhere you look.  ZODB isn't a pioneer here.

I ran stress tests against ZODB a year or so ago (when the new connection
management code was implemented) that created millions of Connections, and
saw no leaks then, regardless of whether they were or weren't explicitly
closed.  That isn't part of the test suite because it tied up a machine for
a day ;-), but nothing material has changed since then that I know of.  It's
possible a new leak got introduced, but I'd need more evidence of that
before spending time on it; the small test code I posted before showed that
at least that much still works as designed, and that hit all the major paths
thru the connection mgmt code.

> Pitcher seems to observe such a situation (where, for some unknown
> reason, the garbage collector does not collect the connection).

I don't believe we have any real idea what they're doing, beyond that
"something somewhere" is sticking around longer than they would like.