[ZODB-Dev] Connection pool makes no sense

Юдыцкий Игорь Владиславович (Igor V. Youdytsky)
Thu Dec 29 01:32:31 EST 2005


Hi.
A little bit of history...
We use Zope as an application server for a heavily loaded technical
process. We see high load peaks several times a day, and my question is:
how can we avoid unused connections remaining in memory after a peak has
passed?
Before ZODB-3.4.1 the connection pool had a fixed size (pool_size), which
caused Zope to block during load peaks.
ZODB-3.4.2, shipped with Zope-2.8.5, has a connection pool that does not
limit the number of open connections but tries to reduce the pool back to
pool_size, and this behavior is broken IMO.

Follow my idea...
After a load peak I have many connections (thousands) that have cached up
various objects, including RDB connections. Those connections are NEVER
used again.
Why? Because the connection pool's _reduce_size method does not work
correctly.

    # Throw away the oldest available connections until we're under our
    # target size (strictly_less=False) or no more than that
    # (strictly_less=True, the default).
    def _reduce_size(self, strictly_less=False):
        target = self.pool_size - bool(strictly_less)
        while len(self.available) > target:
            c = self.available.pop(0)
            self.all.remove(c)  # <--- Does this mean we want to delete the
                                # connection object from memory? If so, why
                                # do we use the remove method of the WeakSet
                                # object? It's nonsense.

    # Same as a Set, remove obj from the collection, and raise
    # KeyError if obj not in the collection.
    def remove(self, obj):
        del self.data[id(obj)]  # <--- This just removes the weakref from
                                # the WeakValueDictionary, not the object
                                # itself. So if we want to destroy obj, we
                                # are on the wrong way here...

Ok. Let's look at pop, push and repush...
    # Pop an available connection and return it, or return None if none are
    # available.  In the latter case, the caller should create a new
    # connection, register it via push(), and call pop() again.  The
    # caller is responsible for serializing this sequence.
    def pop(self):
        result = None
        if self.available:
            result = self.available.pop()
            # Leave it in self.all, so we can still get at it for statistics
            # while it's alive.
            assert result in self.all
        return result

    # Register a new available connection.  We must not know about c already.
    # c will be pushed onto the available stack even if we're over the
    # pool size limit.
    def push(self, c):
        assert c not in self.all
        assert c not in self.available
        self._reduce_size(strictly_less=True)
        self.all.add(c)
        self.available.append(c)
        n, limit = len(self.all), self.pool_size
        if n > limit:
            reporter = logger.warn
            if n > 2 * limit:
                reporter = logger.critical
            reporter("DB.open() has %s open connections with a pool_size "
                     "of %s", n, limit)

    # Reregister an available connection formerly obtained via pop().  This
    # pushes it on the stack of available connections, and may discard
    # older available connections.
    def repush(self, c):
        assert c in self.all
        assert c not in self.available
        self._reduce_size(strictly_less=True)
        self.available.append(c)

We pop connections from self.available and push/repush used connections
back onto self.available, so there is no way to obtain a connection
except from self.available. Good. But look at what _reduce_size does:
when the list is over the target size, it pops the first connection and
removes it from self.all, so the connection is no longer in
self.available and nobody can obtain it any more. That would be fine if
the connection object were actually deleted from memory. But it is still
there!!! And so is its cache, with open RDB connections that will never
serve anyone again.
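To illustrate the point with a minimal sketch (stand-in Connection class
and WeakValueDictionary, not the real ZODB pool): deleting a weak
reference does nothing to an object that is still strongly referenced
elsewhere.

```python
# Sketch: dropping a weakref does not free a strongly referenced object.
import weakref

class Connection:
    pass

pool = weakref.WeakValueDictionary()

c = Connection()        # strong reference held by the caller
pool[id(c)] = c         # the pool holds only a weak reference

del pool[id(c)]         # analogous to WeakSet.remove(): drops the weakref
# The object itself is untouched: 'c' still refers to it, so its cache
# (and any RDB connections it held) would stay alive in memory.
assert isinstance(c, Connection)   # the connection object still exists
```

This is exactly the situation after _reduce_size: the pool forgets the
connection, but whoever (or whatever) still references it keeps it alive.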

I don't know if there is some other way to return a connection to the
pool. I do not know ZODB as a whole, but according to the logic I can see
in these pieces of code, and what I see every day after load peaks, the
connection object SHOULD be deleted from memory together with its cached
objects. And that definitely cannot be done by deleting a weakref.
Otherwise it looks to me like a memory leak.

I think better logic would be to have an idle period alongside pool_size:
remove the oldest connection in the pool that has not been used for the
idle period. Then we could have a small pool_size and a small connection
pool that grows with site load and shrinks when site activity is low.
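The idle-timeout idea could be sketched roughly like this. The names
(IdlePool, timeout, FakeConn) are hypothetical, not part of the actual
ZODB API, and a real implementation would have to actually close each
evicted connection rather than just drop it:

```python
# Sketch of a pool that grows under load and shrinks back to pool_size
# by evicting connections that have sat idle longer than a timeout.
import time

class IdlePool:
    def __init__(self, pool_size, timeout):
        self.pool_size = pool_size
        self.timeout = timeout          # seconds a connection may sit idle
        self.available = []             # list of (last_used, connection)

    def repush(self, c):
        self.available.append((time.time(), c))
        self._reduce_size()

    def _reduce_size(self):
        now = time.time()
        # Shrink back toward pool_size, but only evict connections that
        # have been idle longer than the timeout.
        while (len(self.available) > self.pool_size
               and now - self.available[0][0] > self.timeout):
            _, c = self.available.pop(0)
            c.close()                   # release caches and RDB links

# Hypothetical usage with a stand-in connection class:
class FakeConn:
    closed = False
    def close(self):
        self.closed = True

pool = IdlePool(pool_size=1, timeout=10)
old, new = FakeConn(), FakeConn()
pool.available = [(time.time() - 60, old)]   # idle for a minute already
pool.repush(new)                             # evicts and closes 'old'
```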

But we cannot delete a connection object simply with del(obj), where obj
is an instance of the Connection class: connection._cache holds a
reference back to the connection object, and connection._reader does the
same. So we need to delete those first.
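A minimal sketch of that teardown order, with stand-in Cache/Connection
classes rather than the real ZODB ones (and note that modern CPython's
cyclic garbage collector would eventually collect such cycles anyway;
breaking them explicitly just frees the object immediately):

```python
# Sketch: break the back-references held by _cache and _reader first,
# then drop the connection itself.
import gc
import weakref

class Cache:
    def __init__(self, conn):
        self.conn = conn            # back-reference that keeps conn alive

class Connection:
    def __init__(self):
        self._cache = Cache(self)
        self._reader = Cache(self)

c = Connection()
ref = weakref.ref(c)                # observe the object without keeping it alive

c._cache = None                     # break the cycles first...
c._reader = None
del c                               # ...then the object can really go away
gc.collect()
assert ref() is None                # the connection is gone from memory
```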

Tell me if I'm right about this. We suffer from unused connections to the
RDB. If you have any comments, please CC my email; I still can't join the
zope-dev list.

Thanks anyway.
Igor V. Youdytsky <Pitcher at bk.ru>
Russia

--== *** ==--
Deputy Director, Information Technology Department
Igor V. Youdytsky (Юдыцкий Игорь Владиславович)

