[ZODB-Dev] ZEO client leaking memory?

Toby Dickenson tdickenson@geminidataloggers.com
Tue, 09 Oct 2001 10:52:53 +0100


On Fri, 05 Oct 2001 10:30:13 -0400, Jim Fulton <jim@zope.com> wrote:

>Toby Dickenson wrote:
>>
>> Last time I looked at this, my final thought on "the real problem"
>> was that the size limits imposed by the ZODB are too soft.
>>
>> The amount of work that the cPickleCache performs in trying to
>> remove old objects from the cache is proportional to the amount by
>> which the actual cache size exceeds the specified limit. This means
>> that in an application that touches a lot of objects, the cache will
>> eventually reach an equilibrium at a size roughly proportional to
>> the average number of objects touched between calls to incrGC.
>>
>> The code that controls this equilibrium is commented /* Decide how
>> many objects to look at */ in cPickleCache.c. In the worst case, the
>> cache has to be 10 times larger than the specified limit before
>> cPickleCache checks every one of them. I have managed to force ZODB
>> into this state; under high memory pressure the process does indeed
>> grow to roughly 10 times its size under low memory pressure.
>
>Excellent analysis.
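
To make that concrete, here is a rough Python rendering of a scan-size
heuristic with the property described above. This is only a sketch,
not the actual cPickleCache.c code, which differs in detail:

    def objects_to_examine(size, limit):
        # Work done is proportional to how far the actual cache
        # size exceeds the configured limit.
        excess = size - limit
        if excess <= 0:
            return 0
        # Scan a slice proportional to the overshoot; the scan only
        # covers every object once size is roughly 10x the limit.
        return min(size, size * excess // (9 * limit))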
>
>> My chosen solution at the time (which has since proven itself in
>> action) was to sprinkle more calls to incrGC within the
>> object-intensive code.
>
>What do you mean by object-intensive code?

Code that moves many objects through the ZODB cache.
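
For example (a sketch: process() and items are hypothetical, and I am
assuming the connection's pickle cache is reachable as
connection._cache, as it is in ZODB 3):

    def process_all(connection, items):
        for n, item in enumerate(items):
            process(item)  # touches many persistent objects
            if n % 1000 == 0:
                # Extra garbage collection, sprinkled through the
                # loop, keeps the cache near its configured limit.
                connection._cache.incrgc()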

>Was this application
>code?

Yes.

> This probably increases the chance of "read conflicts", but
>that might not be a problem for you.

I think the problem is worse than an "increased chance". In some cases
it was possible to mitigate the problem of read conflicts entirely by
touching all critical objects early in the transaction, forcing any
read conflicts to be raised before starting work that is expensive or
impossible to retry.

With my sprinkling of incrGC inside the transaction, I can never
reduce this risk to zero.
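
For instance (a sketch; critical_objects and expensive_work are
hypothetical, and _p_activate is the later ZODB spelling of this; any
ordinary attribute access has the same unghosting effect):

    def run(critical_objects):
        for obj in critical_objects:
            # Unghosting the object forces any read conflict to be
            # raised here, before the unrepeatable work starts.
            obj._p_activate()
        expensive_work()
        get_transaction().commit()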

> (I really wish we had multi-version
>concurrency control.)

Yes, that would simplify a lot of things.

Until then, I considered a change to the garbage collector so that it
would not deactivate any objects that had been used in the current
transaction. I'm not even sure that would be an overall benefit.
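
A sketch of that idea (the names here are hypothetical): track the
oids the current transaction has touched, and have the collector pass
over them.

    touched_this_txn = set()   # cleared at each transaction boundary

    def note_access(obj):
        touched_this_txn.add(obj._p_oid)

    def may_deactivate(obj):
        # Never ghostify an object this transaction has already used;
        # reloading it later in the same transaction is what risks a
        # read conflict.
        return obj._p_oid not in touched_this_txn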

>> This keeps the average number of objects
>> touched between calls to incrGC constant, and therefore memory usage
>> is constant. Fortunately individual calls to incrGC are fairly fast
>> when called frequently.
>
>I'd love to get suggestions on improving the cache management
>algorithm.

I mentioned before that the problem seems to be that the size limits
are too soft, so I think it would be appropriate to use a strict
LRU policy for deactivating objects. I don't think it would be too
expensive to maintain a doubly-linked list of all non-ghost objects
(one list per jar) sorted according to order of access. Every access
to a persistent object would relink that object to the front of its
list. The incremental garbage collector would work backwards through
the list, deactivating objects until *exactly* the right number
remain. Sound reasonable?
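
A minimal Python sketch of that scheme (the real cache is C, and all
the names here are hypothetical):

    class Node:
        __slots__ = ('obj', 'prev', 'next')

        def __init__(self, obj):
            self.obj = obj
            self.prev = self.next = None

    class LRUList:
        """Non-ghost objects for one jar, most recently used first."""

        def __init__(self, limit):
            self.limit = limit
            self.size = 0
            self.head = self.tail = None

        def add(self, node):
            # A newly unghosted object joins the front of the list.
            self._push_front(node)
            self.size += 1

        def access(self, node):
            # Every access relinks the object to the front: O(1).
            if node is not self.head:
                self._unlink(node)
                self._push_front(node)

        def incrgc(self):
            # Work backwards from the least recently used end,
            # deactivating until *exactly* `limit` objects remain.
            while self.size > self.limit:
                node = self.tail
                self._unlink(node)
                self.size -= 1
                node.obj._p_deactivate()   # back to a ghost

        def _push_front(self, node):
            node.prev, node.next = None, self.head
            if self.head is not None:
                self.head.prev = node
            self.head = node
            if self.tail is None:
                self.tail = node

        def _unlink(self, node):
            if node.prev is not None:
                node.prev.next = node.next
            else:
                self.head = node.next
            if node.next is not None:
                node.next.prev = node.prev
            else:
                self.tail = node.prev
            node.prev = node.next = None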



Toby Dickenson
tdickenson@geminidataloggers.com