[Zope-dev] [BUG] Quadratic ZODB bloat caused by "PathIndex"

Andreas Jung <lists@andreas-jung.com>
Fri, 21 Feb 2003 08:31:30 +0100


--On Thursday, 20 February 2003 08:05 +0100 Dieter Maurer 
<dieter@handshake.de> wrote:

> Zope 2.5.1
>
> A "PathIndex" maps (pathsegment,level) onto the "IISet" of document ids
> with "pathsegment" at "level" in their path.
>
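
To make the mapping concrete, here is a minimal sketch in plain Python.
It is not the PathIndex code: a dict of built-in sets stands in for the
persistent mapping of "IISet" objects, and the helper name "index_path"
and the sample paths are made up for illustration.

```python
from collections import defaultdict


def index_path(index, docid, path):
    """Record docid under every (pathsegment, level) pair of its path."""
    segments = [s for s in path.split("/") if s]
    for level, segment in enumerate(segments):
        index[(segment, level)].add(docid)


# Assumption: a dict of plain sets replaces the persistent structures.
index = defaultdict(set)
index_path(index, 1, "/site/docs/a")
index_path(index, 2, "/site/docs/b")
index_path(index, 3, "/site/news/c")

top = index[("site", 0)]    # every document shares the top-level segment
docs = index[("docs", 1)]   # only documents below /site/docs
```

Note that the set for the top-level segment already contains every
document id, which is the entry the rest of the report is about.
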
> An "IISet" is a single persistent object, written as a whole to
> the ZODB. Its size is proportional to the number of entries, and
> every change to it writes a complete new copy.
> Therefore a ZODB storage with undo support grows quadratically
> with respect to the number of entries (between packs).
>
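
The quadratic claim can be checked with a little arithmetic. The sketch
below is only a cost model, not ZODB code. It assumes one transaction
per newly indexed document, a storage that keeps every old revision
until the next pack, and a hypothetical 4 bytes per document id with
pickle overhead ignored.

```python
def bytes_written_monolithic(n_docs, bytes_per_id=4):
    """Total bytes appended when each insert rewrites the whole set."""
    total = 0
    for size in range(1, n_docs + 1):
        # The new revision of the set holds all `size` ids seen so far.
        total += size * bytes_per_id
    return total


small = bytes_written_monolithic(1_000)
large = bytes_written_monolithic(10_000)
growth = large / small   # ten times the documents, about 100x the bytes
```

Under these assumptions the total is bytes_per_id * n * (n + 1) / 2, so
ten times as many indexed objects cost roughly a hundred times as much
storage between packs.
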
> The standard "path" index indexes objects by their physical path.
> Therefore, the size of the index entry of (at least) one
> of the top-level path segments is on the order of the number of
> all indexed objects.
>
> Once you have lots of indexed objects, you will observe
> significant ZODB growth between packs.
>
>
> The fix would be easy: "PathIndex" should use "IITreeSet" rather
> than "IISet" to store the document id lists (as do other indexes).
> (There are more bugs in "PathIndex": e.g. it does not remove
> old index information when a new "index_object" brings in new data.
> A code review would be appropriate.)
>
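
Why a tree-structured set helps can be shown with the same kind of cost
model. This is a rough sketch, not a measurement of "IITreeSet": it
assumes an insert rewrites only the one bucket it touches, it ignores
bucket splits and rewrites of interior nodes, and the bucket capacity
of 120 ids and the 4 bytes per id are hypothetical figures.

```python
def bytes_written_bucketed(n_docs, bucket_capacity=120, bytes_per_id=4):
    """Upper bound on bytes appended when an insert rewrites one bucket."""
    total = 0
    for size in range(1, n_docs + 1):
        # The touched bucket never holds more than bucket_capacity ids,
        # so the per-insert cost stays bounded as the set grows.
        total += min(size, bucket_capacity) * bytes_per_id
    return total


def bytes_written_monolithic(n_docs, bytes_per_id=4):
    """Bytes appended when every insert rewrites the entire set."""
    return bytes_per_id * n_docs * (n_docs + 1) // 2


bucketed = bytes_written_bucketed(10_000)
monolithic = bytes_written_monolithic(10_000)
```

In this model the bucketed total grows linearly with the number of
documents, while the single-object total grows quadratically.
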
>
> A quick workaround: delete the "path" index unless you really need it.
>
>

I am going to fix the problem for Zope 2.5, 2.6 and HEAD
next week.

-aj