[ZODB-Dev] Relstorage and over growing database.

Jim Fulton jim at zope.com
Mon Nov 11 22:38:58 CET 2013


On Mon, Nov 11, 2013 at 4:24 PM, Daniel Widerin <daniel at widerin.net> wrote:
> Hi, just want to share our experience:
>
> My ZODB contains 300 million objects on relstorage/pgsql. The number of
> objects is caused by btrees stored on plone dexterity contenttypes. Its
> size is 160GB. At that size it's impossible to pack because the pre-pack
> takes >100 days.
>
> jensens and I are searching for different packing algorithms and
> methods to achieve better packing performance. We'll keep you updated
> here!
>
> How I solved my problem for now:
>
> I converted to FileStorage, which took about 40 hours; the Data.fs was
> 55GB in size. I then ran zeopack on that database, which succeeded and
> reduced the database to 7.8 GB, still containing 40 million objects.
> After that I migrated back to relstorage because of better
> performance, and the result is an 11 GB db in pgsql.

Hah. Nice.  Have you measured an improvement in relstorage performance
in practice? Is it enough to justify this hassle?

WRT packing algorithms:

- You might look at zc.FileStorage which takes a slightly different approach
  than FileStorage:

  - Does most of the packing work in a separate process to avoid the GIL.

  - Doesn't do GC.

  - Has some other optimizations I don't recall.  For our large databases,
    it's much faster than normal file-storage packing.

- Consider separating garbage collection and packing.  This allows
  garbage collection to be run mostly against a replica and to be spread
  out, if necessary.  Look at zc.zodbdgc.

> Has anyone experienced similar problems packing large relstorage
> databases? The graph traversal takes a really long time. Maybe we can
> improve that by storing additional information in the relational database?
>
> Any hints or comments are welcome.

Definitely look at zodbdgc.  It doesn't traverse the graph. It essentially
does reference counting, so it only needs to iterate over the database,
which, for FileStorage, is relatively quick.
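
To make the reference-counting idea concrete, here is a toy sketch in
Python. It is not zodbdgc's actual code, and the names (find_garbage,
records, root) are made up for illustration. It assumes one sequential
pass over the database can yield (oid, referenced oids) pairs, counts
incoming references during that pass, and then cascades from objects
that nothing refers to.

```python
from collections import Counter


def find_garbage(records, root=0):
    """Return the set of oids that plain reference counting can free.

    `records` is any iterable of (oid, iterable_of_referenced_oids)
    pairs, e.g. what a single sequential scan of a storage could yield.
    `root` is the oid of the root object, which is never garbage.
    """
    refs = {}           # oid -> list of oids it refers to
    counts = Counter()  # oid -> number of incoming references

    # One pass over the data; no graph traversal from the root.
    for oid, targets in records:
        refs[oid] = list(targets)
        counts.update(refs[oid])

    # Start with objects nobody refers to, then cascade: freeing an
    # object drops one reference on each object it pointed at.
    garbage = set()
    pending = [oid for oid in refs if oid != root and counts[oid] == 0]
    while pending:
        oid = pending.pop()
        if oid in garbage:
            continue
        garbage.add(oid)
        for target in refs[oid]:
            counts[target] -= 1
            if counts[target] == 0 and target != root and target in refs:
                pending.append(target)
    return garbage


# Hypothetical data: 0 is the root; 3 -> 4 is unreachable; 5 <-> 6 is a cycle.
records = [(0, [1, 2]), (1, [2]), (2, []), (3, [4]), (4, []), (5, [6]), (6, [5])]
garbage = find_garbage(records)
```

With a real storage, the pairs would presumably come from iterating the
storage's records and extracting references with something like
ZODB.serialize.referencesf. Note that plain counting as sketched here
never frees cycles (objects 5 and 6 above), so a real tool needs more
than this.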

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton

