[ZODB-Dev] BTree memory bomb

Tim Peters tim at zope.com
Tue Jan 18 20:52:22 EST 2005


[Simon Burton]
> Aha! yes, it was the len(BTree) that kept the thing in memory. Now when I
> run it (the original script without the len), the DB file quickly grows
> to 700Mb, but memory usage only gets to around 50Mb.

If that's good enough for you, it's good enough for me <wink>.

> I guess I should have mentioned, the application is for a web cache,
> which I foresee growing easily to the gigabyte range. I don't
> particularly need to know the len of it, and if I did I could store that
> in a counter. But, it was important to test using big and distinct
> values, not just 'abc', as this does not "memory bomb".

A few things:

- Multi-gigabyte .fs files are common.  It's individual multi-gigabyte
  transactions that are rare.

- Distinct values didn't matter.  References to persistent objects
  (what Jeremy called "first class":  the type is a subclass of Persistent)
  are shared, but "second class" persistent objects (all others, like
  Python strings or integers) are stored in the database by value.  So
  in my variant of your program, a distinct 3-character "abc" string
  was stored in the database in every BTree entry.  ZODB stores a
  general rooted object graph, but there's only one incoming arc on
  each second-class persistent object.

- If you want a web cache like, say, Squid, use Squid.

- You'll eventually want to use ZEO, and storing large blobs of text
  in ZODB is problematic for several reasons, partly that using ZEO
  to transport large blobs of text across a network isn't particularly
  efficient.  I'm not sure you _do_ want to store large blobs of text,
  but if you do, schemes other than direct storage of giant strings
  should be considered.  For example, store file paths, and then you
  can naturally exploit your operating system's file caching.
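
A minimal sketch of the "store file paths" idea from the last point above.
The names store_page, load_page, index and cache_dir are made up for
illustration, and a plain dict stands in for the BTree so the snippet runs
with nothing but the standard library. The mapping holds only a short path
per URL, while the page body lives in an ordinary file:

```python
import hashlib
import os
import tempfile


def store_page(index, cache_dir, url, body):
    """Write body (bytes) to a file and record only its path in index."""
    # Hash the URL to get a filename that is safe on any filesystem.
    name = hashlib.sha1(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, name)
    with open(path, "wb") as f:
        f.write(body)
    index[url] = path  # small string in the mapping, not the whole page
    return path


def load_page(index, url):
    """Read the page body back from the file the index points at."""
    with open(index[url], "rb") as f:
        return f.read()


cache_dir = tempfile.mkdtemp()
index = {}  # assumption: stand-in for a BTree kept in the database
store_page(index, cache_dir, "http://example.com/", b"<html>hello</html>")
page = load_page(index, "http://example.com/")
```

The trade-off is that the files sit outside the database's transactions,
so a real cache would need its own cleanup for files whose index entry is
gone.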

> I see there are still finer issues to consider, such as index size (??)
> and Storage backend, but now at least the cache can grow much bigger than
> memory available, so that's great.

Ya, there are lots of details, but they all pale compared to avoiding
len(BTree) (which can be disastrous).  If you're going to use FileStorage
(most people do), you should find this helpful:

    http://zope.org/Wikis/ZODB/FileStorageBackup
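
On avoiding len(BTree): a rough sketch of the "store that in a counter"
approach Simon mentions, assuming the wrapped mapping starts out empty.
CountedMapping is a hypothetical name, and a plain dict stands in for the
BTree so the snippet runs without ZODB installed. In a real ZODB
application the wrapper would itself need to be persistent, and the BTrees
package ships a BTrees.Length.Length class that can hold such a count.

```python
class CountedMapping:
    """Wrap a mapping and track its size in a separate counter."""

    def __init__(self, mapping):
        self._data = mapping  # assumed to be empty at this point
        self._count = 0

    def __setitem__(self, key, value):
        # Only a brand-new key changes the size; overwrites do not.
        if key not in self._data:
            self._count += 1
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]  # raises KeyError before the count changes
        self._count -= 1

    def __getitem__(self, key):
        return self._data[key]

    def size(self):
        """Return the number of keys without touching the mapping."""
        return self._count


cache = CountedMapping({})  # a dict here; a BTree in real use
cache["a"] = 1
cache["b"] = 2
cache["a"] = 3   # overwrite, size unchanged
del cache["b"]
print(cache.size())
```

The membership test in __setitem__ costs one extra lookup per insert,
which touches a single path through the tree rather than every bucket.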


