[ZODB-Dev] large C extension objects / MemoryError

Barry A. Warsaw barry@zope.com
Fri, 26 Oct 2001 11:32:59 -0400


    AD>   - BSDDB3 does its own caching behind ZODB.  However, the
    AD> Sleepycat docs say that's only 256 KB.

By default, but you can (and probably should!) turn that number up.
Exactly what you set it to depends on a number of factors.
Tuning is an art, but a critical one if you're going to get good
performance out of Berkeley storage.  Here are some URLs (from the
README):

    http://www.sleepycat.com/docs/ref/am_conf/cachesize.html
    http://www.sleepycat.com/docs/ref/am_misc/tune.html
    http://www.sleepycat.com/docs/ref/transapp/tune.html
    http://www.sleepycat.com/docs/ref/transapp/throughput.html

Example: on our test system, a 256MB, 733MHz PIII, raising the default
cachesize to 128MB cut the running time of a ~6000 transaction FS->BDB
migration by a factor of 4.
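
If you want to poke at that setting outside the storage itself, here's
a minimal sketch of how the cache size is set through bsddb3's DBEnv
API.  The environment directory and the 128MB figure are placeholders,
and how bsddb3Storage actually wires this up depends on your version:

-------------------- snip snip --------------------
from bsddb3 import db

# set_cachesize() must be called before the environment is opened.
env = db.DBEnv()

# 0 gigabytes + 128 megabytes, in a single contiguous cache region.
env.set_cachesize(0, 128 * 1024 * 1024, 1)

# Open with the usual transactional subsystems; the directory is a
# placeholder.
env.open('/var/zodb-env',
         db.DB_CREATE | db.DB_INIT_MPOOL | db.DB_INIT_LOCK |
         db.DB_INIT_LOG | db.DB_INIT_TXN)
-------------------- snip snip --------------------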

But...

>>>>> "SA" == Steve Alexander <steve@cat-box.net> writes:

    SA> Although the objects can be pickled, they won't participate in
    SA> the transaction / conflict resolution / transparent retrieval
    SA> and storage on demand systems as independent objects. They
    SA> will rely on a containing object that is Persistent for that.

    SA> Your memory error is probably because you're trying to load
    SA> lots of your large objects into memory all at once, because
    SA> you get them as a single persistent lump.

This seems like a reasonable guess, especially because you're bombing
out at the point that you're attempting to read your pickles back out
of the storage and into a Python string.  It's possible your objects
aren't participating properly in ghosting and you're just holding on
to too many big objects at the same time.
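
To illustrate what Steve is describing: if the big payloads are plain
attributes of a single Persistent container, loading the container
drags every payload into memory at once.  Giving each payload its own
Persistent wrapper lets ZODB ghost and load them independently.  The
class names here are made up, and the exact import spelling depends on
your ZODB version:

-------------------- snip snip --------------------
from Persistence import Persistent

class BlobHolder(Persistent):
    # Hypothetical wrapper: each large payload gets its own database
    # record, so it can be ghosted and loaded independently.
    def __init__(self, data):
        self.data = data

class Container(Persistent):
    def __init__(self):
        self.blobs = []

    def add(self, data):
        # The payload is pickled as a separate persistent object, not
        # inside the Container's own record.
        self.blobs.append(BlobHolder(data))
        self._p_changed = 1
-------------------- snip snip --------------------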

FWIW, in the above-mentioned migration step, we have a number of
pickles over 10MB in size, with the largest being ~25MB IIRC.  No
problems there, but they were all "normal" objects in that they were
all Persistent.

Here's a thought: try instrumenting CommitLog.next() to record the
sizes of the objects you're committing to Berkeley.  Something like
(untested):

-------------------- snip snip --------------------
Index: CommitLog.py
===================================================================
RCS file: /cvs-repository/StandaloneZODB/bsddb3Storage/bsddb3Storage/CommitLog.py,v
retrieving revision 1.9
diff -u -r1.9 CommitLog.py
--- CommitLog.py	5 Oct 2001 19:22:06 -0000	1.9
+++ CommitLog.py	26 Oct 2001 15:28:39 -0000
@@ -336,6 +336,7 @@
         CommitLog.__init__(self, file, dir)
         self.__versions = {}
         self.__prevrevids = {}
+        self.__fp = open('/tmp/sizes.txt', 'w')
 
     def finish(self):
         CommitLog.finish(self)
@@ -398,6 +399,8 @@
             return None
         try:
             key, data = rec
+            if key == 'o':
+                print >> self.__fp, len(data[4])
         except ValueError:
             raise LogCorruptedError, 'incomplete record'
         if key not in 'ovd':
-------------------- snip snip --------------------

(you probably need to flush self.__fp too)

Then tail the file and see how big those objects are that you're
committing to the database.
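
If tailing isn't enough, a few lines of Python will summarize the log.
This assumes the one-integer-per-line format written by the patch
above and is otherwise just a sketch:

-------------------- snip snip --------------------
# Summarize /tmp/sizes.txt: one pickle size (in bytes) per line.
sizes = []
for line in open('/tmp/sizes.txt').readlines():
    line = line.strip()
    if line:
        sizes.append(int(line))
sizes.sort()

total = 0
big = 0
for n in sizes:
    total = total + n
    if n > 10 * 1024 * 1024:
        big = big + 1

print 'objects   :', len(sizes)
if sizes:
    print 'largest   :', sizes[-1], 'bytes'
print 'total     :', total, 'bytes'
print 'over 10MB :', big
-------------------- snip snip --------------------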

If that doesn't give you any clues, the other thing you can try is to
run this with ZEO to see if it points to a problem with your
application or with Berkeley DB.  Do you get the same exception with
FileStorage (+LFS)?
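
For the FileStorage comparison, something along these lines should be
enough to open your database through a different storage.  The paths
and the ZEO address are placeholders:

-------------------- snip snip --------------------
from ZODB import DB
from ZODB.FileStorage import FileStorage

# Open the application's data through FileStorage instead of
# Berkeley; the path is a placeholder.
storage = FileStorage('/var/zodb/Data.fs')
db = DB(storage)
conn = db.open()
root = conn.root()

# To test through ZEO instead, point a ClientStorage at the server:
#
#   from ZEO.ClientStorage import ClientStorage
#   storage = ClientStorage(('localhost', 9999))
-------------------- snip snip --------------------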

-Barry