[ZODB-Dev] large C extension objects / MemoryError

Andrew Dalke <dalke@dalkescientific.com>
Fri, 26 Oct 2001 11:40:36 -0600


Thanks, Steve and Barry!

Steve:
> Although the objects can be pickled, they won't participate in the 
> transaction / conflict resolution / transparent retrieval and storage
> on demand systems as independent objects. They will rely on a
> containing object that is Persistent for that.

"Who knows what magic lies in the heart of Zope ... The ZODB knows!"

I didn't think it did, but I wasn't sure.  How do I make a C
extension behave nicely?  You say it relies on a containing
object that is Persistent.  In this case, the container is a
PersistentMapping.  That doesn't seem sufficient.  What should
I try next?
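
For concreteness, would it be something like this?  (Untested
sketch; "Compound" and "CompoundWrapper" are made-up names standing
in for my real C extension type.)

    # Untested; "Compound" stands in for my real C extension type.
    from Persistence import Persistent

    class CompoundWrapper(Persistent):
        """Give each big C object its own persistent record."""
        def __init__(self, compound):
            self.compound = compound   # the C object (it pickles fine)

        def set_property(self, name, value):
            # made-up mutating method on the C object
            self.compound.set_property(name, value)
            self._p_changed = 1   # tell ZODB the wrapped data changed

and then the PersistentMapping would hold CompoundWrapper instances
instead of the raw C objects, so each one can be ghosted and
reloaded on its own?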

Barry, re: Steve's response:
>It's possible your objects aren't participating properly in
>ghosting and you're just holding on to too many big objects at
>the same time.

I agree.  That would explain why I can create the database in the
first place (since I'm working with different objects as time goes
on) but can't stream through the data set to compute a small value
for each one.

How might my ghosts become proper citizens?  Throw a good
Halloween party?
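
More seriously: once each object is independently persistent, is
the streaming loop supposed to look something like this?  (Untested
sketch; compute_small_value and results are stand-ins for my real
code.)

    import ZODB    # makes the get_transaction() builtin available

    for key in compounds.keys():     # compounds = my PersistentMapping
        results[key] = compute_small_value(compounds[key])
        compounds[key]._p_deactivate()  # turn it back into a ghost
        get_transaction().commit()      # let the cache do its housekeeping

(Committing after every single object is probably overkill; every
few hundred is more what I had in mind.)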

Barry:
>Tuning is an art, but a critical art if you're going to get good
>performance out of Berkeley storage.

Thanks for the links.  I've been staying away from tuning for now
since I just wanted to get it working before venturing into
performance.  I did just up the cache size to 2MB in the hope that
my tests would go faster, but that's all primitive art so far -- I
need to start timing these runs as well.  Perhaps I should use
xdaliclock?  :)
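
Or, less flippantly, something like this around the whole run (just
a sketch; run_import stands in for my actual driver):

    import time

    start = time.time()
    run_import()      # stand-in for whatever the run actually does
    print "elapsed: %.1f seconds" % (time.time() - start)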


>Here's a thought: try instrumenting CommitLog.next() to record the
>sizes of the objects you're committing to Berkeley.  Something like
>(untested):
   ...
>Then tail the file and see how big those objects are that you're
>committing to the database.

I just started a run and it will take a while to import everything.
I'm about to leave for a few hours so I'll report what I have
to date.  The biggest ones so far are (from sort -n | tail
and turning things like 1637454 into 1.6M)

0.3M
0.4M
0.6M
0.7M
0.9M
1.0M
1.1M
1.2M
1.3M
1.6M
1.6M
1.6M

>If that doesn't give you any clues,

What kind of clues should I be able to gather from this?  That
I have large fields?

>the other thing you can try is to
>run this with ZEO to see if it indicates a problem with your
>application or with Berkeley DB.

Ummmm... okay, I can try this as well.
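
If I read it right, that means pointing my script at a
ClientStorage instead of opening the Berkeley storage directly --
something like this untested sketch (host and port are made up):

    from ZEO.ClientStorage import ClientStorage
    from ZODB import DB

    storage = ClientStorage(('localhost', 9999))   # made-up address
    db = DB(storage)
    conn = db.open()
    root = conn.root()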

> Do you get the same exception with FileStorage (+LFS)?

We don't have Python compiled with large file support, so I think
FileStorage would hit the 2GB limit in this case.  I'll try
again after this run finishes.  We've been using FileStorage
on smaller data sets for the last year, but up until last
month it was all from Zope-2.2.2.  No problems.  Our regression
tests work just fine with the 2.4.1 FileStorage.
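
For that retry I expect it's just a matter of swapping the storage
line, along these lines (untested; the Data.fs path is made up):

    from ZODB import FileStorage, DB

    # made-up path; without large file support this is where I
    # expect to hit the 2GB wall
    storage = FileStorage.FileStorage('/scratch/compounds/Data.fs')
    db = DB(storage)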

                    Andrew
                    dalke@dalkescientific.com