[ZODB-Dev] large C extension objects / MemoryError

Andrew Dalke <dalke@dalkescientific.com>
Tue, 30 Oct 2001 17:55:06 -0700


Me again,  :)

  I've attached a small reproducible test case ('make_db.py') and a
support library ('Database.py') which show something like the problem
I'm having with a ZODB taking up a lot of memory.  They are a minimal
extraction of the code I'm working with.

It works under Linux with ZODB 2.4.1 (and should work under NT
as well).  It tries to create about 10,000 records ("Molecules").
Each Molecule takes up about 130K ==> database size of about 1.5GB.
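
The size estimate can be sanity-checked with a quick calculation.  This
is only a sketch: it counts just the "longstring" payload that make_db.py
stores in each record, and the constant names are made up for
illustration.  Pickle and storage overhead are not measured here, so the
real file would be somewhat larger than this figure.

```python
# Rough size estimate; only the "longstring" payload is counted.
RECORDS = 10000              # number of Molecules make_db.py tries to create
PAYLOAD_BYTES = 128 * 1024   # the "X" * (128*1024) string in each record

total_bytes = RECORDS * PAYLOAD_BYTES
total_gib = total_bytes / float(1024 ** 3)
print("payload only: %d bytes (%.2f GiB)" % (total_bytes, total_gib))
```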

  I have 0.5 GB of RAM and 1GB of swap under Linux.  I opened
another python session to shrink the amount of free memory by
doing 's = "S" * (1024*1024*1200)' leaving me with 260MB of
free memory.

  When I run this script, the amount of memory it uses grows
and grows.  I run out of memory around record 2400, and top shows
the process using about 260 MB (i.e., it really does take all of the
free memory).
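
As a rough cross-check, the sketch below estimates how many records'
worth of "longstring" payload would fit in the 260 MB of free memory if
nothing were ever released.  It ignores swap and per-object overhead, and
the constant names are made up for illustration.  The result is in the
same range as the failure near record 2400, which fits the idea that
records are not being released, though it does not prove it.

```python
FREE_BYTES = 260 * 1024 * 1024   # free memory left before the run
PAYLOAD_BYTES = 128 * 1024       # per-record "longstring" payload

# Integer division: whole records that fit if none are ever evicted.
records_that_fit = FREE_BYTES // PAYLOAD_BYTES
print(records_that_fit)
```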

  What do I need to do to tell ZODB to limit how much memory it
requires?  I thought the cache size setting of 400 would be good
enough, but by the end the value of cacheSize is over 4,000.

(I also don't understand why cacheSize() returns 6 before the
first commit()/pack()/cacheMinimize(), then 2006 between records
1000 and 2000, then 3750 after 2000.)

                    Andrew
                    dalke@dalkescientific.com


Attachment: Database.py

import ZODB
from ZODB import FileStorage

import Persistence
from App import Product # hack -- ensures ZODB imports things correctly
from Products.ZCatalog.Catalog import Catalog

class Molecule(Persistence.Persistent):
  def __init__(self, smiles, lookup):
    self._smiles = smiles
    self.lookup = lookup
    self.storage = Persistence.PersistentMapping()
  def __getitem__(self, name):
    return self.storage[name]
  def __setitem__(self, name, value):
    self.storage[name] = value

def create_database(filename):
  db = ZODB.DB(FileStorage.FileStorage(filename))
  connection = db.open()
  root = connection.root()
  root["database_version"] = "1"

  root["molecule_lookup"] = Persistence.PersistentMapping()
  root["storage"] = Persistence.PersistentMapping()

  cat = root["catalog"] = Catalog()
  cat.aq_parent = root

  get_transaction().commit()
  db.close()

  db = Database(filename)
  return db

class Database:
  def __init__(self, filename):
    self.db = ZODB.DB(FileStorage.FileStorage(filename))
    self.connection = self.db.open()
    self.root = self.connection.root()
    self._molecule_lookup = self.root["molecule_lookup"]
    self.storage = self.root["storage"]

  def add(self, smiles):
    mol = Molecule(smiles, self._molecule_lookup)
    self._molecule_lookup[smiles] = mol
    return mol

  def lookup(self, smiles):
    # PersistentMapping has no lookup() method; index it like a dictionary.
    return self._molecule_lookup[smiles]

  def molecules(self):
    return self._molecule_lookup.values()


Attachment: make_db.py

import time, sys
import Database

db = Database.create_database("spam.fs")
for i in range(10000):
  print i, db.db.cacheSize()
  sys.stdout.flush()

  mol = db.add("mol" + str(i))
  mol["counter"] = i
  mol["longstring"] = "X" * (128*1024) + str(i)
  # Every 1000 records (including i == 0): commit, pack, and then
  # try to shrink the object cache.
  if i % 1000 == 0:
    get_transaction().commit()
    db.db.pack(time.time())
    db.db.cacheMinimize(0)
    db.db.cacheFullSweep(0)

  #time.sleep(0.1)

print
get_transaction().commit()
db.db.pack(time.time())
get_transaction().commit()
