[ZODB-Dev] ZODB design. Size and saving records

Marius Gedminas marius at gedmin.as
Thu Jul 3 19:26:26 EDT 2008


On Sun, Jun 22, 2008 at 08:54:14AM -0700, tsmiller wrote:
> 
> ZODB guys,
> 
> I have a bookstore application in which I use the ZODB in a simple way.  The
> database saves stores and records in the stores.
> I open the database as follows where the databasePath argument points to my
> bookserver.fs file.
> 
> from ZODB import FileStorage, DB
> import  transaction
> import BTrees.OOBTree,  BTrees.IOBTree
> 
>        def open(self, databasePath):
>                 self.bookDatabase_storage     =
> FileStorage.FileStorage(databasePath)

Ouch!  PEP-8 is not rocket science and it makes reading your code a more
pleasant experience.

>                 self.bookDatabase_db          =
> DB(self.bookDatabase_storage)
>                 self.bookDatabase_connection  = self.bookDatabase_db.open()
>                 self.dbRoot = self.bookDatabase_connection.root()
> 
>        # our btrees
>         ibTree = BTrees.IOBTree
> 
> I define the key 'books' to be an IOBTree
> 
>         db.dbRoot['books'] = ibTree.IOBTree()
> 
> And when the user creates a store I define the book number to be an IOBTree
> 
>         db.dbRoot['books'][storeNumber] = ibTree.IOBTree()
> 
> Then I save the book using storeNumber and bookNumber as keys.
> 
> 	record = {'title':"The Grapes of Wrath", 'author':"John
> Steinbeck",'publisher':"Randomhouse"}

Better make that

        record = PersistentDict({'title': ...})

Also: mixing tabs and spaces is a bad idea.

>         db.dbRoot['books'][storeNumber][bookNumber] = record
> 
> So now I can qualify an entire store by using
> 
> 				currentStore = db.dbRoot['books'][storeNumber]
> 
> And a single book by
> 
> 				currentBook= db.dbRoot['books'][storeNumber][bookNumber]
> 
> And look at the books in the store by
> 			  for k, v in currentStore.values():

I assume you mean 'items()' rather than 'values()'.

> 						print k, v
> 
> I have two questions.
> 		1)  When I already have a storeNumber and I save a record to 
> 
>              			db.dbRoot['books'][storeNumber][bookNumber] = record
> 
> 		 I have to set the _p_changed flag on the 'books' IOBTREE structure to get
> the book to save.
> 
> 				db.dbRoot['books']._p_changed = True

No, you shouldn't have to.  IOBTree objects notice item additions and
removals automatically.  They also notice replacements, if you do things
like

  btree[existing_key] = new_item
  
but not changes made inside a stored item, for example

  btree[existing_key].attribute = new_value

If you store only persistent objects (such as PersistentDict, or custom
classes inheriting from Persistent) and immutable objects (such as
strings and ints) in your database, you will never have to worry about
setting _p_changed.

> 		Which means that it saves the ENTIRE 'books' IOBTREE' structure every time
> a save a single book.

It should save the top-most 'books' IOBTree (pointlessly, because it
wasn't actually changed) and the store IOBTree only, and none of the
other store BTrees.  Also, I believe it should only store those BTree
buckets that actually changed.  But note that since you use
nonpersistent dicts, they are saved as part of the BTree bucket that
holds them, so changing one book in a store writes out again every
book record in the affected bucket.

> (at least it looks like it is doing this).  When I
> edit a book and save it the database grows by more than 64k.  And it looks
> like it will get worse and worse as more books are added.

Change the records to be PersistentDicts and you'll see space savings.

> 		Am I looking at this correctly.  Or am I doing something really ignorant?

You are doing one thing wrong: storing nonpersistent mutable objects.

> 		2)  When I assign a value to the db such as 
> 
>              		db.dbRoot['books'][storeNumber][bookNumber] = record
> 
> 			I initially assumed that setting the _p_changed flag on the storeNumber
> key would only save the single record that I want to save.   As mentioned
> above, I am setting the flag on the 'books' IOBTREE also.  Should I have to
> set it on both.  I have come to the conclusion that the _p_changed flag must
> be set at the highest key level.  ie.. 'books'.  Again, Am I doing something
> really ignorant?  

_p_changed should be set on the nearest enclosing persistent object,
so with the data structure you have, you'd only need to set it on the
store

    db.dbRoot['books'][storeNumber]._p_changed = True

whenever you change a value inside a book record, for example

    db.dbRoot['books'][storeNumber][bookNumber]['author'] = 'Someone'

I'm only telling you this for illustration; the recommended way is to
never store nonpersistent mutable objects in the database.

> Thank you for your responses.  I really need to know and get this fixed
> before my wife divorces me!  She spends time entering books and we still
> seem to not really know when the changes are going to permanently be saved.  

Ouch.

Marius Gedminas
-- 
"Nothing ever goes missing that they don't look at me, ever since that
time I lost my horse. As if that could be helped. He was white and it
was snowing, what did they expect?"
                -- Dolorous Edd in "A Storm of Swords" by George R. R. Martin