[ZODB-Dev] Large BTrees (WAS: IIBTree.multiunion and list comprehensions)

Mon Dec 15 16:27:41 EST 2003

Good morning Christian,

I should have said that processing a BTree of 100000 deals takes quite
a while in Quantax (30 minutes), and this is acceptable, as we also do
some quite involved calculations.

My major concern in tuning is memory, not speed (2 GB limit), and
I spent considerable effort on calling cache collections at the right
points during the transaction (and on ensuring that cached objects
actually are released, without making references to them disappear
"too fast" - hence my usage of _v_ attributes to store "back pointers"
to weak refs, which I mentioned a couple of days ago). I find memory
profiling outright hard in Python - any hints appreciated.
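
The back-pointer idea can be sketched roughly as follows. This is only
an illustration using the standard weakref module, not the actual
Quantax code: the class names Portfolio and Deal are made up, and plain
classes stand in for persistent ones (on a real persistent object, the
_v_ prefix marks the attribute as volatile, so it is not stored).

```python
import gc
import weakref


class Portfolio:
    """Hypothetical container object (name is an assumption)."""

    def __init__(self, name):
        self.name = name
        self.deals = []

    def add(self, deal):
        self.deals.append(deal)
        # Back pointer held as a weak reference, so the deal alone
        # never keeps its container alive. On a persistent object the
        # _v_ prefix would also keep it out of the stored state.
        deal._v_parent = weakref.ref(self)


class Deal:
    """Hypothetical stored object (name is an assumption)."""

    _v_parent = None

    def parent(self):
        # Dereference the weak ref; yields None once the container
        # has been released.
        ref = self._v_parent
        return ref() if ref is not None else None


portfolio = Portfolio("demo")
deal = Deal()
portfolio.add(deal)
alive = deal.parent() is portfolio

# Drop the only strong reference to the container, then collect.
del portfolio
gc.collect()
gone = deal.parent() is None
```

The deal can reach its container while something else holds it, but
the back pointer by itself does not pin the container in memory.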

Wrt time, I noticed that quite some of the time we spend in
Connection.setstate actually goes into file reading (or into Windows
overhead); silly things such as turning off the virus checker on the
DB file brought me a 30% speedup...

I'm using OOBTrees with string tuples as keys (~50 bytes) and objects of
about 15 KB as values. I measured that the venerable PersistentMapping
is sometimes much faster, especially on first sequential reads (after
the cache is cleared), but consumes more memory.

So, no silver bullet, but you didn't expect that :-)

Best regards, Martin

-----Original Message-----
From: Christian Robottom Reis [mailto:kiko at async.com.br] 
Sent: Monday, 15 Dec 2003 20:01
To: Gfeller Martin
Cc: zodb-dev at zope.org
Subject: Re: [ZODB-Dev] RE: IIBTree.multiunion and list comprehensions (Tim Peters)

On Mon, Dec 15, 2003 at 10:02:33AM +0100, Gfeller Martin wrote:
> >> I wonder what's the canonical `extreme ZODB.BTrees application', if
> >> such a thing exists.
> 
> >I indeed expect that catalogs build the biggest BTrees normally seen
> >in practice.
> 
> B-Trees in Quantax consist of a few hundred up to 100000 (rather
> complex) objects. 

That's great to hear. Do you have any performance tips to share when
dealing with the very large ones? What sort of keys do you use in your
BTrees?

We have a fairly simple indexing setup in IndexedCatalog, but as our
first large-scale application grows towards release, I've been doing a
lot of benchmarking and profiling of the slower tasks with some fairly
large BTrees -- up to 50K items currently. Most of the objects aren't
too complex, but the profile runs still put Connection.setstate right
there at #1 (60% of the total time!) when doing our `slow-as-molasses'
test. Pushing and keeping that down is my current goal.
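
A profile run of this kind can be sketched with the standard cProfile
and pstats modules. This is a rough illustration only, not the actual
IndexedCatalog benchmark: load_state and slow_task are made-up
stand-ins for the object-loading step and the slow test.

```python
import cProfile
import io
import pstats
import time


def load_state(n):
    """Hypothetical stand-in for an expensive object-loading step."""
    time.sleep(0.001)
    return list(range(n))


def slow_task():
    """Hypothetical stand-in for the slow benchmark."""
    return sum(len(load_state(100)) for _ in range(20))


profiler = cProfile.Profile()
profiler.enable()
result = slow_task()
profiler.disable()

# Write the report to a string, sorted by cumulative time, so the
# most expensive call chains come first.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
```

In a real run against ZODB, the line to look for in the report would
be the one for Connection.setstate.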

Take care,
--
Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331