[ZODB-Dev] Re: What makes the ZODB slow?

Fri Jun 23 13:17:54 EDT 2006

On 23 Jun 2006, at 17:55, Andreas Jung wrote:
> --On 23. Juni 2006 17:51:35 +0200 Florent Guillaume <fg at nuxeo.com>  
> wrote:
>> BTrees perform best when keys' prefixes are randomly distributed.
>> So if your application generates keys like 'foo001', 'foo002',...  
>> you'll
>> get lots of conflicts. Same for consecutive integers in IOBTree.
>
> hm..are you sure about that?

It all depends on the concurrency for the use of these consecutive  
ids really.

The problem is bucket splits. A bucket split cannot be resolved by  
conflict resolution code of BTrees.

Let's say B is the size of a bucket and you have N leaf buckets in  
the whole BTree.
If you use consecutive ids, you'll get a bucket split every B/2  
inserts (assuming buckets are half-filled on average).
If you use random ids, you'll get a bucket split on average every N*B/ 
2 inserts.
All this roughly (I'm ignoring details like internal nodes).

If two processes concurrently use sequential ids from the same pool  
at the same time, I'd say there one in B chances of getting a  
conflict error. It's only one in (N*B)^2 if the ids are random.

All back-of-the-envelope calculations of course...

Florent

-- 
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   fg at nuxeo.com