[ZODB-Dev] Re: Questions on ZODB BTrees versus bsddb BTrees

Florent Guillaume fg at nuxeo.com
Fri May 5 10:02:09 EDT 2006


Terry Jones wrote:
> I've just read the ZODB/ZEO Programming Guide (Release 3.6.0,
> A.M. Kuchling, January 5, 2006) and I'd like to ask a few questions.
> 
> Are the following all correct?
> 
>  - The BTrees supported by ZODB have nothing to do with Berkeley DB Btrees
>    (apart from the fact that they're both implementations of BTrees).
> 
>  - ZODB BTrees do not allow duplicate keys (a la bsddb).
> 
>  - ZODB BTrees do not allow you to set up secondary keys (a la bsddb).

Yes.

> And:
> 
>  - Is the byValue() ZODB BTree method officially deprecated? Scheduled for
>    deprecation / removal?

No idea, I've never seen it used, I didn't know it existed.

>  - Does byValue() use any kind of secondary index structure, or does it
>    walk the whole tree? I guess the latter, seeing as to implement the
>    former you'd ideally like something like a BTree allowing duplicate
>    keys...

There's no additional datastructure in any case.

> Assuming the first three above are correct, and that I would like both
> duplicate keys and secondary keys in an application with ZODB-like
> transparent persistent object storage, it looks like I have the following
> options:

I don't know what secondary keys are, and what duplicate keys imply.

>  1. Use both ZODB and bsddb in the same application, dealing with things
>     like locking and transactions across both object types in a somewhat
>     manual way.
> 
>     This would have the additional major downside of making it harder to
>     use ZEO for storage, right? (Yes, I could pickle the entire bsddb
>     databases and hand them to ZEO, but that's not really what I had in
>     mind.)
> 
>  2. Try to add support for bsddb BTrees (with duplicate keys) to ZODB. I
>     guess this is feasible, but it looks like it would make for some
>     fundamental changes, as you would no longer have the dictionary-like
>     interface to the keys of such a BTree.
> 
>     I guess this option would take a serious amount of work (which may not
>     even be wanted by the ZODB maintainers).
> 
>  3. Redesign what I'm doing so as to only need ZODB BTrees. This is quite
>     far from optimal (in terms of space and time), but might be the
>     quickest to implement.
> 
>     However, this may not solve the problem. If a fundamental need is to
>     (e.g.) find all values >= X, where duplicates are allowed but you're
>     using trees that do not allow duplicate keys, this is difficult.  Hence
>     the question about byValue().

Well I don't know what duplicate keys entail in term of API. ZODB BTrees 
have unique keys.

For the values stuff, what people usually do is have two BTrees that hey 
keep in sync, one being the reverse of the other. Finding all values >= X is 
then simply a matter of using the proper API on the reverse Tree (and it's 
fast).

Florent

> If someone with more knowledge of ZODB and the BTrees therein and bsddb
> feels like confirming, correcting, or commenting on the above summary, and
> options, that would be great.





-- 
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   fg at nuxeo.com


More information about the ZODB-Dev mailing list