[ZODB-Dev] BTrees questions
Jeremy Hylton
jeremy@zope.com
Mon, 17 Dec 2001 16:35:49 -0500 (EST)
>>>>> "MF" == Martijn Faassen <faassen@vet.uu.nl> writes:
MF> Jeremy Hylton wrote:
>> I think you should be using the BTree rather than the Bucket.
>> The BTree creates buckets to store elements as needed, but I
>> don't think you're supposed to create them yourself.
MF> I figured it was something like that, but it didn't say
MF> anywhere. Can I create sets myself, btw?
I don't know anything about sets :-(.
MF> I'm trying to use the ZODB to store and index large quantities
MF> of XML data. I'm especially interested in fast retrieval of data
MF> and I need some form of join algorithms. Here I figure the BTree
MF> module intersection etc operators can come in handy -- are they
MF> supposed to be fast?
I think that's the intent. I don't know whether some operations are
faster than others.
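
For what it's worth, one general reason set operations over sorted
keys can be cheap is that two ascending sequences can be intersected
in a single pass, without hashing or re-sorting. Here is a rough
pure-Python sketch of that idea. It is not the BTrees code, and the
function name and sample data are made up for illustration:

```python
def intersect_sorted(a, b):
    """Intersect two ascending sequences of unique keys in one pass."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1            # a is behind, advance it
        elif a[i] > b[j]:
            j += 1            # b is behind, advance it
        else:
            result.append(a[i])   # key present in both inputs
            i += 1
            j += 1
    return result


# Hypothetical document ids matched by two different index lookups.
docs_with_title = [2, 5, 8, 13, 21]
docs_with_author = [3, 5, 13, 34]
common = intersect_sorted(docs_with_title, docs_with_author)
```

The work is proportional to the combined length of the two inputs,
which is the property you would want from a join over index results.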
MF> What happens if I keep a BTree in-memory anyway? I mean, if I
MF> just create a BTree as a local variable, does the entire
MF> structure exist in memory then? And how efficient is this? :)
A BTree is a persistent object and so is each of its buckets. If you
have a large BTree and look up a specific key, you'll only touch a
very small number of nodes or buckets, and only those need to be
loaded. So that's very memory efficient.
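
As a back-of-the-envelope check on "very small number", here is a
sketch that estimates how many levels a lookup walks through. The
fan-out of 100 children per node is purely an assumption for the
arithmetic, not the actual BTree node size, and the helper name is
made up:

```python
def levels_needed(num_keys, fanout):
    """Smallest number of levels so that fanout ** levels >= num_keys."""
    levels = 1
    capacity = fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


# One million keys with an assumed fan-out of 100.
touched = levels_needed(1000000, 100)
```

Under that assumption a lookup among a million keys visits a handful
of nodes, and the rest of the tree can stay out of memory.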
If you do a cache minimize or full sweep, individual buckets can get
deactivated, which means their state is dropped from memory until
they are needed again. So that, again, helps control memory usage.
If you can use one of the integer BTrees (IO, OI, II), you'll save
memory because the int part is stored directly in the BTree as a C
int instead of as a PyObject * pointing to a Python int object.
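
To get a feel for the size of that saving, here is an analogy using
only the standard library. It compares a list, which holds pointers
to int objects, with an array that stores raw C ints inline. It does
not measure BTrees themselves, and the numbers depend on the Python
build:

```python
import sys
from array import array

n = 10000
boxed = list(range(1000, 1000 + n))   # list of pointers to int objects
packed = array('i', boxed)            # raw C ints stored inline

# The list cost includes the pointer table plus every int object.
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
# The array cost is its header plus the inline C ints.
packed_bytes = sys.getsizeof(packed)

print(boxed_bytes, packed_bytes)
```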
>> I've heard that there are docs somewhere, but I'm not sure where.
MF> There's Interfaces.py, but that's all I could find so far.
Ah, yes. I think that's it.
Jeremy