[ZODB-Dev] Spatial indices

Nitro nitro at dr-code.org
Wed Jun 16 11:45:35 EDT 2010


Hello,

I tried to find a spatial index which integrates seamlessly into the ZODB.  
Unfortunately I did not find a satisfying solution anywhere. So I came up  
with three solutions how this could be implemented:

1) Write a native r-tree package, just like the current BTrees. Would  
likely have to be written in C for performance.

2) Make use of the existing B+ Trees by using a space filling curve such  
as the Z-curve or Hilbert curve to transform higher-dimensional data into  
1D data which can then be stored in a BTree. Since B+ trees also provide  
range querying capabilities this should give good query performance.  
Unsure how much speed-up a C implementation of the insert/query functions  
would give. More info:  
http://www.scholarpedia.org/article/B-tree_and_UB-tree and  
http://www.drdobbs.com/184410998 .

3) Use the already existing Rtree package from  
http://pypi.python.org/pypi/Rtree . It's a thin wrapper of a C library, so  
it should be very fast. I can see two methods to make this work:

3a)
- Create an rtree.RTree (stored in a separate file) and an OOTreeSet.
- inserting: insert item into BTree. Then insert item's oid into Rtree.
- querying: user supplies bounding box, rtree is queried, oids are  
returned. look up objects by oid in BTree.
- zeo: does not work out-of-the-box with zeo since the Rtrees on different  
machines are not synchronized.

3b)
- Create an rtree.RTree, a OOTreeSet and an IOTree. Difference to 3a):  
Create RTree with a custom storage manager (example:  
http://svn.gispython.org/svn/spatialindex/spatialindex/trunk/src/storagemanager/MemoryStorageManager.h  
and  
http://trac.gispython.org/spatialindex/browser/spatialindex/trunk/README#storage-manager  
). This storage manager stores each page into the IOTree (key: pageId,  
value: pageData).
- inserting: insert item into BTree. Then insert item's oid into Rtree.  
Causes storage manager to write out changed rtree pages to IOTree.
- querying: user supplies bounding box, rtree is queried, pages for rtree  
returned from IOTree, oids finally returned from query. look up objects by  
oid in BTree.
- zeo: works out-of-the-box with zeo since the rtree pulls its data from a  
btree (which is hooked up with zeo).

Conclusion:

1) Native r-tree package: It is a lot of work which has already been done  
before. Bug-prone. Ruled out.
2) Spatial index on top of current BTrees: Looks interesting, could be  
done in python. Disadvantages: unclear UB tree patent situation, unclear  
how much work this really is.
3a) Does not work with zeo out-of-the-box. Ruled out.
3b) Requires writing a custom storage manager for the rtree package  
(likely in C). Provides different trees. Basic technology (rtrees +  
btrees) is tested.

Would it make sense to add a default spatial index to ZODB? Does anybody  
of you have any experience with one of the mentioned solutions? Is anybody  
else interested in having a zodb spatial index?

-Matthias


More information about the ZODB-Dev mailing list