[ZODB-Dev] Indexing and dates/times

Pedro Ferreira jose.pedro.ferreira at cern.ch
Tue Jul 13 04:35:07 EDT 2010


Hello,
>> I am currently trying to devise a way to index and retrieve some
>> millions of objects according to their modification date/time. One of
>> the problems I'm facing is that of index "granularity": I'd like to
>> provide "to the second" granularity,
>>      
> will there ever be more than one item with the same key?
>    

Exactly, that's the problem.

>> but for that I need some structure
>> that lets me do that. So, the options I see are:
>>   - A timestamp-based
>>      
> What do you mean by "timestamp"
>    

Well, it could be a UNIX timestamp.

>> BTree index - looks highly inefficient, as there
>> will be many entries with only one element (probably almost all of
>> them),
>>      
> I have no idea what you mean by this.
>    

That's the problem you've already mentioned above.

So, in a relational DB I would do something like:

SELECT * FROM table WHERE timestamp >= X AND timestamp <= Y

Since I cannot do this with ZODB, I'd have to have a BTree, indexed by 
timestamp... however, as you said, if I want "to the second" 
granularity, I will rarely have two items with the same key (which makes 
it pretty useless).
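
To make the range query concrete, here is a minimal sketch in plain Python of a timestamp-keyed index whose values are sets of item ids. The class and method names (TimestampIndex, add, between) are hypothetical, and a sorted list plus the standard bisect module stands in for the BTree. As far as I know, the BTrees package offers a comparable lookup through the keys(), values() and items() methods, which accept min and max arguments, so the same shape should carry over.

```python
import bisect
from collections import defaultdict


class TimestampIndex:
    """Maps integer UNIX timestamps to sets of item ids, with range lookup.

    Illustrative only: a sorted key list stands in for a BTree.
    """

    def __init__(self):
        self._keys = []                   # sorted list of distinct timestamps
        self._buckets = defaultdict(set)  # timestamp -> ids modified in that second

    def add(self, ts, item_id):
        # Only insert the key once; later items share the same bucket.
        if ts not in self._buckets:
            bisect.insort(self._keys, ts)
        self._buckets[ts].add(item_id)

    def between(self, start, end):
        """Yield ids with start <= timestamp <= end, in timestamp order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        for ts in self._keys[lo:hi]:
            yield from sorted(self._buckets[ts])


index = TimestampIndex()
index.add(1278995700, "a")
index.add(1278995707, "b")
index.add(1278995707, "c")  # same second as "b": shares one bucket
index.add(1279000000, "d")

# Equivalent of: WHERE timestamp >= X AND timestamp <= Y
result = list(index.between(1278995700, 1278995707))
```

Buckets holding a single item are the common case here; the set only matters when two items share the same second.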

So, I was wondering if there is some data structure I can use for this, 
as this seems to be a pretty common use case.
The first thing that comes to my mind is a tree with different levels - 
i.e. year, month, day, hour, minute - with the leaves being sets of items.
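
A rough sketch of that multi-level idea, again in plain Python with nested dicts standing in for persistent containers. The names (DateTree, add, between) are made up for illustration, and the leaves are per-minute sets, so a range query here is only accurate to the minute.

```python
from datetime import datetime


class DateTree:
    """Nested year -> month -> day -> hour -> minute mapping; leaves are sets."""

    def __init__(self):
        self._root = {}

    @staticmethod
    def _path(dt):
        return (dt.year, dt.month, dt.day, dt.hour, dt.minute)

    def add(self, dt, item_id):
        node = self._root
        path = self._path(dt)
        # Create the intermediate levels on demand, then the leaf set.
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node.setdefault(path[-1], set()).add(item_id)

    def between(self, start, end):
        """Return ids whose minute bucket lies within [start, end]."""
        found = set()
        self._walk(self._root, (), self._path(start), self._path(end), found)
        return found

    def _walk(self, node, prefix, lo, hi, found):
        depth = len(prefix)
        for part in sorted(node):
            path = prefix + (part,)
            # Skip subtrees that lie entirely outside the requested range.
            if path < lo[:depth + 1] or path > hi[:depth + 1]:
                continue
            child = node[part]
            if isinstance(child, set):
                found |= child
            else:
                self._walk(child, path, lo, hi, found)


tree = DateTree()
tree.add(datetime(2010, 7, 13, 4, 35, 7), "a")
tree.add(datetime(2010, 7, 13, 4, 36, 0), "b")
tree.add(datetime(2010, 8, 1, 0, 0, 0), "c")

hits = tree.between(datetime(2010, 7, 13, 4, 35), datetime(2010, 7, 31, 23, 59))
```

The pruning step is what the extra levels buy: whole months or days outside the range are never visited.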

Thanks!

Pedro

-- 
José Pedro Ferreira

Indico Team

IT-UDS-AVC

513-R-0042
CERN, Geneva, Switzerland


