[ZODB-Dev] OOBTree: Persistent Objects as keys

Toby Dickenson tdickenson@geminidataloggers.com
Fri, 8 Feb 2002 10:53:53 +0000


On Thursday 07 February 2002 12:18 pm, Jim Fulton wrote:
>Toby Dickenson wrote:
>> On Wednesday 06 February 2002 12:27 pm, you wrote:
>> >The problem occurs when Python falls back to comparing addresses.
>> >If we could raise an error when this happens, we could avoid building
>> >meaningless BTrees. (Note that a similar problem occurs when hashes
>> >based on addresses of persistent objects are used in hash tables.)
>>
>> I think it's bigger than that. Using __cmp__ comparison also has problems
>> when mixing unicode strings and 8-bit strings.
>
>Let's distinguish two cases:
>
>1. Comparison raises an error, preventing a key from being used.
>
>2. Comparison is inconsistent.
>
>I consider case 1 to be a totally different class of problem than
>case 2. I find case 1 to be perfectly acceptable.

But mixing unicode and plain strings doesn't consistently raise an
exception... it depends on the combination of the values of those strings. It
is possible to insert and delete strings without triggering an exception, but
then an exception is raised when looking up one of those contained strings.
(Hmmm. I'm sure that's true, but I've not been able to put together an example
test case so far.... I may have some time to try again later.)

I find that just as unacceptable as the simple inconsistent-comparison case
(though it is a harder problem to fix).
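
A rough analogue of a value-dependent comparison failure, for illustration
only. This is not the unicode case described above. It assumes Python 3, uses
tuples as keys, and uses a sorted list with the standard bisect module as a
stand-in for a BTree. Two tuples compare fine when their first items differ,
but raise TypeError when the first items tie and the second items are a str
and an int. The lookup() helper is hypothetical.

```python
import bisect

# Stand-in for an ordered container; a real BTree is NOT used here.
keys = []

# Both inserts succeed: the only comparison made is (3, 0) < (1, "a"),
# which is decided by the first items (3 vs 1) alone.
bisect.insort(keys, (1, "a"))
bisect.insort(keys, (3, 0))

def lookup(key):
    """Return True if key is present, using binary search over keys."""
    i = bisect.bisect_left(keys, key)
    return i < len(keys) and keys[i] == key

found = lookup((3, 0))      # fine: never compares "a" with an int

try:
    lookup((1, 5))          # first items tie, so "a" < 5 is attempted
    failed = False
except TypeError:
    failed = True
```

Whether an operation raises depends on which stored keys the search happens
to visit, so earlier operations succeeding says nothing about later ones.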

> I've said it before,
>but I'll say it again. Keys in BTrees are ordered. If that doesn't work
>for you, pick another data structure. ;)

I 100% agree with this. 

>This arises from the fact that Python falls back to address comparison when
>it can't find an application-defined (IOW meaningful) comparison.
>Comparing by address is acceptable for some applications, but is totally
>inappropriate for persistent data.
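
A small sketch of why an address says nothing about stored data. It assumes
Python 3 (which no longer has the address fallback and raises TypeError
instead), a hypothetical Record class, and a pickle round trip of the object
state as a stand-in for storing and reloading persistent objects.

```python
import pickle

class Record:
    """Hypothetical stored object with no comparison methods of its own."""
    def __init__(self, name):
        self.name = name

originals = [Record("a"), Record("b")]

# Simulate storing the object state and rebuilding the objects later.
stored = pickle.dumps([r.name for r in originals])
reloaded = [Record(name) for name in pickle.loads(stored)]

# The state survives the round trip...
names_match = [r.name for r in reloaded] == [r.name for r in originals]

# ...but identity does not: each reloaded object is a new object, so any
# ordering derived from id() is unrelated to the data that was stored.
same_identity = any(a is b for a, b in zip(originals, reloaded))
```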

....

>Just to make sure we're clear here, I'm suggesting a comparison
>exactly like Python's built-in comparison without the address comparison
>fallback. It would use whatever type-specific comparison was provided.

Ok, I misunderstood your proposal.... Python's standard __cmp__ has some 
logic which is arbitrary, but which is not based on address comparison. For 
example, all numeric types are smaller than most other stuff. Your proposed 
comparison function would raise an exception when comparing 1.0 and "hello", 
right? That sounds good.
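
A minimal sketch of such a comparison, assuming Python 3, where ordering
unrelated types already raises TypeError instead of falling back to an
arbitrary rule. strict_cmp is a hypothetical name, not part of BTrees or of
the proposal as written.

```python
def strict_cmp(a, b):
    """Return -1, 0 or 1, using only the comparisons the types define.

    Hypothetical helper: raises TypeError rather than inventing an order.
    """
    if a == b:
        return 0
    # In Python 3, "<" raises TypeError when neither type defines an
    # ordering against the other; that error is deliberately not caught.
    if a < b:
        return -1
    if b < a:
        return 1
    raise TypeError(f"no consistent ordering for {a!r} and {b!r}")

same_kind = strict_cmp(1, 2.5)      # numbers order among themselves
texts = strict_cmp("b", "a")        # strings order among themselves

try:
    strict_cmp(1.0, "hello")        # no type-specific comparison exists
    mixed_raised = False
except TypeError:
    mixed_raised = True
```

With this helper, a key that cannot be ordered against the existing keys is
rejected with an error rather than being placed arbitrarily.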