[ZODB-Dev] OOBTree: Persistent Objects as keys

Jim Fulton jim@zope.com
Mon, 11 Feb 2002 16:00:52 -0500


Toby Dickenson wrote:
> 
> On Thursday 07 February 2002 12:18 pm, Jim Fulton wrote:
> >Toby Dickenson wrote:
> >> On Wednesday 06 February 2002 12:27 pm, you wrote:
> >> >The problem occurs when Python falls back to comparing addresses.
> >> >If we could raise an error when this happens, we could avoid building
> >> >meaningless BTrees. (Note that a similar problem occurs when hashes
> >> >based on addresses of persistent objects are used in hash tables.)
> >>
> >> I think it's bigger than that. Using __cmp__ comparison also has problems
> >> when mixing unicode strings and 8-bit-wide strings.
> >
> >Let's distinguish two cases:
> >
> >1. Comparison raises an error, preventing a key from being used.
> >
> >2. Comparison is inconsistent.
> >
> >I consider case 1 to be a totally different class of problem than
> >case 2. I find case 1 to be perfectly acceptable.
> 
> But mixing unicode and plain strings doesn't consistently raise an
> exception... it depends on the combination of the values of those strings. It
> is possible to insert and delete strings without triggering an exception, but
> then an exception is raised when looking up one of those contained strings.
> (Hmmm. I'm sure that's true, but I've not been able to put together an example
> test case so far.... I'll maybe have some time to try again later)

We need to verify this.  My understanding is that comparing strings
and unicode requires converting the strings to unicode, which will
raise an error if the strings can't be converted. This seems to be
a consistent rule. I suppose that, possibly, you might hit different
strings under different conditions. Hm.
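
To make the conversion rule concrete, here is a minimal sketch in
present-day Python, where bytes and str stand in for 8-bit strings and
unicode. The helper name compare_mixed is hypothetical, and the explicit
ASCII decode is an assumption standing in for the implicit conversion
discussed above; it is not code from ZODB or BTrees.

```python
def compare_mixed(byte_key, text_key):
    """Hypothetical helper: three-way compare of a bytes key and a str key."""
    # Assumption: the implicit conversion is emulated by an explicit ASCII decode.
    # decode() raises UnicodeDecodeError if byte_key holds any non-ASCII byte.
    as_text = byte_key.decode("ascii")
    return (as_text > text_key) - (as_text < text_key)


# ASCII-only bytes convert cleanly, so the comparison succeeds.
print(compare_mixed(b"apple", "banana"))

# A non-ASCII byte (0xE9) cannot be converted, so the comparison raises.
try:
    compare_mixed(b"caf\xe9", "banana")
except UnicodeDecodeError as exc:
    print("comparison failed:", exc.reason)
```

Whether the error shows up depends only on the value of the 8-bit
string. Since a lookup compares against only some of the stored keys,
that would be consistent with an error appearing for some lookups and
not for others.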

> I find that just as unacceptable as the simple inconsistent-comparison case.
> (but a harder problem to fix)
> 
> > I've said it before,
> >but I'll say it again. Keys in BTrees are ordered. If that doesn't work
> >for you, pick another data structure. ;)
> 
> I 100% agree with this.
> 
> >This arises from the fact that Python falls back to address comparison when
> >it can't find an application-defined (IOW meaningful) comparison.
> >Comparing by address is acceptable for some applications, but is totally
> >inappropriate for persistent data.
> 
> ....
> 
> >Just to make sure we're clear here, I'm suggesting a comparison
> >exactly like Python's built-in comparison without the address comparison
> >fallback. It would use whatever type-specific comparison was provided.
> 
> Ok, I misunderstood your proposal.... Python's standard __cmp__ has some
> logic which is arbitrary, but which is not based on address comparison. For
> example, all numeric types are smaller than most other stuff.

Are you sure about that?

> Your proposed
> comparison function would raise an exception when comparing 1.0 and "hello",
> right? 

Right.
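
A rough sketch of what such a comparison could look like, written in
present-day Python purely as an illustration. The name strict_cmp and
the rule "numbers compare with numbers, otherwise the types must match"
are assumptions made for this sketch, not the actual proposal or any
BTrees API.

```python
import numbers


def strict_cmp(a, b):
    """Hypothetical three-way comparison with no arbitrary fallback ordering."""
    both_numbers = isinstance(a, numbers.Real) and isinstance(b, numbers.Real)
    # Assumption: values of unrelated types have no meaningful order.
    if not both_numbers and type(a) is not type(b):
        raise TypeError(
            "no meaningful ordering between %s and %s"
            % (type(a).__name__, type(b).__name__)
        )
    # Use whatever type-specific comparison the objects provide.
    # Types that define no ordering raise TypeError here on their own.
    if a < b:
        return -1
    if a > b:
        return 1
    if a == b:
        return 0
    raise TypeError("values are neither ordered nor equal")


print(strict_cmp(1, 2.5))       # numbers compare with numbers
print(strict_cmp("a", "a"))     # same type, type-specific comparison

try:
    strict_cmp(1.0, "hello")    # mixed types are rejected
except TypeError as exc:
    print("rejected:", exc)
```

With a function like this, a key that cannot be ordered is rejected at
insertion time (case 1 above) instead of being placed at an arbitrary
position in the tree.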

> That sounds good.

OK, I'll get some help from the PythonLabs dudes on this.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!        
CTO                  (888) 344-4332            http://www.python.org  
Zope Corporation     http://www.zope.com       http://www.zope.org