[Zope3-dev] Re: Interface declaration API

Wed, 12 Mar 2003 13:04:31 -0500

At 10:22 AM 3/12/03 -0500, Jim Fulton wrote:
>Phillip J. Eby wrote:
>>At 04:35 PM 3/11/03 -0500, Jim Fulton wrote:
>>>That's what I was thinking of. Hm. I'll have to ponder a non-descriptor
>>>implementation. It might make things cleaner if we decide to change the
>>>inheritance semantics.  I was thinking in terms of descriptors because
>>>they avoid having to sniff to decide if something is a class, which is
>>>*really really* painful in the presence of security proxies.
>>
>>I'm not sure why you need this.  Consider:
>>def directlyProvidedBy(ob):
>>     if '__provides__' in ob.__dict__:
>>         return ob.__provides__
>>     spec = ob.__provides__ = InterfaceSpecification()
>>     return spec
>>def implementedBy(klass):
>>     if '__implements__' in klass.__dict__:
>>         return klass.__implements__
>>     # XXX copy or merge from __bases__
>>     spec = klass.__implements__ = InterfaceSpecification()
>>     return spec
>>def providedBy(ob):
>>     return directlyProvidedBy(ob) + implementedBy(ob.__class__)
>>This should work fine for "plain" objects, classes, metaclasses, and 
>>"turtles all the way up".  :)  The only sticky bit is the presence of 
>>'__dict__'; but if something's __class__ doesn't have a __dict__, it's 
>>not going to be something whose ultimate metatype is 'type'.  So the 
>>above does need some try/except handling for a missing __dict__ 
>>attribute.  The except block would need to check if the non-dicted 
>>thing's __class__ has a "data descriptor" for that attribute name, and if 
>>so, use it.  This would allow __slots__ to be usable for the __provides__ 
>>and __implements__ attributes.
>>AFAICT, no part of this approach either requires or prohibits the use of 
>>descriptors.  But the __dict__ check *has* to take place (no matter how 
>>you implement this, with descriptors or not!) because you otherwise can't 
>>distinguish between an explicit and an implicit attribute value, when you 
>>don't know if what you're dealing with is a "class" in some sense.
>
>The problem is that, in the presence of security proxies, you don't have
>access to the __dict__ attribute. All you have access to is attributes.
>
>Descriptors have a major benefit, in the presence of security proxies,
>which is that, within the descriptor, you get access to an unproxied object.
>But, I realized since yesterday that the descriptor need not actually
>store all of the information.

It's true that the descriptor needn't store everything.  In fact, it 
needn't store *anything*.  You could write descriptors that did the 
__dict__ check on an attribute of another name.  However, this wouldn't be 
helpful if the class doesn't *have* this descriptor.
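
A minimal sketch of that idea, written for current Python.  The names 
ProvidesDescriptor and '_provides_spec' are made up for illustration, and a 
plain set stands in for InterfaceSpecification:

```python
class ProvidesDescriptor(object):
    """Non-data descriptor that stores nothing itself.

    The value lives in the instance __dict__ under a different,
    hypothetical key ('_provides_spec').
    """
    key = '_provides_spec'

    def __get__(self, ob, typ=None):
        if ob is None:
            return self
        d = ob.__dict__                 # the __dict__ check happens here
        if self.key not in d:
            d[self.key] = set()         # stand-in for InterfaceSpecification()
        return d[self.key]


class WithDescriptor(object):
    __provides__ = ProvidesDescriptor()


class Plain(object):
    pass


a = WithDescriptor()
a.__provides__.add('IFoo')
stored = dict(a.__dict__)               # data sits under '_provides_spec'
plain_has_it = hasattr(Plain(), '__provides__')
```

The Plain class shows the limitation: nothing put the descriptor there, so 
there is nothing to call.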

>  It can walk class bases if it wants to.  So, I may actually
>be able to avoid some of the weird hacks, unless I can't.
>
>It's time for me to start prototyping things.

I can save you (some) trouble by pointing out that descriptors will not 
help you in the situation where the class or instance never declares any 
interfaces at all:

class Foo: pass

foo = Foo()

providedBy(foo)    # will fail without ability to check the __dict__

Descriptors are only helpful if they're *there*.  If you need to have 
modify-on-query (or even delayed modification), you'll need to be able to 
*insert* the descriptor into the class.

With __dict__ manipulation, you can do this.  With a bit of trickery, you 
can even do it to classes -- even built-in types.  This is a nice extra 
benefit.

With descriptors only, you are powerless to do this to anything that 
doesn't have the descriptor already present.
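
Here is a rough, runnable version of the __dict__ approach for ordinary 
classes and instances, in current Python.  The InterfaceSpecification below 
is only a stand-in, merging from __bases__ is left out, and the built-in-type 
trickery is not attempted:

```python
class InterfaceSpecification(object):
    """Stand-in container; only the operations used below are provided."""

    def __init__(self, interfaces=()):
        self.interfaces = list(interfaces)

    def __add__(self, other):
        return InterfaceSpecification(self.interfaces + other.interfaces)


def directlyProvidedBy(ob):
    # Look only at ob's own namespace, never at its class or bases.
    if '__provides__' in ob.__dict__:
        return ob.__dict__['__provides__']
    spec = InterfaceSpecification()
    setattr(ob, '__provides__', spec)       # modify-on-query
    return spec


def implementedBy(klass):
    if '__implements__' in klass.__dict__:
        return klass.__dict__['__implements__']
    spec = InterfaceSpecification()         # merging from __bases__ omitted
    setattr(klass, '__implements__', spec)
    return spec


def providedBy(ob):
    return directlyProvidedBy(ob) + implementedBy(ob.__class__)


class Foo(object):
    pass


class Bar(Foo):
    pass


foo = Foo()
empty = providedBy(foo).interfaces          # no declarations anywhere

implementedBy(Foo).interfaces.append('IFoo')
foo_spec = providedBy(foo).interfaces
bar_spec = providedBy(Bar()).interfaces     # Bar gets its own, empty spec
```

Because the lookup goes through __dict__, Bar does not silently pick up 
Foo's specification; a bare getattr would have returned Foo's.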

Last, but not least, attribute access has some tricky semantics.  You may 
be able to rule these out as not affecting what you want to do, but given 
an object 'ob':

1) An attribute may be defined on 'ob' itself (in its instance dict or a slot)
2) An attribute may be defined by 'ob.__class__' or its bases
3) An attribute may be defined by 'ob.__bases__' if ob is a class

If you don't want to special-case for whether 'ob' is a class, you can't 
use bare attribute access in the absence of a descriptor.
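
A small illustration of why bare attribute access can't tell these cases 
apart.  The helper defined_directly is hypothetical, and vars() only works 
for objects that actually have a __dict__:

```python
class Base(object):
    marker = 'declared on Base'


class Derived(Base):
    pass


def defined_directly(ob, name):
    """True only if name lives in ob's own namespace (hypothetical helper)."""
    return name in vars(ob)


ob = Derived()

# Bare attribute access succeeds in all three cases...
found = [getattr(x, 'marker') for x in (ob, Derived, Base)]

# ...but only Base actually carries the declaration.
direct = [defined_directly(x, 'marker') for x in (ob, Derived, Base)]
```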

Also, descriptors have another peculiarity...  if you define a "data 
descriptor" (i.e. one that implements __set__), then the precedence rules 
for attribute access are *different*.  Data descriptors take precedence 
over what's in __dict__, and this can make metaclass access "weird".  Example:

class Meta(type):
    aProp = property(...)

class Class(object):
    __metaclass__ = Meta
    aProp = property(...)

ob = Class()

In the above code, accessing 'ob.aProp' will call the __get__ of 
Class.__dict__['aProp'], but accessing 'Class.aProp' will call the __get__ 
of Meta.__dict__['aProp'].  This means that you can't just write a 
'__get__' that looks to see whether it's being called on the instance or 
the class in that case.

Again, I don't know if this will impact what you have in mind; it's hazy to 
me right now whether you plan to use descriptors with __set__ or just 
__get__.  But you should be aware of this because it impacts interface 
declaration in the presence of metaclasses, if you use descriptors.
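
To make the precedence difference concrete, here is a sketch in current 
Python syntax (the 'metaclass=' keyword replaces the __metaclass__ 
attribute).  GetOnly, Meta2 and Class2 are names invented for the contrast 
case:

```python
class Meta(type):
    @property
    def aProp(cls):
        return 'from Meta'


class Class(object, metaclass=Meta):
    @property
    def aProp(self):
        return 'from Class'


ob = Class()
on_instance = ob.aProp      # property found in Class.__dict__
on_class = Class.aProp      # data descriptor on Meta wins over Class.__dict__


class GetOnly(object):
    """Non-data descriptor: defines __get__ but not __set__."""

    def __init__(self, label):
        self.label = label

    def __get__(self, ob, typ=None):
        return (self.label, ob)


class Meta2(type):
    attr = GetOnly('meta')


class Class2(object, metaclass=Meta2):
    attr = GetOnly('class')


# Without __set__, the entry in Class2.__dict__ takes precedence, and its
# __get__ is called with ob=None, so it can tell it was reached via the class.
via_class = Class2.attr
via_instance_label = Class2().attr[0]
```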

>Maybe. I think that this framework is important enough that we
>shouldn't get too hung up on complexity. Performance is important, *but*,
>I've found that to get reasonable performance, you really need to employ
>caching in ways that make the actual algorithm performance less important.

Absolutely; the PEAK framework makes heavy use of a primitive for defining 
attributes based on a lazy, cached, one-time computation.  (Think of a 
computed attribute that is computed only once.)  I already have code for 
all that in C (Pyrex, actually) that can even deal with all the metaclass 
funkiness and threadsafety.  So I have a pretty good idea of what the 
complexity is.

But, the algorithm for determining *whether* you can use the cached result 
is critical to performance, since it must be executed on every query.
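
For readers who haven't seen the pattern, a compute-once attribute can be 
sketched as a non-data descriptor that drops its result into the instance 
__dict__.  This is not PEAK's code; 'once' and Report are illustrative names, 
and the sketch makes no attempt at thread safety or metaclass handling:

```python
class once(object):
    """Compute an attribute on first access, then cache it on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, ob, typ=None):
        if ob is None:
            return self
        value = self.func(ob)
        # No __set__ is defined, so this __dict__ entry shadows the
        # descriptor on every later lookup.
        ob.__dict__[self.name] = value
        return value


class Report(object):
    calls = 0

    @once
    def total(self):
        Report.calls += 1
        return sum(range(10))


r = Report()
first = r.total
second = r.total
```

Here the "can I use the cached result?" test is just the ordinary instance 
__dict__ lookup, which is about as cheap as that check can get.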

>For we'll need to cache isImplementedBy calls, making the efficiency
>of the uncached computation less important.

Actually, by having each InterfaceSpecification instance simply keep a set or 
dictionary of the declared interfaces and have a __contains__ method that 
delegates to the dictionary, you wouldn't need an explicit cache:

def isImplementedBy(self, ob):
     return (self in implementedBy(ob.__class__)
             or self in directlyProvidedBy(ob))
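
Fleshed out into something runnable, under the assumption that interfaces 
are hashable objects.  The Interface class and the add() method are 
placeholders, not a proposed API:

```python
class InterfaceSpecification(object):
    """Keeps declared interfaces in a dict so membership tests are cheap."""

    def __init__(self):
        self._declared = {}

    def add(self, iface):
        self._declared[iface] = True

    def __contains__(self, iface):
        return iface in self._declared      # delegate to the dictionary


def implementedBy(klass):
    if '__implements__' not in klass.__dict__:
        setattr(klass, '__implements__', InterfaceSpecification())
    return klass.__dict__['__implements__']


def directlyProvidedBy(ob):
    if '__provides__' not in ob.__dict__:
        setattr(ob, '__provides__', InterfaceSpecification())
    return ob.__dict__['__provides__']


class Interface(object):
    def __init__(self, name):
        self.name = name

    def isImplementedBy(self, ob):
        return (self in implementedBy(ob.__class__)
                or self in directlyProvidedBy(ob))


IFoo = Interface('IFoo')
IBar = Interface('IBar')


class Foo(object):
    pass


implementedBy(Foo).add(IFoo)
foo = Foo()

foo_is_ifoo = IFoo.isImplementedBy(foo)
foo_is_ibar_before = IBar.isImplementedBy(foo)
directlyProvidedBy(foo).add(IBar)
foo_is_ibar_after = IBar.isImplementedBy(foo)
```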

>Let's put the implementation issues aside.

Oops.  :)  I'm going to send this message anyway, because I think you 
should know about some of the peculiarities of descriptors.  I've spent a 
*lot* of time learning about them the hard way.