[Zope3-dev] Re: Make AbsoluteURL produce quoted urls
Bjorn Tillenius
bjoti777 at student.liu.se
Tue Jun 1 12:33:12 EDT 2004
On Tue, Jun 01, 2004 at 02:47:45PM +0100, Stuart Bishop wrote:
> On 01/06/2004, at 1:39 PM, Philipp von Weitershausen wrote:
>
> >Bjorn Tillenius wrote:
> >>> No, I think there's a quite easy solution. AbsoluteURL.__call__
> >>> should return unicode so that one has all the options when using it
> >>> from Python. AbsoluteURL.__str__ should return whatever __call__
> >>> would return, but encoded in UTF8. If I remember correctly, TALES
> >>> first evaluates __str__ before __call__, so the path expressions
> >>> would still be fine.
> >>I'm almost fine with this, __str__ should still return ascii, it
> >>should quote the url instead (but maybe that's what you meant).
> >
> >Indeed the spec (http://www.ietf.org/rfc/rfc2718.txt, section 2.2.5)
> >suggests::
> >
> > Unless there is some compelling reason for a particular scheme to
> > do otherwise, translating character sequences into UTF-8 (RFC 2279)
> > [3] and then subsequently using the %HH encoding for unsafe octets
> > is recommended.
> >
> >So, __str__ could indeed first encode to UTF-8 and then urlquote so we
> >end up with %HH. I can't come up with a good use case for wanting a
> >string but not quoted, so having either unicode or a quoted string
> >would be enough and easily implemented.
>
> It is impossible to convert a Unicode URL to an ASCII string and
> *not* have it quoted.
That's true, but nobody wanted to do that anyway. The question was
whether to return ascii or utf-8 (or another encoding).
> I would prefer the AbsoluteURL to be a subclass of unicode, so:
>
> >>> url = URL(u'http://www.ol\xe9.de/\xc7/page_\u2160.html')
> >>> unicode(url)
> u'http://www.ol\xe9.de/\xc7/page_\u2160.html'
> >>> str(url)
> 'http://www.xn--ol-cja.de/%C3%87/page_%E2%85%A0.html'
> >>> url.urlencode()
> 'http://www.xn--ol-cja.de/%C3%87/page_%E2%85%A0.html'
I don't like the urlencode method. I think an AbsoluteURL should be a
valid URL; if you want to do something special, like converting it to
unicode, you should have to do something extra, not the other way around.
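For reference, here is a rough sketch of how an ASCII-only form like the one in Stuart's example (IDNA host name plus UTF-8 %HH path) could be derived. It assumes Python 3 and only its standard library; the helper name to_ascii_url is made up for illustration and is not part of Zope:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Return an ASCII-only form of a text URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Host names are converted with IDNA (Punycode), not percent-encoding.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    # quote() encodes text to UTF-8 by default, then applies %HH quoting.
    path = quote(parts.path, safe="/")
    # Simplification: query and fragment are passed through untouched,
    # and any user:password part of the URL is dropped.
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


print(to_ascii_url("http://www.ol\xe9.de/\xc7/page_\u2160.html"))
```

The host and the path need different treatment, which is one reason it is unclear whether domain-name handling belongs in AbsoluteURL at all.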
So, I want to make the following changes::
  * Add __unicode__, which will of course return a unicode string.
  * Change __str__ so that it takes the unicode url, encodes it to
    utf-8, and urlquotes it before it gets returned.
No change will be made to __call__'s behaviour. I also won't change the
way domain names are treated; I'm not even sure it's AbsoluteURL's
responsibility to do that.
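The two changes listed above could look roughly like the following. This is a minimal sketch, not the actual Zope 3 code: the class name AbsoluteURLSketch and its constructor are invented, the set of "safe" characters is an assumption, and it is written for Python 3, where __unicode__ is an ordinary method rather than a special one and urllib.parse.quote stands in for the urlquote step. __call__ is left out because its behaviour is not supposed to change:

```python
from urllib.parse import quote


class AbsoluteURLSketch:
    """Hypothetical stand-in for the AbsoluteURL view."""

    def __init__(self, url):
        # The unquoted text URL, possibly containing non-ASCII characters.
        self._url = url

    def __unicode__(self):
        # Unquoted text form; callers have to ask for it explicitly.
        return self._url

    def __str__(self):
        # Encode to UTF-8 first, then %HH-quote every octet outside the
        # safe set. The safe set is an assumption chosen so that the
        # usual URL delimiters survive unchanged.
        return quote(self._url.encode("utf-8"), safe="/:?&=#")


url = AbsoluteURLSketch("http://localhost/caf\xe9/index.html")
print(str(url))
print(url.__unicode__())
```

With this arrangement the default string form is always plain ASCII, and the unquoted form is the one that takes an extra step.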
> The last syntax was to make TALES nicer:
>
> <a tal:attributes="href someurl/urlencode" tal:content="someurl" />
>
> I prefer that to the proposed use of __call__, which would mean I
> would have to write the above as:
>
> <a tal:attributes="href someurl" tal:content="python: someurl()" />
If you wanted to do that (which once again is a bad idea due to the world
not using a single encoding), you could do:
<a tal:attributes="href someurl" tal:content="someurl/__unicode__" />
Maybe it could be considered to add a 'unicode' method or something to
make it cleaner, but I won't do it.
Regards,
Bjorn