[Zope3-dev] Make AbsoluteURL produce quoted urls

Bjorn Tillenius bjoti777 at student.liu.se
Tue Jun 1 02:09:53 EDT 2004


On Tue, Jun 01, 2004 at 12:53:01AM +0100, Stuart Bishop wrote:
> On 31/05/2004, at 10:59 PM, Bjorn Tillenius wrote:
> >On Mon, May 31, 2004 at 08:48:57AM -0400, Stephan Richter wrote:
> >>On Saturday 29 May 2004 14:03, Bjorn Tillenius wrote:
> >>>The only use case I have for AbsoluteURL is that I want to create
> >>>urls to embed into HTML pages. Therefore, I would like to make it
> >>>unicode capable and make it quote the url before it gets returned.
> >>>Right now it can only handle ascii names, which is really bad since
> >>>we allow names to be unicode.
> >>>
> >>>Any objections?
> >>
> >>Another bug that I think you have every right of fixing. :-)
> >>Basically, you are saying that AbsoluteURL should return unicode
> >>instead of ASCII strings, right? If so, go ahead and make the
> >>necessary changes.
> >
> >No, it should still return ASCII strings. It should encode the name to
> >utf-8 and then quote it (thus producing urls containing %xx escapes).
> >Actually, now I'm sure that's the right thing to do, I will fix it
> >tomorrow.
> 
> Shouldn't these URL's only be quoted if they are put in a
> src or href attribute? Only being able to output Unicode URL's
> as encoded ASCII rather defeats the purpose of them.

Yes, that's right. Do you have a use case for returning unicode URLs
instead of ASCII ones? If we return unicode, we have to change every use
of absolute_url in Zope3. That is, we would have to change
'<a tal:attributes="href context/@@absolute_url">' to something that
quotes the URL first. I think that justifies returning quoted ASCII
URLs; if you want something else, you should do something extra.

> <a tal:attributes="python:urlquote(context.absolute_url())"
> tal:contents="context/absolute_url" />

This won't work; you have to do something even uglier. This logic should
go into AbsoluteURL, not in the page template.

> Actually, I would have thought they would not need to be encoded
> anywhere, as the browser takes care of it. I know this is the case
> for Unicode domain names with the major browsers (see
> http://images.stuartbishop.net/idna.html for an example,
> although it is using an obscure Unicode character that is
> not present in many fonts so may look a little wonky if you
> aren't on a Mac. Should still work though.).

See http://www.ietf.org/rfc/rfc2718.txt, Section 2.2.5, which states
that the URL should be quoted. Do we want to break the standards? Also,
say we give an object a name containing '?'. Traversal won't work unless
we quote the URL (we can give the object the name, but we won't be able
to traverse to it).

Just to clarify things, when I'm talking about names, I mean the names
of the objects being traversed. I don't think we should use idna for
non-domain names.
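
A minimal sketch of the encode-then-quote step for object names, written
against the modern Python standard library (urllib.parse) rather than
Zope's actual AbsoluteURL code. The helper names quote_name and
absolute_url are made up for illustration.

```python
from urllib.parse import quote


def quote_name(name):
    # Hypothetical helper: encode the unicode name to UTF-8, then
    # percent-escape every byte outside the unreserved set. safe=""
    # makes sure '/' and '?' inside a name are escaped as well.
    return quote(name.encode("utf-8"), safe="")


def absolute_url(base, names):
    # Hypothetical helper: join the already-ASCII base URL with the
    # quoted names of the objects on the traversal path.
    return "/".join([base.rstrip("/")] + [quote_name(n) for n in names])


print(absolute_url("http://localhost:8080", ["folder", "what?"]))
# prints http://localhost:8080/folder/what%3F
```

With this, a name containing '?' ends up as %3F in the path instead of
starting a query string, so the object stays traversable.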

> Also, on the subject of quoting URL's, do you think Unicode
> domain names in a URL should be encoded using domain.encode('idna')
> or using %xx notation? I suspect IDNA. If absoluteurl returns a
> Unicode string, there will need to be a mechanism provided to
> convert it to ASCII, as it will be non trivial (since the URL will
> need to be split apart and the different components encoded
> separately). I've got a similar conversion tool available
> at http://www.stuartbishop.net/Software/EmailAddress which
> converts Unicode email addresses to ASCII.

I don't have an opinion on that; I know too little about domain name
encodings. I certainly won't change the way it's handled today.
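
For reference, the component-wise conversion described in the quoted
question (IDNA for the domain, %xx for the rest) could look roughly like
the sketch below. It uses Python's built-in 'idna' codec and
urllib.parse. The function name encode_host_and_path is made up, and the
sketch assumes an unquoted unicode URL with a host and no userinfo. The
query and fragment are passed through untouched.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_host_and_path(url):
    # Hypothetical helper: split the URL so each component can be
    # encoded with the scheme that applies to it.
    parts = urlsplit(url)
    # Domain name: IDNA (note that .hostname lowercases the host and
    # drops any userinfo, which this sketch assumes is absent).
    host = parts.hostname.encode("idna").decode("ascii")
    netloc = host if parts.port is None else "%s:%d" % (host, parts.port)
    # Path: UTF-8 plus %xx escapes, keeping '/' as the separator.
    path = quote(parts.path, safe="/")
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


print(encode_host_and_path("http://b\u00fccher.example:8080/a b"))
# prints http://xn--bcher-kva.example:8080/a%20b
```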

Regards,
  Bjorn


