[Zope3-dev] Re: Apache rewrite rules and URLs: an experiment
Jim Fulton
jim at zope.com
Thu Nov 4 16:04:15 EST 2004
Bjorn Tillenius wrote:
> On Thu, Nov 04, 2004 at 11:01:12AM -0500, Jim Fulton wrote:
>
>>Bjorn Tillenius wrote:
>>
>>>On Thu, Nov 04, 2004 at 09:48:56AM -0500, Jim Fulton wrote:
>>>
>>>
>>>>Peter Mayne wrote:
>>>>
>>>>
>>>>>If I try the above <tal:block> when I access Zope directly, it works.
>>>>>However, if I access it via Apache, I get:
>>>>>
>>>>>...
>>>>>File "C:\opt\Python23\Lib\site-packages\zope\tal\talinterpreter.py",
>>>>>line 451, in do_insertText_tal
>>>>> text = self.engine.evaluateText(stuff[0])
>>>>>File
>>>>>"C:\opt\Python23\Lib\site-packages\zope\app\pagetemplate\engine.py",
>>>>>line 105, in evaluateText
>>>>> return unicode(text)
>>>>>File
>>>>>"C:\opt\Python23\Lib\site-packages\zope\app\traversing\browser\absoluteurl.py", line 101, in __unicode__
>>>>> return urllib.unquote(self.__str__()).decode('utf-8')
>>>>>AttributeError: 'unicode' object has no attribute 'decode'
>>>>
>>>>That's odd.
>>>>
>>>>
>>>>
>>>>>I'm not even going to think about why this is happening.
>>>>
>>>>Suit yourself. Someone should think about why it's happening.
>>>
>>>
>>>I would guess that some variable that apache sets to determine the host
>>>is being represented as a unicode string.
>>
>>But it gets to Zope via HTTP, which is an ASCII subset. The publisher
>>is supposed to give all of this to Zope decoded. IOW, the input data
>>to getApplicationURL should always be unicode. I guess getApplicationURL
>>encodes. (? I don't remember the details.)
>
>
> Right, sorry, I was temporarily confused... But I still suspect that
> some of the 'host variables' are unicode and some aren't. I guess that when
> virtual hosting is used, it sets some variable as a unicode string. So,
> I guess that all HTTP variables should be unicode then?
That's a good question. So, Zope gets variables as strings.
Some of these might be encoded, like maybe the server URL.
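
To illustrate the mixed-type problem, here is a minimal sketch in modern Python 3 (not the Python 2.3 code from the traceback). The helper name to_text and the sample values are hypothetical and are not part of Zope. It normalizes values that may arrive either as encoded bytes or as already-decoded text, which avoids calling decode on something that is already text:

```python
def to_text(value, charset="utf-8"):
    """Return text, decoding only when the value is still encoded bytes.

    Hypothetical helper for illustration; not a Zope API.
    """
    if isinstance(value, bytes):
        return value.decode(charset)
    # Already text: decoding again would fail, so pass it through.
    return value


# Assumed sample input: one part still encoded, one already decoded.
server_url_parts = [b"http://example.com", "/folder"]
url = "".join(to_text(part) for part in server_url_parts)
```

In Python 3 the text type has no decode method at all, so the failure is analogous to the AttributeError in the traceback above.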
> I've looked at
> the code several times before, but haven't been able to find some
> documentation about it.
Can't help you there. :)
>
>>Hm, the interface for getApplicationURL doesn't say whether the returned
>>value is encoded. It needs to say this. The interface needs to be fixed
>>IOW.
>>
>>Given:
>>
>>- We expect a URL
>>
>>- URLs must be URL encoded
>>
>>- *Before* URL encoding, we need to utf-8 encode
>>
>>Then the output of getApplicationURL must certainly be a utf-8-url-encoded
>>string.
>
>
> Yes, that's what's happening for the path part of the URL. I guess that
> no one cared to encode the host part, since it should only contain
> ascii characters.
Is that true any more?
> I will also update the interface documentation for URL and getURL. I
> assume those should be encoded the same way as getApplicationURL?
Yes, URLs should always be assumed to be utf-8 encoded and then url encoded.
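
As a rough sketch of that two-step encoding, using Python 3's urllib.parse rather than the Python 2.3 urllib of the time: the function name encode_url_path and the sample path are assumptions made for illustration, not what getApplicationURL actually does.

```python
from urllib.parse import quote


def encode_url_path(path):
    """UTF-8 encode a text path, then percent-encode the resulting bytes.

    Hypothetical helper for illustration; not a Zope API.
    """
    utf8_bytes = path.encode("utf-8")  # step 1: text -> utf-8 bytes
    return quote(utf8_bytes, safe="/")  # step 2: bytes -> url-encoded ASCII


# Assumed sample path containing non-ASCII characters.
encoded = encode_url_path("/caf\u00e9/men\u00fc")
```

The result contains only ASCII characters, so it can travel over HTTP regardless of the charset of the page that links to it.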
> And while I'm at it, another thing I encountered the last time I was
> digging in the code, although I forgot to bring it up. When the raw http
> request comes to zope, it decodes the URL and stores it as unicode.
> However, it tries to decode the URL using the charset it derives from
> the request. IMHO this is wrong; it should use utf-8 instead, shouldn't
> it?
Absolutely.
> There are at least two problems with the current approach:
>
> * No non-ascii URL is guaranteed to work on every system
>
> * Many browsers, at least Opera, default to utf-8 for URLs
This (utf-8 encoding and then url-encoding) is specified in
an RFC (somewhere :).
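
For the decoding side, a small sketch (again Python 3, with a hypothetical function name and sample URL) of why the charset used to decode the URL matters. The same percent-encoded bytes give different text depending on the charset, so decoding with a request-derived charset can silently produce the wrong path:

```python
from urllib.parse import unquote_to_bytes


def decode_url_path(encoded_path, charset="utf-8"):
    """Undo url-encoding, then decode the bytes with the given charset.

    Hypothetical helper for illustration; not the Zope publisher's code.
    """
    raw = unquote_to_bytes(encoded_path)  # url-decode to raw bytes
    return raw.decode(charset)


# Assumed sample: a path that was utf-8 encoded and then url-encoded.
encoded_path = "/caf%C3%A9"
as_utf8 = decode_url_path(encoded_path)
as_latin1 = decode_url_path(encoded_path, charset="iso-8859-1")
```

Here as_utf8 recovers the original path, while as_latin1 yields two unrelated characters in place of the accented letter.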
Jim
--
Jim Fulton mailto:jim at zope.com Python Powered!
CTO (540) 361-1714 http://www.python.org
Zope Corporation http://www.zope.com http://www.zope.org