[Zope3-Users] Re: Unicode for Stupid Americans (like me)?

Gary Poster gary at zope.com
Wed Feb 28 21:08:03 EST 2007


On Feb 28, 2007, at 5:37 PM, Paul Winkler wrote:

> On Wed, Feb 28, 2007 at 02:06:39PM -0700, Jeff Shell wrote:
>> On 2/28/07, Philipp von Weitershausen <philipp at weitershausen.de> wrote:
>>> That's sorta what zope.publisher does. Actually, it figures that if the
>>> browser sends an Accept-Charset header, the stuff that it's sending to us
>>> would be encoded in one of those encodings, so it tries the ones in
>>> Accept-Charset until it's lucky. It falls back to UTF-8.
>>>
>>> This seems to work. But yeah, it's relying on implementation details of
>>> the browser and it's weird.
>>
>> Ugh. I don't know how I missed that header. I was always looking for a
>> content-type on the post, hoping that it had the information.
>
> I'm rather late to this particular party, and I'm far from an expert
> on either Unicode or HTTP, but I have to ask: Is it just me, or is
> HTTP's support for specifying encodings completely inadequate?
>
> As far as I can tell, there are only two relevant headers.  The
> request may specify Accept-Charset, whose meaning is given as "what
> character sets are acceptable for the *response*" (emphasis mine).
> The response may specify Content-Type, which again is irrelevant to
> the request.  If there's anything that allows the client to specify
> the encoding in use *for the request data*, I don't see it.
>
> That seems like quite an oversight to make as late as HTTP 1.1 (1999).
> What am I missing?

It's been years since I dug into this, but I'm better than 90% sure  
that the browser is expected to make its requests in the encoding of  
the response (i.e., the one set by Content-Type).  It's been too long  
for me to tell you if that's in a spec or if it is simply the de  
facto rule, though I suspect the former.
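
For anyone who wants to see the fallback Philipp describes spelled out,
here is a rough sketch in Python. It is not zope.publisher's actual code;
the function names parse_accept_charset and decode_request_bytes are made
up for illustration, and the "replace" error handling on the final UTF-8
fallback is my own assumption. It tries each charset from Accept-Charset
(highest q first, skipping "*") and falls back to UTF-8.

```python
def parse_accept_charset(header):
    """Return charset names from an Accept-Charset value, best q first.

    Hypothetical helper: the "*" wildcard and q=0 entries are skipped.
    """
    entries = []
    for position, item in enumerate(header.split(",")):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        if not name or name == "*":
            continue
        quality = 1.0  # q defaults to 1 when it is not given
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending q, then by original header order.
            entries.append((-quality, position, name))
    return [name for _, _, name in sorted(entries)]


def decode_request_bytes(raw, accept_charset=""):
    """Decode raw request bytes by trying each accepted charset in turn."""
    for charset in parse_accept_charset(accept_charset):
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next one.
            continue
    # Assumption: fall back to UTF-8 and replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")
```

Note that a single-byte charset such as ISO-8859-1 will decode any byte
string without error, so "until it's lucky" can mean "until it reaches
the first charset that never fails", whether or not that guess is right.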

Gary



