[Zope3-dev] HTTP_ACCEPT_CHARSET header

Stuart Bishop stuart at stuartbishop.net
Wed Jun 30 09:08:07 EDT 2004


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 29/06/2004, at 7:26 PM, Philipp von Weitershausen wrote:

> Log message for revision 25999:
> Many mainstream browsers don't send an HTTP_ACCEPT_CHARSET header.
> zope.publisher uses this header to deduce the encoding of form values;
> if this header is missing though, it didn't convert them at all to 
> unicode.
>
> Since Zope's fallback is 'UTF-8' everywhere whenever an encoding is not
> specified, it should also fallback to trying to decode incoming form
> data as UTF-8.
>
> Added a test to verify this behaviour.
>
> Thanks to Marius and Bjorn for their advice.

I was wondering if just ignoring HTTP_ACCEPT_CHARSET altogether
would be the sanest approach, or at the very least using a character
set that can encode the entire Unicode space such as UTF-8 or UTF-16
if the browser says it is at all possible.
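
A rough sketch of the second option, in case it helps the discussion.
The helper names (parse_accept_charset, choose_output_charset) are
made up for illustration and are not part of zope.publisher. The
sketch also ignores the rule that ISO-8859-1 is implicitly acceptable
when the header does not mention it. The idea is simply: prefer a
charset that covers all of Unicode whenever the browser does not rule
it out, and only then fall back to whatever else it listed.

```python
def parse_accept_charset(header):
    """Return a {charset: q} dict parsed from an Accept-Charset value."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a charset listed without a q parameter counts as q=1
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs[name.strip().lower()] = q
    return prefs


def choose_output_charset(header):
    """Pick a response charset, preferring ones that cover all of Unicode."""
    if not header:
        # Many browsers send no header at all; use UTF-8 in that case.
        return "utf-8"
    prefs = parse_accept_charset(header)
    for candidate in ("utf-8", "utf-16"):
        # An explicit entry wins over the "*" wildcard.
        q = prefs.get(candidate, prefs.get("*", 0.0))
        if q > 0:
            return candidate
    # Nothing Unicode-capable is allowed: take the browser's top choice.
    acceptable = [(q, name) for name, q in prefs.items()
                  if q > 0 and name != "*"]
    if acceptable:
        return max(acceptable, key=lambda pair: pair[0])[1]
    return "utf-8"
```

With a header such as "ISO-8859-1,utf-8;q=0.7,*;q=0.7" this picks
utf-8 even though the browser ranks ISO-8859-1 higher, which is the
point of the suggestion.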

An example of when this is necessary is users pasting data into
HTML forms from other applications. The browser will send the
data in the character set the page is encoded in, and choose some
other arbitrary character set that can encode it if this cannot be done.
So when I paste some text from MS-Word into that nice ISO-8859-1
form Zope3 sent me (because my browser said it would prefer it),
I get a UnicodeEncodeError because Safari helpfully sent it as
UTF-8 since “Smart Quotes” and ISO-8859-1 don't mix.
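
The mismatch is easy to reproduce outside the publisher. The snippet
below is only an illustration and assumes Python 3 string semantics
(it is not the Zope code path): the curly quotes have no ISO-8859-1
encoding, so the browser has to switch to something like UTF-8, and
decoding those bytes with the charset the server assumed gives back
the wrong text.

```python
# Text as pasted from a word processor, with curly "smart" quotes.
text = "\u201cSmart Quotes\u201d"

# The page was served as ISO-8859-1, but these characters do not exist there.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# So the browser submits the form data in another charset, here UTF-8.
sent = text.encode("utf-8")

# Decoding with the charset the server assumed silently yields garbage,
# while decoding as UTF-8 recovers the original text.
misread = sent.decode("iso-8859-1")
roundtrip = sent.decode("utf-8")
```

Here latin1_ok ends up False, misread differs from the original
text, and roundtrip matches it.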

This approach also assumes that the HTTP_ACCEPT_CHARSET will not
change between requests, which nobody promises.

- --  
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (Darwin)

iD8DBQFA4rtMAfqZj7rGN0oRArXjAJ4hVIlgNrRfhNcWE6ihFpNvg5ceBACgjYbi
/WA35rh3W2frojEod+CbGK8=
=wB+u
-----END PGP SIGNATURE-----


