charlie at begeistert.org
Sun Jan 18 16:30:10 EST 2009
On 18.01.2009 at 20:36, Dieter Maurer wrote:
> The "Accept-Charset" request header should *never* be used
> to guess a charset at the server side:
> "Accept-Charset" is a user preference which does not know
> anything about charsets used by the server.
> If "utf-8" would not be treated with preference in the
> current code, the code base would see massive problems.
> Only the server knows which charsets it is using -- and it should
> use a single one (with very few exceptions).
> There should be a configuration option that tells this charset
> and this should be used to decode form data.
I very much appreciate that your knowledge of the specifications, and
more particularly of Zope internals, is greater than mine. I am,
however, not suggesting that accept-charset be used any more than it
already is by Zope, for precisely the reasons you give.
From the current HTML specification:
"accept-charset = charset list [CI]
This attribute specifies the list of character encodings for input
data that is accepted by the server processing this form. The value is
a space- and/or comma-delimited list of charset values. The client
must interpret this list as an exclusive-or list, i.e., the server is
able to accept any single character encoding per entity received."
i.e., exactly as you have suggested: it is possible to force a client to
encode data in a particular charset before sending it to the server.
All references I have come across suggest that this, together with the
content-type meta tag, can and should be used to coerce browsers to use
UTF-8. On the other hand, whenever CMFDefault.utils.decode is called,
the extremely unreliable getBrowserCharset() is used, which will
usually return iso-8859-1. It is probably down to the way I have set
my site up, but as a result I currently have problems when using
different browsers unless I override the default adapter.
Regarding my current configuration:
default-zpublisher-encoding = utf-8
default-charset = utf-8
All content objects are edited through formlib-derived forms and data
is stored as Unicode. With a default CMF install I have not been able
to work with non-ASCII strings across OS and browser boundaries. If
possible I will try to create test cases that demonstrate the problems.
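A test case of this kind might look roughly like the following. This is only a sketch using the Python standard library, not Zope's publisher or CMF; browser_submit(), server_decode() and SERVER_CHARSET are names made up for the example, and the charset value is assumed to mirror the configuration quoted above.

```python
from urllib.parse import urlencode, parse_qs

SERVER_CHARSET = "utf-8"  # assumption: mirrors the settings quoted above


def browser_submit(fields, charset):
    """Encode form fields as a browser would when using `charset`."""
    return urlencode(fields, encoding=charset)


def server_decode(body, charset=SERVER_CHARSET):
    """Decode a urlencoded body to unicode strings using one charset."""
    parsed = parse_qs(body, encoding=charset)
    return {key: values[0] for key, values in parsed.items()}


title = "Grüße aus Köln"

# Browser and server agree on UTF-8: the value survives the round trip.
ok = server_decode(browser_submit({"title": title}, "utf-8"))

# Browser sends iso-8859-1 while the server decodes as UTF-8: the
# non-ASCII characters are lost (replaced by U+FFFD).
bad = server_decode(browser_submit({"title": title}, "iso-8859-1"))
```

Comparing ok["title"] and bad["title"] against the original string is the check a real test would make for each browser and charset combination.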