[Zope-dev] Re: Decoding of source for text/xml ZPTs

Tres Seaver tseaver at palladion.com
Sat Oct 8 14:16:34 EDT 2005


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Withers wrote:

> During complication, the XML parser that processes non-HTML mode ZPT's
         ^
         +- "compilation", I'm guessing, but see below ;)

> decodes the string of the source into unicode instructions.
> 
> In HTML mode, the parse does no decoding and so we get string instructions.
> 
> My question as a result is: what characterset does the XML parser in
> non-HTML mode assume and can it be controlled in any way?

XML is UTF-8 unless another encoding is specified in the top-level
processing-instruction-like thingy (the "XML declaration"), e.g.:

  <?xml version="1.0" encoding="iso-8859-1"?>

*or* unless the transmission channel spells out the encoding (the HTTP
"Content-Type" header, for instance).  See Mark Pilgrim's rant[1] on
the "insanely compilated" interactions between the Content-Type header
and the document encoding.
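
To make the "transmission channel" case concrete, here is a minimal
sketch, assuming Python 3 and the standard library's ElementTree (which
wraps expat); it is not the code path ZPT itself uses.  The helper names
charset_from_content_type and parse_with_channel_charset are made up for
illustration.  It pulls the charset out of a Content-Type value and hands
it to the parser as an override:

```python
import xml.etree.ElementTree as ET
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def parse_with_channel_charset(body, content_type):
    """Parse XML bytes, letting the channel's charset override the document."""
    charset = charset_from_content_type(content_type)
    # XMLParser's encoding argument, when given, overrides whatever the
    # XML declaration says (or the UTF-8 default when there is none).
    parser = ET.XMLParser(encoding=charset) if charset else ET.XMLParser()
    return ET.fromstring(body, parser=parser)


# Latin-1 bytes with no XML declaration: only the header names the encoding.
body = "<p>caf\u00e9</p>".encode("iso-8859-1")
root = parse_with_channel_charset(body, "text/xml; charset=iso-8859-1")
```

With the header present, root.text comes back as the unicode string
"caf\u00e9", even though the bytes alone are not valid UTF-8.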

XML files on the filesystem *must* be encoded as UTF-8, or have an
explicit encoding in the declaration.
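
A quick way to see the rule for files in action, again as a sketch
assuming Python 3's standard ElementTree rather than the ZPT parser
itself (the variable names are only for illustration): the same text is
parsed three ways, with a declared encoding, as undeclared UTF-8, and as
undeclared Latin-1.

```python
import xml.etree.ElementTree as ET

text = "<p>caf\u00e9</p>"

# 1. Latin-1 bytes with an explicit encoding in the XML declaration.
declared = (
    b'<?xml version="1.0" encoding="iso-8859-1"?>' + text.encode("iso-8859-1")
)
# 2. UTF-8 bytes with no declaration: the default applies.
default_utf8 = text.encode("utf-8")
# 3. Latin-1 bytes with no declaration: the parser assumes UTF-8.
undeclared_latin1 = text.encode("iso-8859-1")

declared_text = ET.fromstring(declared).text
utf8_text = ET.fromstring(default_utf8).text

try:
    ET.fromstring(undeclared_latin1)
    undeclared_ok = True
except ET.ParseError:
    # The lone 0xE9 byte is not a valid UTF-8 sequence.
    undeclared_ok = False
```

The first two parses yield the same unicode text; the third is rejected
as not well-formed, which is the failure you get from a non-UTF-8 file
that lacks a declaration.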

[1] http://diveintomark.org/archives/2004/02/13/xml-media-types



Tres.
- --
===================================================================
Tres Seaver          +1 202-558-7113          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDSA0C+gerLs4ltQ4RAhj+AJ0YVYNJVCmS5Nm7aYm3LMLiq0QUjACdHZge
8S/aikU+0/ZCcBrEZu2fV70=
=0O2y
-----END PGP SIGNATURE-----


