[Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode

Dieter Maurer dieter at handshake.de
Tue Jan 16 14:06:32 EST 2007


Martijn Faassen wrote at 2007-1-15 15:44 +0100:
> ....
>Hey,
>
>On 1/15/07, Andreas Jung <lists at zopyx.com> wrote:
>[snip]
>> ok, got it. But this problem can be solved easily by changing the encoding
>> within the preamble.
>
>I would say refusing to guess and bailing out with an error message is
>better in this case.

I disagree with you.

  Logically, parsing an encoded XML document consists of two
  passes: decode the encoded string into unicode and reconstruct
  the XML info elements from the serialization.

  Traditionally, these two passes are not performed one after
  the other but folded together in a single pass.
  
  But that tradition should not prevent us from separating out the
  (Unicode) decoding phase. And after this phase is done,
  there is no ambiguity left with the "XML declaration".
  Its encoding attribute is simply irrelevant for the second phase
  (apart from generating the PI info element).

  Thus, there is no guessing; someone else has just performed
  the first phase of the complete process -- maybe using the
  "encoding" attribute or some overriding information.
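
  A minimal sketch of this two-phase view, assuming Python's standard
  xml.etree.ElementTree rather than zope.tal.xmlparser. The helper
  names decode_phase and parse_phase are hypothetical and only serve
  as illustration; the point is that the second phase can run with
  an explicit encoding override, so the declaration's "encoding"
  attribute no longer influences the result.

```python
import xml.etree.ElementTree as ET


def decode_phase(raw, encoding):
    """Phase 1: turn the encoded byte string into unicode text.

    The caller decides the encoding, either from the XML declaration
    or from overriding information (e.g. a transport header).
    """
    return raw.decode(encoding)


def parse_phase(text):
    """Phase 2: rebuild the XML tree from already-decoded text."""
    # Re-encode to a known encoding and tell the parser to use it.
    # The explicit encoding overrides whatever the declaration claims.
    parser = ET.XMLParser(encoding="utf-8")
    parser.feed(text.encode("utf-8"))
    return parser.close()


# The declaration says iso-8859-1, and phase 1 honours that.
raw = '<?xml version="1.0" encoding="iso-8859-1"?><doc>caf\u00e9</doc>'.encode(
    "iso-8859-1"
)
text = decode_phase(raw, "iso-8859-1")

# Phase 2 sees unicode only; the stale declaration is not consulted.
root = parse_phase(text)
print(root.tag, root.text)
```

  Whether a parser should accept such input silently or insist that
  the declaration be rewritten is exactly the design question under
  discussion; the sketch only shows that the second phase needs no
  guessing once decoding has been done.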

-- 
Dieter

