[Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode

Andreas Jung lists at zopyx.com
Mon Jan 15 07:42:00 EST 2007



--On 15 January 2007 13:26:16 +0100 Martijn Faassen 
<faassen at startifact.com> wrote:

>
> How would you propose to parse the following unicode string?
>
> u'<?xml version="1.0" encoding="ISO-8859-1"?><foo />'

If your parser is Unicode-aware, the encoding given in the preamble
does not matter, since you already have Unicode internally and can process
your document purely at the XML level.
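
A minimal sketch of that case, assuming a present-day Python 3
xml.parsers.expat (where every string handed to a handler is already
Unicode). The helper name collect_text and the sample document are made up
for illustration:

```python
import xml.parsers.expat


def collect_text(document):
    """Parse `document` (str or bytes) and return its character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver text in several pieces, so gather and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


# The declaration claims ISO-8859-1, but the input is already a Unicode
# string and contains a character (the euro sign) that ISO-8859-1 lacks.
unicode_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><foo>\u20ac 10</foo>'
unicode_text = collect_text(unicode_doc)

# For comparison: real ISO-8859-1 bytes, where the declaration does matter.
latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><foo>\u00e9</foo>'.encode("iso-8859-1")
latin1_text = collect_text(latin1_doc)
```

With str input the declared encoding is not used to decode anything, while
with bytes input it is what tells the parser how to decode them.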

If your parser isn't Unicode-aware, you will likely convert the input to
UTF-8 and work internally with UTF-8 encoded strings. In fact,
xml.parsers.expat seems to support Unicode (it can return Unicode strings
to the handlers; see the 'returns_unicode' property). However, you need to
regenerate the XML preamble when you reconstruct your XML from the
parsed data.
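
Roughly what I have in mind, as a sketch only: the helper names
to_utf8_document and with_declaration are hypothetical, the regular
expression only handles a declaration at the very start of the input, and
the code is written for Python 3 rather than the Python 2 of this thread.

```python
import re
import xml.parsers.expat

# Matches the encoding pseudo-attribute of a leading XML declaration.
_DECL_ENCODING = re.compile(
    r'^(<\?xml[^>]*?encoding=)(["\'])[A-Za-z][A-Za-z0-9._-]*\2'
)


def to_utf8_document(text):
    """Return `text` as UTF-8 bytes whose declaration also says UTF-8."""
    fixed = _DECL_ENCODING.sub(r'\g<1>\g<2>UTF-8\g<2>', text, count=1)
    return fixed.encode("utf-8")


def with_declaration(body, encoding):
    """Re-attach a preamble when turning parsed data back into XML."""
    declaration = '<?xml version="1.0" encoding="%s"?>' % encoding
    return (declaration + body).encode(encoding)


source = '<?xml version="1.0" encoding="ISO-8859-1"?><foo>\u00e9</foo>'
utf8_bytes = to_utf8_document(source)

# Feed the UTF-8 bytes to expat; the rewritten declaration now matches them.
chunks = []
parser = xml.parsers.expat.ParserCreate()
parser.CharacterDataHandler = chunks.append
parser.Parse(utf8_bytes, True)
parsed_text = "".join(chunks)

# On output the preamble has to be generated again for the target encoding.
output = with_declaration('<foo>%s</foo>' % parsed_text, 'ISO-8859-1')
```

The point is that the declaration is rewritten on the way in and generated
afresh on the way out, so it always agrees with the bytes it sits in.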

Or am I missing something?

Andreas