[Zope3-dev] Re: zope.tal.xmlparser.XMLParser() dislikes unicode

Philipp von Weitershausen philipp at weitershausen.de
Sun Jan 14 08:59:35 EST 2007


Andreas Jung wrote:
> Hi,
> 
> the XMLParser.parseString() method raises an exception
> 
>  File "/opt/python-2.4.4/lib/python2.4/unittest.py", line 260, in run
>    testMethod()
>  File 
> "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/tests/test_xmlparser.py", 
> line 127, in test_xx
>    self._run_check(xml, ())
>  File 
> "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/tests/test_xmlparser.py", 
> line 106, in _run_check
>    parser.parseString(source)
>  File 
> "/Users/ajung_data/sandboxes/Zope/Zope/lib/python/zope/tal/xmlparser.py", 
> line 77, in parseString
>    self.parser.Parse(s, 1)
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 
> 43-48: ordinal not in range(128)
> 
> if the string to be parsed is a unicode string and contains some
> non-ascii chars. The following snippet from a private unittest
> (test_xmlparsers.py) shows the error.
> 
>    def test_xx(self):
>        xml = unicode('<?xml version="1.0" 
> encoding="utf-8"?><foo>üöä</foo>', 'iso-8859-15')
>        self._run_check(xml, ())
> 
> I am not sure if this behavior is intentional?! Is the XMLParser supposed
> to deal with unicode strings or will it only accept a standard Python 
> string?

Traditionally, you parse an 8-bit string, figure out its encoding (e.g. 
from <?xml encoding="utf-8"?>) and return some representation of that 
XML with unicode data. That's why it's actually quite ok for XML parsers 
to accept only 8-bit string data.
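
To illustrate the traditional model, here is a minimal sketch using the
standard library's expat binding directly (not zope.tal). It is written
for current Python, where 8-bit strings are `bytes` and unicode is `str`;
the helper name `text_of` is made up for this example.

```python
import xml.parsers.expat


def text_of(xml_bytes):
    """Parse an 8-bit XML document and return its character data as text."""
    chunks = []
    # No encoding argument: expat reads the encoding from the XML declaration.
    parser = xml.parsers.expat.ParserCreate()
    # Expat may deliver character data in several pieces, so collect them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)
    return "".join(chunks)


umlauts = "\u00fc\u00f6\u00e4"  # the "üöä" from the failing test

latin1_doc = ('<?xml version="1.0" encoding="iso-8859-1"?><foo>%s</foo>'
              % umlauts).encode("iso-8859-1")
utf8_doc = ('<?xml version="1.0" encoding="utf-8"?><foo>%s</foo>'
            % umlauts).encode("utf-8")

# Different bytes on the wire, same unicode data after parsing.
assert latin1_doc != utf8_doc
assert text_of(latin1_doc) == text_of(utf8_doc) == umlauts
```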

With ZPTs it's a bit different: when editing ZPTs TTW, for example, we 
like to store their source as unicode. So it makes sense for us to be 
able to parse unicode input as XML.

> A workaround inside parseString() would be to check for unicode
> and convert the string on-the-fly to a Python string with utf-8 encoding.
> This is possibly a limitation of the underlying Expat parser...any 
> recommendation how to deal with this issue?

Fixed it in 3.3 and trunk. If you had given me a bit more time, this 
could even have been in 2.10.2b :). Oh well, I guess that's what 2.10.2 
will be for ;)
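
For illustration only, the workaround Andreas describes could be
sketched roughly like this. This is not the code that went into 3.3 or
trunk; it uses the standard library's expat binding in current Python
(where unicode is `str`), and `parse_xml_text` is a hypothetical helper
name. The one subtlety is that after encoding to UTF-8, the declared
encoding in the document may no longer match the bytes, so the sketch
asks expat to override it.

```python
import xml.parsers.expat


def parse_xml_text(source):
    """Parse XML given as unicode text or as bytes; return its character data."""
    chunks = []
    if isinstance(source, str):
        # Unicode input: encode on the fly and tell expat the bytes are
        # UTF-8, overriding whatever the XML declaration claims.
        data = source.encode("utf-8")
        parser = xml.parsers.expat.ParserCreate("utf-8")
    else:
        # 8-bit input: let expat honour the declared encoding as usual.
        data = source
        parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


umlauts = "\u00fc\u00f6\u00e4"
doc = '<?xml version="1.0" encoding="iso-8859-1"?><foo>%s</foo>' % umlauts

# The same document parses identically as unicode text and as encoded bytes.
assert parse_xml_text(doc) == umlauts
assert parse_xml_text(doc.encode("iso-8859-1")) == umlauts
```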


-- 
http://worldcookery.com -- Professional Zope documentation and training
2nd edition of Web Component Development with Zope 3 is now shipping!

