[Grok-dev] Tests, Unicode and Fileencoding
uli at gnufix.de
Sat Nov 17 11:50:19 EST 2007
Jan Ulrich Hasecke wrote:
> starting to use tests while developing my app,
Good move! Go ahead :-)
> I discovered that you
> can only use unicode strings in tests, when you specified the file
> encoding of the testfile.
> So my zoo.txt starts with:
> # -*- coding: utf-8 -*-
> The Online Game GrokZoo
> Is that the intended behaviour?
Though I am not very deep into this topic, I think it is merely the (Python)
default behaviour, not the intended behaviour.
> What is the default encoding /bin/test expects?
/bin/test does not expect a certain encoding. It only looks for tests
and runs them. This is good from my point of view, because others might
prefer other encodings than utf8. I think your test setup code is to
blame instead (at least, if you have 'borrowed' it from me).
> ASCII? So why ASCII?
Registering doctest files as unittest test suites (your example code
looks like it does so) often means calling the Python standard library function
``doctest.DocFileSuite()`` in the test setup. Have a look at your test
``DocFileSuite()`` returns a ``unittest.TestSuite`` and takes the system
standard encoding as default. But you can pass an optional ``encoding``
parameter to set up the docfiles with a certain non-standard encoding.
Something like

  suite = unittest.TestSuite()
  for filename in DOCTESTFILES:
      suite.addTest(doctest.DocFileSuite(
          filename,
          encoding='utf8', ## SET THE ENCODING HERE... ##
      ))

would expect all your doctest files to be utf8 encoded.
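
For reference, here is a self-contained sketch of such a test setup, written for a current Python 3 rather than the Python of this thread. The helper names ``build_suite`` and ``run_demo`` and the doctest text are made up for illustration. Only ``doctest.DocFileSuite()`` and its ``encoding`` and ``module_relative`` parameters come from the standard library. Your real setup will probably look different.

```python
import doctest
import os
import tempfile
import unittest

# Hypothetical doctest file content with one non-ASCII character
# (\u00e4 is the letter a-umlaut), standing in for a file like zoo.txt.
DOCTEST_TEXT = (
    "The Online Game GrokZoo\n"
    "\n"
    "    >>> name = 'K\u00e4nguru'\n"
    "    >>> len(name)\n"
    "    7\n"
)


def build_suite(paths, encoding='utf-8'):
    """Collect the given doctest files into one unittest.TestSuite."""
    suite = unittest.TestSuite()
    for path in paths:
        suite.addTest(doctest.DocFileSuite(
            path,
            module_relative=False,  # treat path as a plain filesystem path
            encoding=encoding,      # the encoding is set here
        ))
    return suite


def run_demo():
    """Write a utf-8 encoded doctest file, run it, return the TestResult."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, 'zoo.txt')
        with open(path, 'w', encoding='utf-8') as f:
            f.write(DOCTEST_TEXT)
        result = unittest.TestResult()
        build_suite([path]).run(result)
        return result


if __name__ == '__main__':
    result = run_demo()
    print(result.testsRun, result.wasSuccessful())
```

With this kind of setup the encoding is decided once in the test setup code, so the individual doctest files do not each have to declare it.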
Marking the doctest files with `# -*- coding: utf-8 -*-` as you did,
also doesn't look like too heavy lifting to me. Interesting, that it
Did you get ``UnicodeError`` before?
> Wouldn't utf-8 be better, since we claim to have unicode everywhere
> in Zope?
There is a difference between 'having unicode' and 'everything is utf8
encoded'. Python's current internal unicode representation for example
is UCS2 or UCS4 if I remember correctly. If you meant that every input
and output from and to 'Zope' should be utf8 (or utf16), then I happily
leave this discussion to the gurus :-)
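
To illustrate the difference between 'having unicode' and 'being utf8 encoded', here is a small sketch in current Python 3 (where ``str`` is the unicode type). The sample text is made up. The point is only that one and the same unicode string can be written out as different byte sequences.

```python
# One unicode string (\u00fc is the letter u-umlaut) ...
text = 'Z\u00fcrich Zoo'

# ... and three different byte encodings of it.
utf8_bytes = text.encode('utf-8')
utf16_bytes = text.encode('utf-16-le')
latin1_bytes = text.encode('latin-1')

# The string has 10 characters, but the byte lengths differ per encoding.
print(len(text), len(utf8_bytes), len(utf16_bytes), len(latin1_bytes))

# Decoding with the matching codec gives back the same unicode string.
assert utf8_bytes.decode('utf-8') == text
assert utf16_bytes.decode('utf-16-le') == text
assert latin1_bytes.decode('latin-1') == text
```

So 'unicode everywhere' inside the application says nothing yet about which encoding is used at the borders, for example when reading a doctest file.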
With the grok.testing extension BTW one could set up a different standard
encoding to be expected in doctest files. This could solve that little
problem for Grok. But to be honest, I don't recognize this as a real
problem and can live without it.