[Zope-CMF] CMFSetup: non-ascii text

Florent Guillaume fg at nuxeo.com
Fri Jul 1 14:43:09 EDT 2005


On 1 Jul 2005, at 20:17, Dieter Maurer wrote:
> Florent Guillaume wrote at 2005-7-1 17:19 +0200:
>
>> In many places, CMFSetup exports and imports things like titles and
>> descriptions. For instance, for the workflow states and transitions.
>> These fields can often, outside the USA, contain non-ascii strings.
>> How do we export and reimport them ?
>>
>> 1. We can export by converting them to unicode, and the ZPT will
>> render that as UTF-8. Which charset do we assume ? Anything better
>> than "locale.getlocale()[1] or 'latin1'" ?
>>
>> 2. When importing, do we set the values (attributes) as unicode, or
>> do we try to re-convert to the above charset...
>
> I think we should keep all text as Unicode -- even in
> English-speaking environments....
>
> If this is not an option, the external format should use Unicode
> and some configuration parameter (Plone uses "site_encoding")
> converts from/to the external Unicode.

There's actually little problem for the export: we can always infer
an encoding from somewhere, be it locale.getlocale, site_encoding,
default_encoding or something like that, and export as UTF-8, the
default XML encoding.
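
To make the export side concrete, here is a rough sketch, not actual CMFSetup code. It is written for Python 3, where "bytes" plays the role of the encoded str discussed here and "str" plays the role of unicode. The names guess_charset, export_text and the site_encoding argument are made up for the illustration; the fallback chain is the one quoted above.

```python
import locale


def guess_charset(site_encoding=None):
    """Pick the charset assumed for stored byte strings.

    site_encoding is a hypothetical argument standing in for a
    configuration value such as Plone's "site_encoding".
    """
    # Explicit setting first, then the locale, then latin1 as a last resort.
    return site_encoding or locale.getlocale()[1] or 'latin1'


def export_text(value, site_encoding=None):
    """Return UTF-8 bytes for a title or description.

    Byte strings are decoded with the guessed charset first;
    text that is already unicode is encoded directly.
    """
    if isinstance(value, bytes):
        value = value.decode(guess_charset(site_encoding))
    return value.encode('utf-8')
```

With an explicit charset the result does not depend on the machine's locale, which is probably what one wants for a reproducible export.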

The problem is on import: you don't really know whether a non-ascii
string should be stored as unicode or as an encoded str. And the
software may break if it gets something it doesn't expect...
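
The import-side choice could look something like the sketch below. Again this is only an illustration under assumptions: Python 3 "bytes" stands in for the encoded str and "str" for unicode, and import_text, charset and keep_unicode are hypothetical names. The importer cannot guess keep_unicode from the XML; whoever knows the target attribute has to supply it.

```python
def import_text(text, charset='latin1', keep_unicode=False):
    """Convert text parsed from the XML into the form the target expects.

    text is the unicode value read from the UTF-8 export.
    keep_unicode is a hypothetical per-attribute flag.
    """
    try:
        # Pure-ascii values are unambiguous, store them as plain strings.
        return text.encode('ascii')
    except UnicodeEncodeError:
        pass
    if keep_unicode:
        # The target code is known to cope with unicode.
        return text
    # Otherwise re-encode to the charset assumed at export time.
    return text.encode(charset)
```

Nothing here solves the real question, it only shows where the decision has to be made.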

Our immediate problem was with workflow transition descriptions
that contained accents. In that case the DCWorkflow code does str()
anyway, so I know it's a string, but I wanted a general opinion.

I can't find the time to fix that before the upcoming 1.5.2 release
in any case, so I'll do that later.

Florent

-- 
Florent Guillaume, Nuxeo (Paris, France)   CTO, Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   fg at nuxeo.com



