[ZCM] [ZC] 1037/ 7 Comment "Zope unicode patch"

Collector: Zope Bugs, Features, and Patches ... zope-coders-admin at zope.org
Wed Oct 6 11:48:59 EDT 2004


Issue #1037 Update (Comment) "Zope unicode patch"
 Status Accepted, Zope/bug+solution medium
To followup, visit:
  http://zope.org/Collectors/Zope/1037

==============================================================
= Comment - Entry #7 by EIONET on Oct 6, 2004 11:48 am

I just ran into the same problem. I use Zope 2.6.4. As I see it, the problem is that I've set the management_page_charset property to UTF-8 so that folder content lists etc. display correctly, but then the Folder properties page doesn't work. Looking at properties.dtml, I see that if management_page_charset is set manually, then management_page_charset_tag is NOT set, EVEN if management_page_charset is set to UTF-8! I therefore modified properties.dtml to check whether management_page_charset is set to UTF-8 (manually or by default) and, if so, to set management_page_charset_tag to "UTF-8:".
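
A rough Python rendering of the check described above may make the intent clearer. The real logic lives in DTML inside properties.dtml; the function name charset_tag_for and its return shape are hypothetical and only illustrate the decision, not the actual template code.

```python
def charset_tag_for(management_page_charset=None):
    """Return (charset, tag) for the properties form.

    Hypothetical helper: it mirrors the modified check described above,
    not the actual DTML in properties.dtml.
    """
    # Fall back to UTF-8 when no site-wide charset has been configured.
    charset = management_page_charset or "UTF-8"
    # Tag the form fields whenever the effective charset is UTF-8,
    # whether it was set manually or picked up as the default.
    if charset.upper().replace("_", "-") in ("UTF-8", "UTF8"):
        return charset, "UTF-8:"
    # Any other charset: leave the fields untagged (pre-encoded strings).
    return charset, ""
```

With this check, a manually configured "utf-8" and the unset default both yield the "UTF-8:" tag, while a charset such as "koi8-r" leaves the tag empty.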
________________________________________
= Comment - Entry #6 by bjorn on Apr 26, 2004 5:08 am

I think this patch has some good fixes that should be accepted, even if the whole patch isn't accepted.

htrd wrote:
> Zope mostly works fine for:
> * all text in unicode (this is preferred)
> * all text pre-encoded in one encoding (this is supported with
>   "management_page_charset")

Right, but Unicode doesn't actually work (see below).


> It doesn't work well for:
> * mixing unicode with pre-encoded text (which is likely to get more
>   common, which is why "management_page_charset" is likely to be less
>   useful long term)

Mixing pre-encoded text is too difficult.  It's easier to convert all data to Unicode, in many cases.


[...]
> 
> > 2. "manage_properties" works only in the utf-8 charset
> 
> The browser sees an html page using utf-8, but the zope server will
>   correctly transcode properties from their original encoding
>   ("management_page_charset") into utf-8.
> 
> This is necessary so that things do not break when you add a unicode
>   property type to your object. The form needs to support the legacy
>   pre-encoded string and the unicode string.

Currently the ZMI doesn't add :ENCODING to the field names, so the ZPublisher doesn't know how to decode incoming form data (e.g., PropertyManager), resulting in Unicode errors.
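
A minimal sketch of why the encoding suffix matters, written in Python 3 for illustration. The function decode_field and its simplified "name:ENCODING:type" parsing are hypothetical and are not ZPublisher's actual code; the latin-1 fallback is the default assumed elsewhere in this thread.

```python
import codecs


def decode_field(field_name, raw_bytes, default_charset="latin-1"):
    """Split 'name:ENCODING:type' and decode raw_bytes accordingly.

    Hypothetical, simplified stand-in for what a publisher has to do
    with incoming form data; not ZPublisher's actual implementation.
    """
    name, *suffixes = field_name.split(":")
    charset = default_charset
    for suffix in suffixes:
        try:
            codecs.lookup(suffix)  # is this suffix a known encoding?
        except LookupError:
            continue               # no: treat it as a type converter
        charset = suffix
        break
    return name, raw_bytes.decode(charset)


raw = "Привет".encode("utf-8")                     # bytes sent by the browser
tagged = decode_field("title:UTF-8:ustring", raw)  # field name carries the encoding
untagged = decode_field("title:ustring", raw)      # server falls back to latin-1
```

Only the tagged field name lets the server recover the original text; the untagged one is decoded with the wrong charset and comes out garbled.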


> > By default it will use the active locale (-L key).
> 
> This approach is not acceptable, because it makes the behaviour of Zope
>   components dependent on global state. This is a great complication for
>   anyone developing or supporting zope *components*.

This is fine.  The current management_page_charset property does a fine job.  No need to look in the locale.  You could just ignore this part of his patch.


> > If you don't consider it,
> > then working with Zope will be quite complicated for Russian
> >   Zope developers/users.
> 
> The current behaviour is by design, not by accident. If you *really* can't
>   use unicode, using pre-encoded strings rather than unicode is a little
>   more complicated, but not prohibitive.

Pre-encoded strings result in a mess of encode/decode statements scattered throughout the code, increasing the complexity (and the number of bugs) and decreasing performance.  Archetypes suffers from similar problems.


> Last time someone supplied a huge patch similar to yours, they were
>   finally satisfied by a one-line bug fix to properties.dtml. Please report
>   the steps necessary to reproduce your problem and we may be able to
>   resolve it the same way, without having to change the zope unicode
>   architecture.

How about ignoring the locale part of the patch?  I still think the management_page_charset_tag thingie is needed.  If possible, it'd be great if it were easy for non-ZMI code (Plone etc.) to call this tag.  All forms should use it.
________________________________________
= Assign - Entry #5 by htrd on Oct 13, 2003 3:44 am

 Status: Pending => Accepted

 Supporters added: htrd

> Since the 2.6 release, Zope tries to support unicode, but this causes big
> problems for developers who can't use unicode (there are many reasons).

Zope mostly works fine for:
* all text in unicode (this is preferred)
* all text pre-encoded in one encoding (this is supported with "management_page_charset")

It doesn't work well for:
* mixing unicode with pre-encoded text (which is likely to get more common, which is why "management_page_charset" is likely to be less useful long term)

Why can't you use unicode? Surely it is simply the right data *type* for the internal representation?

(fwiw, I personally see this approach as similar to using plain strings to store a decimal representation of integer properties, and then complaining that addition doesn't work. Just use the right type! That's what types are for! Legacy code excepted, of course.)

> 2. "manage_properties" works only in the utf-8 charset

The browser sees an html page using utf-8, but the zope server will correctly transcode properties from their original encoding ("management_page_charset") into utf-8.

This is necessary so that things do not break when you add a unicode property type to your object. The form needs to support the legacy pre-encoded string and the unicode string.
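
The transcoding step described above can be sketched as follows, in Python 3 for illustration. The helper name transcode and the choice of koi8-r as the stored encoding are assumptions made for the example; this is not the code Zope itself runs.

```python
def transcode(raw_bytes, source_charset, target_charset="utf-8"):
    """Re-encode pre-encoded bytes from source_charset into target_charset.

    Hypothetical helper: decode with the charset the property was stored
    in, then encode with the charset the page is served in.
    """
    return raw_bytes.decode(source_charset).encode(target_charset)


# A legacy property stored as koi8-r bytes (assumed for this example).
stored = "Зопе".encode("koi8-r")
# What the browser would receive in a utf-8 page.
served = transcode(stored, "koi8-r")
```

The text itself is unchanged by the round trip; only its byte representation differs between storage and the served page.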

> By default it will use the active locale (-L key).

This approach is not acceptable, because it makes the behaviour of Zope components dependent on global state. This is a great complication for anyone developing or supporting zope *components*.

> If you don't consider it,
> then working with Zope will be quite complicated for Russian Zope developers/users.

The current behaviour is by design, not by accident. If you *really* can't use unicode, using pre-encoded strings rather than unicode is a little more complicated, but not prohibitive.

Last time someone supplied a huge patch similar to yours, they were finally satisfied by a one-line bug fix to properties.dtml. Please report the steps necessary to reproduce your problem and we may be able to resolve it the same way, without having to change the zope unicode architecture.



________________________________________
= Comment - Entry #4 by xenru on Oct 10, 2003 11:32 am


Uploaded:  "patch-description.txt"
 - http://zope.org/Collectors/Zope/1037/patch-description.txt/view
Re-uploaded patch description (in Russian)
________________________________________
= Comment - Entry #3 by xenru on Sep 12, 2003 2:03 pm


Uploaded:  "Zope-270b2-unicode.patch"
 - http://zope.org/Collectors/Zope/1037/Zope-270b2-unicode.patch/view
This is the updated patch for Zope 2.7.0b2
________________________________________
= Comment - Entry #2 by xenru on Sep 3, 2003 3:18 pm


Uploaded:  "patch_description.txt"
 - http://zope.org/Collectors/Zope/1037/patch_description.txt/view
Full patch description in Russian (we can translate it upon request to mailbox-at-xen.ru)
________________________________________
= Request - Entry #1 by xenru on Sep 3, 2003 3:15 pm


Uploaded:  "Zope-270b1-unicode.patch"
 - http://zope.org/Collectors/Zope/1037/Zope-270b1-unicode.patch/view
Since the 2.6 release, Zope tries to support unicode, but this causes big problems for developers who can't use unicode (there are many reasons). We made a patch for Zope 2.7.0b1, but it also works with 2.7.0b2.

What this patch does:

1. Currently Zope assumes that all non-unicode types (string, text, etc.) use only the latin-1 charset. It raises an error when the server encodes the data sent on the Properties (manage_properties) page (I don't have a traceback right now, but will send one upon request).

2. "manage_properties" works only in the utf-8 charset, which makes it very difficult to set (and read) values in national encodings. "management_page_charset" does not resolve this problem because it has no effect on "manage_properties". The patch sets the proper charset. By default it will use the active locale (-L key).

3. Many other bugfixes (all listed in the description in Russian).
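
The kind of failure described in point 1 can be illustrated with a short sketch. This is a Python 3 illustration only: the Zope versions discussed here ran on Python 2, where the error surfaces differently, and the helper try_encode and the sample value are hypothetical.

```python
def try_encode(value, charset):
    """Encode value with charset, or report why it cannot be done.

    Hypothetical helper used only to show which charsets can
    represent a Cyrillic property value.
    """
    try:
        return value.encode(charset)
    except UnicodeEncodeError as exc:
        return "failed: " + exc.reason


text = "Свойство"  # a Cyrillic property value

as_latin1 = try_encode(text, "latin-1")  # latin-1 has no Cyrillic letters
as_koi8r = try_encode(text, "koi8-r")    # a national encoding works
as_utf8 = try_encode(text, "utf-8")      # utf-8 works as well
```

Assuming latin-1 for every non-unicode value therefore breaks as soon as the value contains characters outside that charset.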

I'm sorry for my poor English. I would like to draw your attention to this patch. If you don't consider it, then working with Zope will be quite complicated for Russian Zope developers/users.
==============================================================


