[Zope-dev] Zope 2.6.0 ZMI Problem for CJK(Collector 623) patch.

Heiichiro NAKAMURA nheiich@quantumfusion.com
Thu, 12 Dec 2002 15:51:48 -0800


Thank you for the detailed comments.


On Wed, 11 Dec 2002 11:15:37 +0900
Kazuya FUKAMACHI <kf@atransia.co.jp> wrote:

> > -----------------------------------------------------------------------
> > 2) Russian patch:
> >      http://itconnection.ru/pipermail/zopyrus/2002-November/001388.html
> 
> +0.5
> 
>  i) I like such an approach.
> 
>  -  <select name="<dtml-var id>:utf8:text">
>  +  <select name="<dtml-var id>:<dtml-var management_page_charset>:text">
> 
>  i') using newly implemented function management_page_charset_default(),
>     it can set the value of default management_page_charset.
>     This avoids hard coding of default value.
> 
>  ii) but this patch may have a few troubles in Japanese environment.
> 
>     This code returns 'eucJP' in many Japanese environment.
>       charset = locale.getlocale()[1]
>     And 
>       codecs.lookup(charset) ==> codecs.lookup('eucJP')
>     will fail, because there are no entry for 'eucJP', but 'euc-jp'
>     and 'ujis'. I think it is possible to add 'eucJP' entry to
>     JapaneseCodecs as an alias for 'euc-jp'. So, it's not a big problem.
>     (I don't know why JapaneseCodecs doesn't have 'eucJP' alias.)
> 
>     If the problem above has been solved, 
>     the value of management_page_charset maybe set to 'eucJP',
>     and it leads to another problem.
>     If management_page_charset returns 'eucJP', then header should be
>         RESPONSE.setHeader('Content-Type','text/html; charset=eucJP')
>     It is not common way as a Content-Type header.
>     We prefer
>        RESPONSE.setHeader('Content-Type','text/html; charset=EUC-JP')
> 
>     And also, it does not work in Windows environment.
>     This code returns (None, None).
>       locale.getlocale()[1]
> 


I guess the problem is the difference in character-encoding naming
conventions: even among POSIX-compliant OSes, the names of encodings
are vendor dependent (the situation is the same among RDBMS vendors).
If I were to use the Russian patch, I would put one layer of abstraction
around the handling of encoding names, by providing facilities like:


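import locale
import os


def getDefaultPythonCharEncodingName():
    "get the Python codec name for the server's default encoding"
    if os.name == 'posix':
        # Translate the vendor-specific locale name; unknown names give latin1.
        return charEncodingMap.get(locale.getlocale()[1], 'latin1')
    else:  # For MS Windows
        # locale.getlocale() is not usable here, so read an environment variable.
        return os.environ.get('Z_CHAR_ENCODING', 'latin1')


def mapToIANA(encodingName):
    "get IANA encoding name for HTTP Header"
    return IANACharEncodingMap.get(encodingName, encodingName)


# Vendor-specific locale encoding names -> Python codec names.
charEncodingMap = {
    'PCK': 'Shift_JIS',
    # ...
}


# Python codec names -> IANA names used in HTTP headers.
IANACharEncodingMap = {
    'SJIS': 'Shift_JIS',
    # ...
}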
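
As a rough illustration of how the two lookups might fit together, here
is a small self-contained sketch. The table contents and the names
LOCALE_TO_CODEC, CODEC_TO_IANA, resolve_codec and header_charset are
assumptions made up for this example; they are not part of the Russian
patch or of Zope. It also validates the name with codecs.lookup() and
falls back to a default when the locale gives nothing usable, as
reported for Windows.

```python
import codecs

# Hypothetical alias table: platform locale names -> Python codec names.
# The entries are examples only; a real table would be vendor specific.
LOCALE_TO_CODEC = {
    'PCK': 'shift_jis',   # Solaris name for Shift_JIS
    'eucJP': 'euc_jp',
}

# Hypothetical table: canonical Python codec names -> names preferred
# in an HTTP Content-Type header.
CODEC_TO_IANA = {
    'euc_jp': 'EUC-JP',
    'shift_jis': 'Shift_JIS',
    'iso8859-1': 'ISO-8859-1',
}


def resolve_codec(locale_name, default='latin1'):
    """Return the canonical Python codec name for a locale charset name."""
    if not locale_name:          # e.g. locale.getlocale() gave (None, None)
        locale_name = default
    candidate = LOCALE_TO_CODEC.get(locale_name, locale_name)
    try:
        return codecs.lookup(candidate).name
    except LookupError:          # unknown to Python: use the default codec
        return codecs.lookup(default).name


def header_charset(locale_name):
    """Return the charset name to put in a Content-Type header."""
    codec = resolve_codec(locale_name)
    return CODEC_TO_IANA.get(codec, codec)


for name in ('PCK', 'eucJP', None, 'no-such-charset'):
    print(name, '->', 'text/html; charset=' + header_charset(name))
```

Keeping the two tables separate means the codec name used for decoding
and the name sent to the browser can differ, which is the 'eucJP'
versus 'EUC-JP' case Fukamachi-san pointed out.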


Sooner or later, I think this kind of mechanism will be required
for mature Unicode support, because Unicode brings a lot of
problems of this kind. Unless such issues are addressed systematically,
I don't think the Unicode support should be called mature.

Still, I don't like this patch's approach very much, because the
charset is a per-server-instance setting, which is not useful
for building an M17N (multilingualized) web site.





>  iii) I guess modification to class PropertyManager seems 
>     to fix http://collector.zope.org/Zope/697
> 
>    Basically, it's interesting approach, but still needed to be brush up.
> 
> 
> > 5) Toby's proposal
> 
> I hope +1.
> I'm not satisfied with (1)-(4). 
> So, I would like to wait for Toby's implementation.


That is probably the preferable choice.

My concern is that Toby may be too busy to do it.
Since none of the choices (1-5) provides a perfect solution,
all of them are just temporary patches for an urgent fix of
a severe issue (Collector 623).
So I think we shouldn't spend too much time on this.



(Further comments are welcome.)



Regards,
---
Heiichiro NAKAMURA <nheiich@quantumfusion.com>