<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

</head>

<body text="#000000" bgcolor="#ffffff">

On 05/17/2010 10:03 PM, Christopher N. Deckard wrote:<br>

<blockquote type="cite">Hello,<br>

Due to ICANN's decision to allow for non-Latin characters in domain<br>

names, I thought I'd give our Zope installation a test to see how it<br>

would handle it.  We've had problems with people copying and pasting from<br>

MS Word, and its use of strange characters.  I'm sure at some point the<br>

faculty here at Purdue will have a need to link to web sites using<br>

non-Latin characters.<br>

  <br>

This is my example HTML:<br>

  <br>

&lt;html&gt;<br>

  &lt;head&gt;<br>

    &lt;title&gt;URL Test&lt;/title&gt;<br>

  &lt;/head&gt;<br>

  &lt;body&gt;<br>

    &lt;a<br>

href="http://موقع.وزارة-الأتصالات.مصر"&gt;http://موقع.وزارة-الأتصالات.مصر&lt;/a&gt;.<br>

  &lt;/body&gt;<br>

&lt;/html&gt;<br>

  <br>

  <br>

Not only is it a different character set, but it is a right-to-left<br>

character set.  This code works fine while editing a Page Template.<br>

However, when viewing it, all of those characters in the href are<br>

converted to question marks.  If the same HTML is pasted into a DTML<br>

Method or a File object, Zope will convert the characters to ASCII<br>

characters.  When viewed, it is displayed correctly.<br>

  <br>

Any reason that Page Templates may fail to render this properly?<br>

  <br>

</blockquote>
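For what it's worth, the question marks look like what Python produces when<br>
non-ASCII text is encoded to ASCII with the 'replace' error handler. That this<br>
is what happens in the Page Template code path is only my assumption; the<br>
sketch below (Python 3, with a hypothetical helper name and a made-up sample<br>
href) just reproduces the symptom outside Zope:<br>
<pre>
```python
# Sketch only: reproduces the "question marks" symptom outside Zope.
# to_ascii_lossy is a hypothetical helper, not part of Zope or zope.pagetemplate.

def to_ascii_lossy(text):
    """Encode text as ASCII, substituting '?' for every unencodable character."""
    return text.encode("ascii", "replace").decode("ascii")

# Sample href with a four-letter Arabic host label (illustrative value only).
href = "http://\u0645\u0648\u0642\u0639.example/"
print(to_ascii_lossy(href))  # http://????.example/
```
</pre>
<br>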

This is a bug in Page Templates. I worked around it by changing<br>

<br>

zope.pagetemplate-3.5.0-py2.6.egg/zope/pagetemplate/pagetemplate.py<br>

<div id=":44" class="ii gt"><br>

line 116     return output.getvalue()<br>

<br>

to<br>

<br>

       # --- unicode/utf-8 hotfix ---<br>

       for idx in range(len(output.buflist)):<br>

           try:<br>

               # decode UTF-8 byte strings so getvalue() joins only unicode<br>

               output.buflist[idx] = unicode(output.buflist[idx], 'utf-8')<br>

           except UnicodeDecodeError:<br>

               # invalid UTF-8: substitute U+FFFD instead of failing<br>

               output.buflist[idx] = output.buflist[idx].decode('utf-8', 'replace')<br>

           except TypeError:<br>

               # already unicode; leave it unchanged<br>

               pass<br>

<br>

       return output.getvalue()<br>

       # --- unicode/utf-8 hotfix ---</div>
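<br>
Stripped of the Zope internals, the idea of the hotfix is: walk the buffered<br>
chunks, decode every byte string as UTF-8, and fall back to replacement<br>
characters when the bytes are not valid UTF-8, so that the final join only<br>
ever sees text. A minimal standalone sketch in Python 3 follows;<br>
normalize_chunks is a hypothetical name, and the actual patch above is<br>
Python 2 and modifies the StringIO buflist in place:<br>
<pre>
```python
# Sketch only: the decode-with-fallback idea from the hotfix, outside Zope.
# normalize_chunks is a hypothetical helper; it is not part of zope.pagetemplate.

def normalize_chunks(chunks):
    """Return the chunks as text, decoding any byte strings as UTF-8."""
    result = []
    for chunk in chunks:
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode("utf-8")
            except UnicodeDecodeError:
                # Invalid UTF-8: decode what is valid, mark the rest with U+FFFD.
                chunk = chunk.decode("utf-8", "replace")
        result.append(chunk)
    return result

# Mixed input: text, valid UTF-8 bytes, and bytes that are not valid UTF-8.
parts = ["href=", "\u0645\u0635\u0631".encode("utf-8"), b"\xff"]
print("".join(normalize_chunks(parts)) == "href=\u0645\u0635\u0631\ufffd")  # True
```
</pre>
Text chunks pass through untouched, which corresponds to the TypeError<br>
branch of the patch.<br>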

<br>

<br>

_______________________________________________<br>

Zope maillist  -  <a class="moz-txt-link-abbreviated" href="mailto:Zope@zope.org">Zope@zope.org</a><br>

<a class="moz-txt-link-freetext" href="https://mail.zope.org/mailman/listinfo/zope">https://mail.zope.org/mailman/listinfo/zope</a><br>

**   No cross posts or HTML encoding!  **<br>

(Related lists - <br>

 <a class="moz-txt-link-freetext" href="https://mail.zope.org/mailman/listinfo/zope-announce">https://mail.zope.org/mailman/listinfo/zope-announce</a><br>

 <a class="moz-txt-link-freetext" href="https://mail.zope.org/mailman/listinfo/zope-dev">https://mail.zope.org/mailman/listinfo/zope-dev</a> )<br>

<br>

<br>

</body>

</html>