[Zope3-dev] i18n, unicode, and the underline

Shane Hathaway shane@zope.com
Mon, 14 Apr 2003 13:54:25 -0400


Guido van Rossum wrote:
>>From: Shane Hathaway <shane@zope.com>
>>
>>Hmm, Python does try very hard to hide the difference between ASCII 
>>strings and Unicode, so you have a good point.  What's missing is the 
>>ability to clearly distinguish between ASCII strings and binary strings. 
>>  When a function that expects only ASCII or Unicode gets a binary 
>>string, it might blow up, but not every time, and the source of the 
>>error is often hard to find.  This has caused pain for Zope 3 developers.
> 
> What were the sources of binary strings in the cases where it caused
> pain?  Were they string literals or read from a file or socket?

In the case I know about, they were in ZODB.  Jim uploaded an image, but 
Zope wouldn't serve it back.  It took a while to figure out that the 
HTTP server was trying to mix a Unicode HTTP header with binary image data.
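
A minimal sketch of that failure mode, not the actual Zope code. The helper `implicit_concat` is hypothetical: it emulates the implicit ASCII decoding that Python 2 applies when a Unicode string is joined with a byte string, so the behaviour can be reproduced on any Python version. The header text and sample data are also made up for illustration.

```python
def implicit_concat(header, body):
    """Emulate Python 2's implicit coercion when unicode meets a byte string.

    Hypothetical helper: the byte string is decoded with the 'ascii'
    codec before concatenation, as Python 2 does behind the scenes.
    """
    return header + body.decode("ascii")


header = u"Content-Type: image/png\r\n\r\n"

# A byte string that happens to be pure ASCII mixes silently.
print(repr(implicit_concat(header, b"plain text")))

# Binary data (here the PNG file signature) fails, and only when it is mixed.
try:
    implicit_concat(header, b"\x89PNG\r\n\x1a\n")
except UnicodeDecodeError as exc:
    print("failed:", exc)
```

The error surfaces where the two strings meet, not where the binary data entered the system, which is why the source is hard to find.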

Admittedly, since then I've figured out that we shouldn't try to send 
Unicode HTTP headers over the wire.  There is no way to specify the 
encoding for headers, AFAIK, so I think headers are limited to 7-bit ASCII.
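
One way to enforce that, sketched under the assumption that headers arrive as Unicode (name, value) pairs and the body as a byte string. The function `build_response` is a hypothetical helper, not part of Zope or any HTTP library, and it omits the status line for brevity.

```python
def build_response(headers, body):
    """Hypothetical helper: serialize headers as 7-bit ASCII, then append the body.

    Encoding explicitly means a non-ASCII header fails here, at a known
    place, instead of later when it is mixed with binary data.
    """
    lines = [u"%s: %s" % (name, value) for name, value in headers]
    head = u"\r\n".join(lines).encode("ascii")  # UnicodeEncodeError if not ASCII
    return head + b"\r\n\r\n" + body


png = b"\x89PNG\r\n\x1a\n"  # PNG file signature, standing in for image data
response = build_response(
    [(u"Content-Type", u"image/png"), (u"Content-Length", u"%d" % len(png))],
    png,
)
```

After the explicit encode, everything sent over the wire is a byte string, so no implicit conversion takes place.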

So, everyone, are there any other examples of binary strings mixing 
unexpectedly with Unicode?  If not, surely the "u" prefix is unnecessary.

Shane