[Zope3-dev] i18n, unicode, and the underline

Barry Warsaw barry@python.org
11 Apr 2003 11:44:30 -0400


On Fri, 2003-04-11 at 11:29, Guido van Rossum wrote:

> Before this misinformation spreads, Barry underestimates unicode()!
> 
> Without an encoding argument, unicode(u) accepts a unicode string and
> returns it unchanged, and unicode(s) also accepts an 8-bit string and
> attempts to convert it using the ASCII encoding -- exactly what _()
> should do.

Why, I didn't know that!  BTW, passing any encoding argument along with
a unicode string causes a TypeError.  I guess that makes sense.

> Of course, concatenating u"" works exactly the same.  And according to
> timeit, it's more than twice as fast!
> 
> [guido@odiug linux]$ ./python ../Lib/timeit.py -s 's=""' 'unicode(s)'
> 1000000 loops, best of 3: 1.43 usec per loop
> [guido@odiug linux]$ ./python ../Lib/timeit.py -s 's=""' 's+u""'
> 1000000 loops, best of 3: 0.668 usec per loop

LOL!

Ok, given this, I'm fine with allowing human-readable messages in Zope's
Python code to be marked with _('str') when 'str' is us-ascii.  If it's
anything else, you must use _(u'ustr').  If you screw up you'll get an
exception at the earliest possible point -- a good thing.

I'd like Zope developers to get in the habit of wrapping messages in _()
all the time, so that we rarely see unmarked Unicode strings.  By
infusing this in the culture now we can make it much easier to translate
future Zope versions, products, etc.

I'll work on adding this to Zope.
-Barry