[Zope3-dev] i18n, unicode, and string encoding

Martijn Faassen faassen@vet.uu.nl
Tue, 15 Apr 2003 12:57:15 +0200


sathya wrote:
> I think the string object should store its encoding in an attribute.
> Why? Given a string I would like to know what its encoding is. How can
> I do that now?

The problem with this is that in many cases the string object would not
have a clue what encoding it is in, and would have no way to find out.

Take for instance some line of text you read from a file. The file doesn't
specify what encoding it is in. You have to figure this out from some other
context, and it's hard to do this automatically.
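
To illustrate why the bytes alone don't settle it, here is a small sketch
(written in present-day Python 3 terms; the byte string is made up for the
example). The same bytes decode without error under two different
encodings and give two different texts:

```python
# Bytes as they might arrive from a file that declares no encoding.
# (Illustrative data, not taken from any real file.)
raw = b"caf\xc3\xa9"

# Both decodes succeed, so nothing in the bytes says which one is right.
as_utf8 = raw.decode("utf-8")      # 4 characters, ends in e-acute
as_latin1 = raw.decode("latin-1")  # 5 characters, ends in two other symbols

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))
```

Only context outside the string can tell you which reading was intended.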

You could of course require that people supply the encoding manually and
then pass that information around with the string. But since you have to
specify the encoding manually anyway, you might as well go the whole way
and convert to unicode, after which you can forget about encodings
altogether until you produce output (and if you use utf-8 for output you
don't have much to worry about even then).
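
A minimal sketch of that approach, again in present-day Python 3 terms.
The helper names read_text and write_text and the sample bytes are
hypothetical, made up for this example; the point is only that the
encoding is named once on the way in and once on the way out:

```python
def read_text(raw, encoding):
    """Decode bytes at the input boundary (hypothetical helper).

    The caller has to supply the encoding from context; the bytes
    themselves cannot tell us.
    """
    return raw.decode(encoding)


def write_text(text, encoding="utf-8"):
    """Encode text at the output boundary (hypothetical helper)."""
    return text.encode(encoding)


# Illustrative input: latin-1 bytes, which the caller knows from context.
incoming = b"na\xefve"

text = read_text(incoming, "latin-1")  # unicode from here on
shouted = text.upper()                 # no encoding needed for text operations
outgoing = write_text(shouted)         # utf-8 on the way out

print(repr(shouted), repr(outgoing))
```

Everything between the two helper calls works on unicode text and never
needs to know what encoding the input arrived in.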

Regards,

Martijn