[Zope3-dev] i18n, unicode, and string encoding
Guido van Rossum
guido@python.org
Mon, 14 Apr 2003 15:49:42 -0400
> Guido van Rossum wrote:
>
> >>I think the string object should store its encoding in an attribute.
> >>Why? Given a string, I would like to know what its encoding is. How can
> >>I do that now?
> >>
> >>
> >
> >You could subclass the str type, or you can create a class that
> >contains a string and an encoding name (maybe subclassing UserString).
> >
> >But I challenge you and ask, why do you want to know its encoding?
> >You shouldn't be carrying around encoded strings. Instead, you should
> >decode strings into Unicode.
> >
> >
> Hello Guido,
> First of all, I think I should ask myself whether I want to take up
> challenges with you :)
> Anyway, I am going to let loose in the hope that I will learn something new.
> I can think of a byte array coming off a socket containing EBCDIC, for
> instance. I want to store this in a string type and hand it off
> to a translation function that can translate from EBCDIC to ASCII.
> Storing the encoding in the string as an attribute will help the
> translation function. At this point I am thinking: why should I pass this
> as a parameter to the translation function rather than store it as an
> attribute of the string object itself? The translation function knows it
> has to produce US-ASCII, so it looks at the encoding attribute of the
> input string and figures out what to do from there.
Yes, that's the right solution. Treat it as a hot potato: decode the
EBCDIC as soon as you can.
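
A rough sketch of what "decode as soon as you can" might look like, written
for a current Python 3 where bytes and text are separate types. The helper
names (read_text, to_ascii) are hypothetical, and the choice of code page 037
is an assumption: EBCDIC has many variants, so the codec has to match
whatever the peer actually sends.

```python
# Assumption: the peer sends EBCDIC code page 037 (one of several variants).
EBCDIC_CODEC = "cp037"


def read_text(raw: bytes, encoding: str = EBCDIC_CODEC) -> str:
    """Decode wire bytes into a Unicode string right at the boundary."""
    return raw.decode(encoding)


def to_ascii(text: str) -> bytes:
    """Encode only on the way out; non-ASCII characters become '?'."""
    return text.encode("ascii", errors="replace")


# Stand-in for bytes read from a socket.
wire_bytes = "Hello".encode(EBCDIC_CODEC)

text = read_text(wire_bytes)   # everything past this point handles Unicode
ascii_bytes = to_ascii(text)   # produced only when US-ASCII output is needed
```

The encoding name is needed in exactly one place, where the bytes arrive, so
nothing downstream has to carry it around as an attribute.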
--Guido van Rossum (home page: http://www.python.org/~guido/)