[Zope] Unicode/string question

Toby Dickenson tdickenson@geminidataloggers.com
Fri, 13 Sep 2002 10:26:58 +0100


On Thursday 12 Sep 2002 9:50 pm, Chris Muldrow wrote:
> I'm having unicode troubles, and I'm not sure if I'm running into a "Zope
> doesn't do that" problem or perhaps I'm just an idiot.
>
> Basically, I've an external method called xmlheadparse, which returns a
> list of lists of headlines and URLs, given an xml file on our server.
> [[Headline here, url here],[headline2 here, url2 here]]
> I process the list into an html page of links with the following Zope
> Python script, where urlfeed is the name of the xml file I want to display:
>
>
> returnedhtml=""
> storypooge=context.xmlheadparse('/var/www/ap/'+urlfeed)
> for x in range(len(storypooge)):
>     returnedhtml=returnedhtml+'&#149 <a href="/News/apmethods/apstory?urlfeed='
>     url=context.nntpnamestripper(storypooge[x][1])
>     headline=storypooge[x][0]
>     returnedhtml=returnedhtml+url+'">'+headline+'</a><br />\n'
> return returnedhtml
>
>
> This works fine with English text, but I also have Spanish headlines in
> some of the files. When run through this script, I get the following error:
> Error Type: UnicodeError
> Error Value: ASCII encoding error: ordinal not in range(128)
>
> The weird thing is, I can get just the unicode headline to display, but not
> concatenated into the rest of the stuff. I can't seem to encode all of the
> pieces into the same format. What am I doing wrong?

You are mixing:
1. A unicode string
2. A plain 8-bit string with characters outside the ascii range.
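
As a rough illustration of why that mix fails: when a unicode string and an
8-bit string are concatenated, Python 2 first tries to decode the 8-bit
string with its default encoding, which is normally ASCII. The sketch below
performs that decode explicitly. The sample headline bytes are made up and
assumed to be UTF-8, and the b'' literal assumes a reasonably recent Python.

```python
# Hypothetical 8-bit headline: "España" encoded as UTF-8 (an assumption).
headline_8bit = b'Espa\xc3\xb1a'

try:
    # The same decode that happens implicitly when an 8-bit string
    # is added to a unicode string under an ASCII default encoding.
    headline_8bit.decode('ascii')
    failed = False
except UnicodeError:
    # Bytes above 127 are not valid ASCII, so the decode is rejected.
    failed = True

# Decoding with the encoding the data is really in succeeds.
headline_unicode = headline_8bit.decode('utf-8')
```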

It's not clear from the code fragment which strings are the unicode ones, and
which are not. I suggest you work entirely in unicode. You need to convert
those 8-bit strings into unicode strings by applying a character encoding,
using code like...

myunicodestring = unicode(my8bitstring,'utf-8')


....replacing 'utf-8' with whatever character encoding you are using.
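
Applied to the loop in the question, that might look something like the
sketch below. The names to_unicode and build_links are hypothetical, the
call to nntpnamestripper is left out, and the data is assumed to be UTF-8.
It uses .decode() rather than unicode() so that it also runs on newer
Pythons, and it does no HTML escaping.

```python
def to_unicode(value, encoding='utf-8'):
    """Return value as a unicode string (hypothetical helper).

    8-bit strings are decoded with the given encoding (assumed UTF-8);
    anything else is returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def build_links(stories, encoding='utf-8'):
    """Build the HTML link list from [[headline, url], ...] pairs."""
    parts = []
    for headline, url in stories:
        # Convert every piece to unicode before joining, so unicode and
        # 8-bit strings are never mixed in one concatenation.
        parts.append(u'&#149 <a href="/News/apmethods/apstory?urlfeed=')
        parts.append(to_unicode(url, encoding))
        parts.append(u'">')
        parts.append(to_unicode(headline, encoding))
        parts.append(u'</a><br />\n')
    return u''.join(parts)


# Made-up sample data: one Spanish headline stored as UTF-8 bytes.
stories = [[b'Espa\xc3\xb1a gana', 'feed1.xml']]
html = build_links(stories)
```

The result is a unicode string throughout, so any encoding back to 8-bit
happens once, at the point where the page is output.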