[ZCM] [ZC] 2280/ 9 Comment "getPreferredCharsets() returns iso-8859-1 and not utf-8 when HTTP_ACCEPT_CHARSET not present in request"

Collector: Zope Bugs, Features, and Patches ... zope-coders-admin at zope.org
Sat May 26 21:43:02 EDT 2007


Issue #2280 Update (Comment) "getPreferredCharsets() returns iso-8859-1 and not utf-8 when HTTP_ACCEPT_CHARSET not present in request"
 Status Pending, Zope/bug+solution medium
To followup, visit:
  http://www.zope.org/Collectors/Zope/2280

==============================================================
= Comment - Entry #9 by ajung on May 26, 2007 9:43 pm

"""

I'm not sure if I'm submitting this in the right place or if this is just a local problem not affecting anyone else, but I have the following problem:

While using Internet Explorer 7 (IE7), the method getPreferredCharsets() in the class HTTPCharsets (http.py) returns 'iso-8859-1' and not 'utf-8' as expected. As far as I know, IE7 does not set the HTTP_ACCEPT_CHARSET in the request. Reading the source I would expect that no HTTP_ACCEPT_CHARSET should result in a return value of 'utf-8'.
"""

getPreferredCharsets() returns an empty list if HTTP_ACCEPT_CHARSET is not present in the request. There is a dedicated test for this case in test_httpcharsets.py. I can't see how it could return 'iso-8859-1' or even 'utf-8'.
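
For readers following the logic, here is a simplified sketch of how a header_present flag could change the result. It is modelled on the excerpt quoted in entry #6 and on RFC 2616 section 14.2, not copied from zope.publisher. The function name preferred_charsets and its two arguments are assumptions made for illustration, and the real method's final fallback may differ.

```python
def preferred_charsets(header_value, header_present):
    """Simplified charset negotiation (hypothetical helper, not Zope's code)."""
    charsets = []
    saw_star = saw_iso88591 = False
    for item in header_value.split(','):
        item = item.strip().lower()
        if not item:
            continue
        quality = 1.0
        if ';' in item:
            item, param = item.split(';', 1)
            item = item.strip()
            param = param.strip()
            if param.startswith('q='):
                try:
                    quality = float(param[2:])
                except ValueError:
                    continue  # skip entries with a malformed quality value
        if quality == 0.0:
            continue  # q=0 means "not acceptable"
        if item == '*':
            saw_star = True
        if item == 'iso-8859-1':
            saw_iso88591 = True
        charsets.append((quality, item))
    # RFC 2616, 14.2: if a header was sent without "*", ISO-8859-1 is
    # implicitly acceptable with quality 1 unless mentioned explicitly.
    if header_present and not saw_star and not saw_iso88591:
        charsets.append((1.0, 'iso-8859-1'))
    # Highest quality first; the sort is stable, so ties keep header order.
    charsets.sort(key=lambda pair: -pair[0])
    return [name for _quality, name in charsets]


print(preferred_charsets('', header_present=False))  # []
print(preferred_charsets('', header_present=True))   # ['iso-8859-1']
```

Under these assumptions, an empty header value combined with a flag that is wrongly True is enough to produce 'iso-8859-1'.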
________________________________________
= Resubmit - Entry #8 by tseaver on May 25, 2007 9:28 am

 Status: Rejected => Pending


________________________________________
= Comment - Entry #7 by tseaver on May 25, 2007 9:28 am

In Zope2's HTTPRequest, any key starting with 'HTTP_' will be
returned as having a default empty string value if the key
is not actually present.  The following might be a better bridge::

  header_present = bool(request.get('HTTP_ACCEPT_CHARSET'))

We'll need to add tests for this, as well.
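
A rough sketch of what such tests might look like. StubRequest is a hypothetical stand-in that only imitates the behaviour described above (missing 'HTTP_' keys come back as an empty string). The helper header_present and the test names are likewise assumptions for illustration; real tests would exercise HTTPCharsets against an actual Zope 2 request.

```python
import unittest


class StubRequest(object):
    """Hypothetical stand-in: missing 'HTTP_' keys yield ''."""

    def __init__(self, environ):
        self.environ = dict(environ)

    def get(self, key, default=None):
        if key in self.environ:
            return self.environ[key]
        if key.startswith('HTTP_'):
            return ''  # default for headers the client never sent
        return default


def header_present(request):
    # The check proposed above: an empty or missing header counts as absent.
    return bool(request.get('HTTP_ACCEPT_CHARSET'))


class HeaderPresentTests(unittest.TestCase):

    def test_missing_header(self):
        self.assertFalse(header_present(StubRequest({})))

    def test_header_sent(self):
        request = StubRequest({'HTTP_ACCEPT_CHARSET': 'utf-8'})
        self.assertTrue(header_present(request))

    def test_plain_mapping(self):
        # A plain dict without the 'HTTP_' default gives the same answer.
        self.assertFalse(header_present({}))
```

The tests can be run with the standard unittest runner, for example python -m unittest on the module that contains them.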
________________________________________
= Comment - Entry #6 by Pigletto on May 25, 2007 3:38 am

I can confirm that this bug exists in Zope 2.9.6 too.
The problem appears when there is no HTTP_ACCEPT_CHARSET in the request, e.g. when using IE6. The underlying cause is the statement below, which evaluates to true for every key that starts with 'HTTP_':

'HTTP_ACCEPT_CHARSET' in self.request

pdb session at zope/publisher/http.py:

-> header_present = 'HTTP_ACCEPT_CHARSET' in self.request
(Pdb) l
982         def getPreferredCharsets(self):
983             '''See interface IUserPreferredCharsets'''
984             charsets = []
985             sawstar = sawiso88591 = 0
986             import pdb;pdb.set_trace()
987  ->         header_present = 'HTTP_ACCEPT_CHARSET' in self.request
988             for charset in self.request.get('HTTP_ACCEPT_CHARSET', '').split(','):
989                 charset = charset.strip().lower()
990                 if charset:
991                     if ';' in charset:
992                         charset, quality = charset.split(';')

(Pdb) p self.request['HTTP_ACCEPT_CHARSET']
''

(Pdb) 'HTTP_ACCEPT_CHARSET' in self.request
True

(Pdb) 'HTTP_ACCEPT_CHARSET' in self.request.keys()
False

(Pdb) p 'HTTP_ANYTHING' in self.request
True

(Pdb) p self.request
<HTTPRequest, URL=http://localhost:8084/snap/shot/add_sth.html>

(Pdb) p self.request.__class__
<class ZPublisher.HTTPRequest.HTTPRequest at 0x2aaaade9ae90>

(Pdb) p self.request.keys()
['-C', 'ACTUAL_URL', 'AUTHENTICATED_USER', 'AUTHENTICATION_PATH', 'BASE1', 'BASE2', 'BASE3', 'BASE4', 'BASE5', 'BASE6', 'GATEWAY_INTERFACE', 'HTTP_ACCEPT', 'HTTP_ACCEPT_ENCODING', 'HTTP_ACCEPT_LANGUAGE', 'HTTP_COOKIE', 'HTTP_HOST', 'HTTP_USER_AGENT', 'PARENTS', 'PATH_INFO', 'PATH_TRANSLATED', 'PUBLISHED', 'REMOTE_ADDR', 'REQUEST_METHOD', 'RESPONSE', 'SCRIPT_NAME', 'SERVER_NAME', 'SERVER_PORT', 'SERVER_PROTOCOL', 'SERVER_SOFTWARE', 'SERVER_URL', 'SESSION', 'TraversalRequestNameStack', 'URL', 'URL1', 'URL2', 'URL3', 'URL4', 'URL5', '_ZopeId', '__ac', 'areYourCookiesEnabled', 'disable_border']
(Pdb)
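
The session above can be imitated outside Zope with a small stand-in class. QuirkyRequest is hypothetical. It assumes only what the session shows and what entry #7 describes: item lookup falls back to an empty string for any key starting with 'HTTP_', and the "in" operator succeeds whenever the lookup does.

```python
class QuirkyRequest(object):
    """Hypothetical model of the lookup behaviour seen in the pdb session."""

    def __init__(self, environ):
        self.environ = dict(environ)

    def __getitem__(self, key):
        if key in self.environ:
            return self.environ[key]
        if key.startswith('HTTP_'):
            return ''  # default for headers the client never sent
        raise KeyError(key)

    def __contains__(self, key):
        # Membership is defined as "the lookup does not fail".
        try:
            self[key]
        except KeyError:
            return False
        return True

    def keys(self):
        return sorted(self.environ)


request = QuirkyRequest({'HTTP_HOST': 'localhost:8084'})
print('HTTP_ACCEPT_CHARSET' in request)         # True
print('HTTP_ANYTHING' in request)               # True
print('HTTP_ACCEPT_CHARSET' in request.keys())  # False
print(repr(request['HTTP_ACCEPT_CHARSET']))     # ''
```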
________________________________________
= Comment - Entry #5 by jost on Feb 19, 2007 5:29 am

> = Comment - Entry #4 by jost on Feb 19, 2007 4:52 am
> 
> > * sys.setdefaultencoding() is evil evil evil. Don't use that.
> If I don't change the default encoding I can't save any page templates
>   containing non-ascii characters. 

Correction: I can save page templates containing non-ascii characters, but not like this:

<tal:block tal:content="python:'æøå'"/>

________________________________________
= Comment - Entry #4 by jost on Feb 19, 2007 4:52 am

> = Comment - Entry #3 by philikon on Feb 16, 2007 11:52 am
> This might be a problem local to Zope 2, or it may not be a problem at
>   all at this point. Hard to say. Point is:
> 
> * str(self.request) is not really a solution. In fact, it's pretty weird and wrong.

How about changing the line

    header_present = 'HTTP_ACCEPT_CHARSET' in self.request

to

    header_present = 'HTTP_ACCEPT_CHARSET' in self.request.keys()

This seems to work correctly.

> * sys.setdefaultencoding() is evil evil evil. Don't use that.
If I don't change the default encoding I can't save any page templates containing non-ascii characters. 

> We need more info to reproduce this issue. Ideally, an HTTP transcript
>   (e.g. using tcpwatch) would be best. Until then the issue remains
>   rejected.
What is it you want to look at? The request? I am pretty sure it does not contain any HTTP_ACCEPT_CHARSET header. At least it is not present when I print the request from the method in question.

Regards, Jost
________________________________________
= Comment - Entry #3 by philikon on Feb 16, 2007 11:52 am

This might be a problem local to Zope 2, or it may not be a problem at all at this point. Hard to say. Point is:

* str(self.request) is not really a solution. In fact, it's pretty weird and wrong.

* type(self.request) returning <type 'instance'> is normal.

* sys.setdefaultencoding() is evil evil evil. Don't use that.

We need more info to reproduce this issue. Ideally, an HTTP transcript (e.g. using tcpwatch) would be best. Until then the issue remains rejected.
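
To illustrate the first point: a substring test against the string form of a request can match text that has nothing to do with headers. The sketch below uses a hypothetical TextRequest class and assumes that the string form lists submitted form values alongside the headers. It is not the real HTTPRequest.

```python
class TextRequest(object):
    """Hypothetical request whose str() lists form values and headers."""

    def __init__(self, form, environ):
        self.form = dict(form)
        self.environ = dict(environ)

    def __str__(self):
        lines = ['%s=%s' % item for item in sorted(self.form.items())]
        lines += ['%s=%s' % item for item in sorted(self.environ.items())]
        return '\n'.join(lines)


# No Accept-Charset header was sent, but a form field mentions the name.
request = TextRequest(
    form={'comment': 'why is HTTP_ACCEPT_CHARSET missing?'},
    environ={'HTTP_HOST': 'localhost'},
)
print('HTTP_ACCEPT_CHARSET' in str(request))      # True (false positive)
print('HTTP_ACCEPT_CHARSET' in request.environ)   # False
```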

________________________________________
= Reject - Entry #2 by ajung on Feb 16, 2007 11:38 am

 Status: Pending => Rejected

This belongs in the Zope 3 bug tracker since it addresses an issue in the Zope 3 core.
________________________________________
= Request - Entry #1 by jost on Feb 16, 2007 11:30 am

I'm not sure if I'm submitting this in the right place or if this is just a local problem not affecting anyone else, but I have the following problem:

While using Internet Explorer 7 (IE7), the method getPreferredCharsets() in the class HTTPCharsets (http.py) returns 'iso-8859-1' and not 'utf-8' as expected. As far as I know, IE7 does not set the HTTP_ACCEPT_CHARSET in the request. Reading the source I would expect that no HTTP_ACCEPT_CHARSET should result in a return value of 'utf-8'.

At least on my system, line 996 of /lib/python/zope/publisher/http.py

   header_present = 'HTTP_ACCEPT_CHARSET' in self.request

sets header_present = True, even if self.request does not contain 'HTTP_ACCEPT_CHARSET'!

Suspecting a problem with this line, though not understanding why, I changed it to:

   header_present = 'HTTP_ACCEPT_CHARSET' in str(self.request)

This resolves my problem.


###################
Other charset related settings I have changed:

Have set sys.setdefaultencoding('utf-8') in /usr/local/lib/python2.4/site.py.
Have set management_page_charset='utf-8' as property of / in ZMI.
Have set default-zpublisher-encoding utf-8 in etc/zope.conf.

Adding the following for debugging in /lib/python/zope/publisher/http.py (around line 1000):

    print type(self.request)

returns:

    <type 'instance'>




==============================================================


