[Checkins] SVN: zope.mimetype/trunk/src/zope/mimetype/typegetter. Make zope.mimetype.typegetter.charsetGetter lowercase the charset name.

Marius Gedminas marius at pov.lt
Fri Sep 8 15:12:09 EDT 2006


Log message for revision 70078:
  Make zope.mimetype.typegetter.charsetGetter lowercase the charset name.
  
  I noticed that all charset names registered by zope.mimetype are converted to
  lowercase (while codec names are left alone).  Content types, on the other
  hand, may spell out the charset in uppercase.  If you try to look up a charset
  or a charset codec with the name returned by the ICharsetGetter utility, you
  could get unexpected lookup failures.
  
  

Changed:
  U   zope.mimetype/trunk/src/zope/mimetype/typegetter.py
  U   zope.mimetype/trunk/src/zope/mimetype/typegetter.txt

-=-
Modified: zope.mimetype/trunk/src/zope/mimetype/typegetter.py
===================================================================
--- zope.mimetype/trunk/src/zope/mimetype/typegetter.py	2006-09-08 19:01:51 UTC (rev 70077)
+++ zope.mimetype/trunk/src/zope/mimetype/typegetter.py	2006-09-08 19:12:08 UTC (rev 70078)
@@ -115,7 +115,7 @@
             pass
         else:
             if params.get("charset"):
-                return params["charset"]
+                return params["charset"].lower()
     if data:
         if data.startswith(codecs.BOM_UTF16_LE):
             return 'utf-16le'

Modified: zope.mimetype/trunk/src/zope/mimetype/typegetter.txt
===================================================================
--- zope.mimetype/trunk/src/zope/mimetype/typegetter.txt	2006-09-08 19:01:51 UTC (rev 70077)
+++ zope.mimetype/trunk/src/zope/mimetype/typegetter.txt	2006-09-08 19:12:08 UTC (rev 70078)
@@ -226,6 +226,12 @@
   >>> typegetter.charsetGetter(content_type='text/plain; charset=mambo-42')
   'mambo-42'
 
+Note that the charset name is lowercased, because all the default ICharset
+and ICharsetCodec utilities are registered for lowercase names.
+
+  >>> typegetter.charsetGetter(content_type='text/plain; charset=UTF-8')
+  'utf-8'
+
 If it isn't, `charsetGetter` can try to guess by looking at actual data
 
   >>> typegetter.charsetGetter(content_type='text/plain', data='just text')
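
The lookup failure described in the log message can be illustrated with a
minimal, self-contained sketch.  It does not use zope.mimetype itself: the
REGISTERED dict and both helper functions are hypothetical stand-ins for the
utility registry and for charsetGetter, and the content type is parsed with
the standard library's email.message.Message rather than whatever parser
typegetter.py actually uses.

```python
from email.message import Message

# Hypothetical stand-in for the utility registry: names are stored lowercase,
# mirroring how the log message says zope.mimetype registers charset names.
REGISTERED = {"utf-8": "utf-8 charset utility",
              "iso-8859-1": "iso-8859-1 charset utility"}


def raw_charset(content_type):
    """Return the charset parameter exactly as spelled, or None (old behaviour)."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_param("charset")


def lowercased_charset(content_type):
    """Return the charset parameter lowercased, or None (behaviour after r70078)."""
    charset = raw_charset(content_type)
    return charset.lower() if charset else None


header = "text/plain; charset=UTF-8"

# The name as spelled in the header misses the lowercase registry key ...
found_raw = REGISTERED.get(raw_charset(header))
# ... while the lowercased name finds it.
found_lower = REGISTERED.get(lowercased_charset(header))
```

With the uppercase spelling, found_raw ends up as None, while found_lower
resolves to the registered entry.  This is the mismatch the one-line change
to typegetter.py is meant to avoid.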


