[Zope-dev] Re: Showstopper UnicodeDecodeError on Zope import???

Nick Bower nicholas.bower at jrc.it
Fri Nov 5 09:33:28 EST 2004


I have just returned from vacation and am looking at this issue in more
detail.  Dieter's explanation seems logical, especially considering the
following traceback when I try to move a folder containing these
specific CMF object types.

Looking at the objects in the ZMI, they appear to have plain-text ids
but unicode titles (judging by eye only).  The titles were probably
copied and pasted from somewhere into a CMF product's editing screen,
which I presume is the problem here.

So, following Dieter's explanation, is it possible to identify which
objects have non-unicode, non-ASCII ids or titles using some Python?
I'm assuming that I could then edit each offending object's id/title
in the ZMI to "re-unicode" it.
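
One possible way to do that search, as a rough sketch rather than
tested Zope code: walk the folder tree and flag any id or title that is
a byte string the 'ascii' codec cannot decode.  It relies on the
standard objectValues() and getId() methods and on the
isPrincipiaFolderish flag, and it assumes the title is stored as a
plain "title" attribute.  The helper names is_non_ascii_bytes and
find_offenders are made up for illustration.  It is written for a
current Python, where byte strings are "bytes"; on the Python 2 that
Zope ran on in 2004 the check would be isinstance(value, str) instead.

```python
def is_non_ascii_bytes(value):
    """True if value is a byte string that the 'ascii' codec cannot decode."""
    if not isinstance(value, bytes):
        return False  # already unicode text, or not a string at all
    try:
        value.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False


def find_offenders(container, path=""):
    """Recursively collect (path, attribute, value) for suspicious ids/titles."""
    offenders = []
    for obj in container.objectValues():
        obj_id = obj.getId()
        # Decode byte ids with latin-1 purely so the path can be printed.
        shown = obj_id.decode("latin-1") if isinstance(obj_id, bytes) else obj_id
        obj_path = path + "/" + shown
        for name, value in (("id", obj_id), ("title", getattr(obj, "title", None))):
            if is_non_ascii_bytes(value):
                offenders.append((obj_path, name, value))
        # Test the folderish flag rather than hasattr(obj, "objectValues"):
        # with acquisition, a leaf object can appear to have that method too.
        if getattr(obj, "isPrincipiaFolderish", False):
            offenders.extend(find_offenders(obj, obj_path))
    return offenders
```

Called on a folder from a debug prompt or an external method, this
would return a list that could be worked through by hand in the ZMI.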

Changing the default character encoding of my production server for this
one application just isn't an option unfortunately.

Thanks, nick

Traceback (innermost last):

     * Module ZPublisher.Publish, line 101, in publish
     * Module ZPublisher.mapply, line 88, in mapply
     * Module ZPublisher.Publish, line 39, in call_object
     * Module OFS.CopySupport, line 231, in manage_renameObjects
     * Module OFS.CopySupport, line 260, in manage_renameObject
     * Module OFS.ObjectManager, line 276, in _setObject
     * Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
     * Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
     * Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
     * Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
     * Module Products.CMFCore.CMFCatalogAware, line 148, in manage_afterAdd
     * Module Products.CMFCore.CMFCatalogAware, line 177, in __recurse
     * Module Products.CMFCore.CMFCatalogAware, line 147, in manage_afterAdd
     * Module Products.CMFCore.CMFCatalogAware, line 42, in indexObject
     * Module Products.CMFPlone.CatalogTool, line 56, in indexObject
     * Module Products.CMFCore.CatalogTool, line 235, in catalog_object
     * Module Products.ZCatalog.ZCatalog, line 528, in catalog_object
     * Module Products.ZCatalog.Catalog, line 381, in catalogObject
     * Module Products.ZCTextIndex.ZCTextIndex, line 163, in index_object
     * Module Products.ZCTextIndex.ZCTextIndex, line 176, in _index_object
     * Module Products.ZCTextIndex.OkapiIndex, line 58, in index_doc
     * Module Products.ZCTextIndex.BaseIndex, line 108, in index_doc
     * Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
     * Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate

UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
ordinal not in range(128)
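
The same message can be reproduced outside Zope, which may help when
reading it.  A minimal sketch, assuming a made-up byte string with 0xef
at offset 5 (the real value that triggered the error is not known).  In
the Python 2 of this thread the decode happens implicitly when a byte
string meets a unicode string; here it is spelled out explicitly, and
describe_decode_failure is a hypothetical helper name.

```python
# Hypothetical sample: five ASCII bytes followed by the non-ASCII byte 0xef.
sample = b"title\xef\xbb\xbf"


def describe_decode_failure(data, encoding="ascii"):
    """Return (message, offending_byte, offset), or None if data decodes cleanly."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte the codec rejected.
        return str(exc), data[exc.start:exc.start + 1], exc.start
    return None


message, bad_byte, offset = describe_decode_failure(sample)
print(message)
print(bad_byte, offset)
```

The "position 5" in the message is a zero-based byte offset into the
string being decoded, so it points at the first non-ASCII byte of the
offending value.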



Dieter Maurer wrote:
> Nick Bower wrote at 2004-10-8 16:41 +0200:
> 
>>...
>>Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
>>Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
>>
>>UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: 
>>ordinal not in range(128).
> 
> 
> In your lexicon operation a unicode string and a non-unicode string
> are put together (this can happen internally during BTree traversal).
> 
> Whenever such a thing happens, Python tries to convert the
> non-unicode string to unicode -- using its default encoding.
> This fails because the non-unicode string contains bytes that cannot
> be decoded with this encoding.
> 
> In a later message you reported that setting Python's default
> encoding to "utf-8" gave you an unexpected end exception.
> This means that your non unicode string is not utf-8 encoded.
> 
> 
> You should use as default encoding the encoding that is
> used for your non unicode strings.
> 
> If you do not know it, use an encoding that can map any 8 bit byte.
> Windows has a few of them (called "cpXXX" (for CodePage);
> I do not know the correct XXX).
> 
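
To make the encoding question concrete, here is a small sketch that
tries a few candidate encodings against a byte string and reports which
ones decode it without error.  The sample bytes and the helper name
encodings_that_decode are made up; the real strings would have to come
from the affected objects.  Of the stock Python codecs, "latin-1" is one
that maps every possible byte value, whereas Python's "cp1252" codec
leaves a handful of bytes undefined.

```python
def encodings_that_decode(data, candidates=("ascii", "utf-8", "latin-1")):
    """Return the candidate encodings that decode data without raising."""
    usable = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes
        usable.append(name)
    return usable


# Hypothetical title: "caf" plus byte 0xe9, which is e-acute in latin-1
# but only the start of an incomplete sequence in utf-8.
sample = b"caf\xe9"
print(encodings_that_decode(sample))

# Every one of the 256 byte values decodes under latin-1.
every_byte = bytes(range(256))
print(encodings_that_decode(every_byte, ("latin-1", "cp1252")))
```

Decoding without an error only shows that an encoding is possible, not
that it is the right one; the resulting text still needs to be checked
by eye.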



