[Zope-dev] ZCatalog : UTF-8 Chinese

Sin Hang Kin kentsin@poboxes.com
Tue, 26 Sep 2000 19:01:44 +0800


After reading the source, I realize this is not a bug but a consequence of
ZCatalog's design.

I believe ZCatalog parses the input string for search expressions, but it
treats the string as a byte string without decoding it from UTF-8. I think
the parsing step breaks up the search expression, so the search fails.
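
To illustrate the suspected failure mode, here is a minimal sketch in modern
Python. It is not ZCatalog's actual splitter: split_bytes and split_text are
hypothetical helpers, and the ASCII-only word rule is an assumption made for
the demonstration. It only shows how a byte-oriented tokenizer can drop a
UTF-8 encoded Chinese term that a decoded-text tokenizer keeps.

```python
import re

def split_bytes(query):
    # Hypothetical byte-oriented splitter: only ASCII letters and digits
    # count as word characters, so multi-byte UTF-8 sequences are discarded.
    return re.findall(rb"[A-Za-z0-9]+", query)

def split_text(query):
    # Decode from UTF-8 first, then split on Unicode word characters.
    text = query.decode("utf-8")
    return re.findall(r"\w+", text)

# The same query as it would arrive from a UTF-8 encoded form.
query = "中文 and zope".encode("utf-8")

byte_terms = split_bytes(query)   # the Chinese term is lost
text_terms = split_text(query)    # the Chinese term survives
print(byte_terms)
print(text_terms)
```

If the real splitter behaves anything like split_bytes, a query made only of
Chinese characters would reach the index with no usable terms.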

What I am considering is to bypass parse and parse2 in the query code of
UnTextIndex.py. The other option is to convert the input to Unicode (or at
least to a parse-safe form), but adding Unicode handling here looks like a
lot of trouble.
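
One way to picture the "parse-safe" option is sketched below in modern
Python. Everything in it is an assumption for illustration: the PREFIX
marker and the helpers to_parse_safe and from_parse_safe are hypothetical,
and nothing like them exists in ZCatalog. The idea is to rewrite any
non-ASCII term as an ASCII-only token before it reaches the parser, and to
apply the same rewrite when the documents are indexed.

```python
import binascii

# Hypothetical marker for encoded terms. A plain ASCII term that happens to
# start with this prefix would be misread; a real scheme would need to
# handle that collision.
PREFIX = "u8x"

def to_parse_safe(term):
    # Leave pure-ASCII terms alone; hex-encode the UTF-8 bytes otherwise.
    raw = term.encode("utf-8")
    if all(b < 0x80 for b in raw):
        return term
    return PREFIX + binascii.hexlify(raw).decode("ascii")

def from_parse_safe(token):
    # Reverse the encoding for display purposes.
    if token.startswith(PREFIX):
        return binascii.unhexlify(token[len(PREFIX):]).decode("utf-8")
    return token

safe = to_parse_safe("中文")
restored = from_parse_safe(safe)
print(safe, restored)
```

The cost of this approach is that the index and the query side must agree on
the encoding, which is a different trade-off from bypassing the parser.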

Can you comment on bypassing parse and parse2? Will this work? I am not
sure about my choice.

Rgs,

Kent Sin
----- Original Message -----
From: "Zope mailing lists" <bitz@caller.bitdance.com>
To: "Sin Hang Kin" <kentsin@poboxes.com>
Cc: <zope-dev@zope.org>
Sent: Monday, September 25, 2000 11:59 PM
Subject: Re: [Zope-dev] ZCatalog : UTF-8 Chinese


> On Mon, 25 Sep 2000, Sin Hang Kin wrote:
> > I generated the search interface and tested it. However, searches on the
> > index terms return nothing. I searched for most entries found in the
> > vocabulary, but almost none work, and those that do also return many
> > unwanted results.
> >
> > What is causing this failure? What I can do to go further?
>
> It is possible you are having an issue with the way the splitter
> is used on the search term input side.  Several of us have found
> bugs in that area.  We've fixed the ones we've found, but there
> may be more <wry grin>.
>
> Run Zope in debug mode ("the debugger is your friend" howto), and
> watch what UnTextIndex does with the search terms.  (Hint: instead
> of trying to set breakpoints per the howto, just uncomment the
> appropriate calls to the debugger in the UnTextIndex or Lexicon
> source file...)
>
> --RDM
>
>