[Zope] ISO-Splitter again: German Umlaute

"Andreas Jung" <andreas@zope.com>
Wed, 9 Jan 2002 13:54:01 -0500


----- Original Message -----
From: "Joachim Werner" <joe@iuveno-net.de>
To: "Andreas Jung" <andreas@zope.com>
Cc: <zope@zope.org>
Sent: Wednesday, January 09, 2002 13:16
Subject: Re: [Zope] ISO-Splitter again: German Umlaute


> Sorry, I haven't had the time to look into the Splitter code yet. So that's
> why I am asking again:
>
> I don't get the concept of having to specify the locale at Zope startup for
> the catalog to work properly. What happens if I WANT en_US locale settings
> in general, but the catalog should be able to handle French, German, or
> Spanish words? How can I build multi-lingual Zope systems with that concept?

ZopeSplitter plus the locale settings should fit your needs for all Western
European languages, since they are all covered by ISO-8859-1. The ISO-8859-1
splitter should meet your needs without changing your locale.
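
To illustrate the difference, here is a minimal sketch in present-day Python.
It is not the actual ZopeSplitter or ISO-8859-1 splitter code, and the names
split_ascii and split_latin1 are made up for this example. It assumes the
input is ISO-8859-1 encoded bytes and shows why an A-Z-only splitter breaks
words at umlauts, while a splitter that knows the Latin-1 letter range needs
no locale setting at all:

```python
import re

# Hypothetical ASCII-only splitter: every byte outside A-Z/a-z ends a word.
ASCII_WORD = re.compile(rb"[A-Za-z]+")

# Hypothetical Latin-1 splitter: ASCII letters and digits plus the accented
# letters at 0xC0-0xFF, skipping the multiplication sign (0xD7) and the
# division sign (0xF7), which are not letters.
LATIN1_WORD = re.compile(rb"[A-Za-z0-9\xC0-\xD6\xD8-\xF6\xF8-\xFF]+")


def split_ascii(data):
    """Split ISO-8859-1 bytes into lowercase words, ASCII letters only."""
    return [w.decode("iso-8859-1").lower() for w in ASCII_WORD.findall(data)]


def split_latin1(data):
    """Split ISO-8859-1 bytes into lowercase words, independent of locale."""
    return [w.decode("iso-8859-1").lower() for w in LATIN1_WORD.findall(data)]


text = "Müller läuft über die Straße".encode("iso-8859-1")
print(split_ascii(text))   # umlauts act as separators here
print(split_latin1(text))  # whole words are kept
```

With the ASCII-only pattern, "Müller" ends up as the two fragments "m" and
"ller"; with the Latin-1 pattern it stays one word.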

>
> Shouldn't the catalog always split words correctly? I am not talking about
> languages like Japanese that have a different concept of splitting. Those
> need a different splitter code of course. But is there ANY reason why German
> Umlauts or other language-specific special characters are supposed to be
> splitting characters, other than that the programmers of the original
> splitter code might have taken the easy way of making all characters that
> are not A-Z splitting characters?


A splitter is currently bound to a vocabulary. This means you cannot change
the splitter during indexing. For a multilingual environment you should use
Unicode together with the new UnicodeSplitter.
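
As a rough sketch of the Unicode approach, again in present-day Python and
not the real UnicodeSplitter implementation: once the text is a Unicode
string, one splitter can handle German, French and Spanish words without any
locale setting. The name split_unicode is an assumption for this example.

```python
import re
import unicodedata

# \w on a str pattern matches Unicode letters, digits and the underscore.
UNICODE_WORD = re.compile(r"\w+")


def split_unicode(text):
    """Split a Unicode string into lowercase words, whatever the language."""
    # Compose "u" + combining diaeresis into a single "ü" first, because a
    # bare combining mark is not a \w character and would split the word.
    normalized = unicodedata.normalize("NFC", text)
    return [w.lower() for w in UNICODE_WORD.findall(normalized)]


print(split_unicode("Größe, château y español"))
```

This still relies on separators between words, so it does not cover
languages such as Japanese that need a different splitting concept.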

Andreas