[Zope-dev] Modifying Splitter.c to search on '+' & '#', and single letter words

Harry Wilkinson harryw@nipltd.com
Wed, 25 Jul 2001 17:50:39 +0100


I have two problems with getting ZCatalog to search for what I need:

1) Need to be able to search for words like 'J++' and 'C#'
       - this is relatively simple to do by editing Splitter.c a little
and recompiling
2) Need to be able to search for single-letter words like 'C'
       - Splitter.c is easy to modify to accommodate this, but doing so
causes errors in GlobbingLexicon.py, even though the vocabulary is standard

So far I have solved problem (1) by changing the contents of Splitter.c,
but that's a bit messy.  Currently I don't know of an alternative
though.

I have modified Splitter.c so it indexes the extra characters, and
reduced the minimum word length to 1, which works fine when indexing,
and I can see all the symbol-inclusive words and single-letter words in
the vocabulary.  Unfortunately, any search on a single-letter word gives
an IndexError, "String out of range".
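
To make the intended splitting rule concrete, here is a rough Python
sketch of the behaviour I am after. This is only an illustration and not
the actual Splitter.c code (which is C and works differently); the names
WORD_RE, MIN_WORD_LENGTH and split_words are made up for this example.

```python
import re

# Hypothetical illustration only; this is not how Splitter.c is implemented.
# Word characters: letters, digits, underscore, plus the extra '+' and '#'.
WORD_RE = re.compile(r"[A-Za-z0-9_+#]+")

# Assumed name for the minimum word length; lowered to 1 so 'C' is kept.
MIN_WORD_LENGTH = 1


def split_words(text, min_length=MIN_WORD_LENGTH):
    """Return the words in text, treating '+' and '#' as word characters."""
    return [w for w in WORD_RE.findall(text) if len(w) >= min_length]


if __name__ == "__main__":
    print(split_words("C, J++ and C# jobs"))
```

With a minimum length of 2 instead, the same function would drop 'C' but
still keep 'J++' and 'C#', which is the distinction between problems (1)
and (2).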

I am stuck on problem (2) and don't know how to avoid the errors arising
in GlobbingLexicon.py without editing in some kind of hack to get around
it.  I don't even know why GlobbingLexicon is getting involved in the
search process since I am not trying to use wildcards and haven't
elected to use a globbing vocabulary (AFAIK).

Can anyone explain why GlobbingLexicon is involved?  Better yet, has
anyone faced this problem (2) before, or come up with a more elegant
solution to (1)?

Thanks for your help :)

Harry