[Zope-CMF] TextIndexNG on Word, PDF files

Damon Butler damon@hddesign.com
24 Jan 2003 12:15:33 -0600


On FreeBSD 4.7 I'm running Zope 2.5.1 that I've patched with all the
Unicode diffs from http://www.zope.org/Members/efge/i18n/Unicode-2.5.1.
I've also installed Localizer 1.0.0, hoping to ensure that my install of
the TextIndexNG Product (v. 1.08) works properly.

And, for the most part, it sure does seem to, with one key exception: I
can't get it to index the content of Word files or PDF files (the two
binary formats I'm most concerned with). Why is it that, whenever I
upload a Word file or PDF file (via Adding a File), both a standard Zope
CMF site and a Plone site refuse to let me change the MIME type to
application/msword? In either interface, I can certainly select
application/msword, but it *never* sticks. The view always shows the
files as application/octet-stream.

My hypothesis, you see, is that TextIndexNG, not having a converter
associated with application/octet-stream (I've got all the necessary
converters registered for Word, PDF, etc.), never attempts to index the
text contents of my Word and PDF files.
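
To make the hypothesis concrete, here is a minimal Python sketch of the
behaviour I suspect. It assumes converters are looked up purely by MIME
type. The names CONVERTERS, convert_word, convert_pdf, and
text_for_indexing are hypothetical and are not TextIndexNG's actual API.

```python
# Hypothetical converters: stand-ins for whatever extracts text from a file.
def convert_word(data):
    return "text extracted from Word"


def convert_pdf(data):
    return "text extracted from PDF"


# Hypothetical registry mapping a MIME type to its converter.
# Note there is no entry for application/octet-stream.
CONVERTERS = {
    "application/msword": convert_word,
    "application/pdf": convert_pdf,
}


def text_for_indexing(mime_type, data):
    """Return extracted text, or None when no converter is registered."""
    converter = CONVERTERS.get(mime_type)
    if converter is None:
        return None  # no converter, so nothing would be indexed
    return converter(data)
```

If the lookup really works this way, a Word file stored as
application/octet-stream would yield None and would be skipped.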

Interestingly, I tried uploading a LaTeX file, and it too always showed
up as application/octet-stream (regardless of what I selected for the
MIME type) ... yet TextIndexNG could and did index its contents.

Any explanation of what's going on? Any way to change this? What did I
do wrong?

--Damon Butler