[Zope-CMF] Using CMF/Plone for Document Management?

matt.bartolome at uniontrib.com
Thu Oct 30 14:22:01 EST 2003


It should be fairly easy. If you can manage the data effectively on the file
system, you should be able to import the files you have and index them with
the OCR data. You can merge the actual file with its OCR data in CMF/Plone.
That is what you want, right?

I worked on an AP photo content type that sounds similar to what you're
doing. I grab a thousand or so image files, along with their metadata for
searchability, from an XML feed (which makes it easier to associate the data
in my case). To import the data into Plone/CMF, I first created an
Archetypes-based product that has an image field for the image data and other
string/text fields for searchability. I import it all at once with an
out-of-process Python script, in which I can set and index any metadata
fields I want.
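
As a rough illustration of the file-system half of such an import script,
here is a minimal sketch in present-day Python (not the Python that shipped
with Zope 2.6). Everything in it is an assumption made for illustration: the
directory layout (one folder per department, with each PDF's OCR output saved
beside it as a .txt file), the function names collect_records and
import_records, and the record keys. It is not the script described above.

```python
import os


def collect_records(root):
    """Pair each scanned PDF under root with its OCR text and metadata.

    Assumed (hypothetical) layout: root/<department>/<name>.pdf, with the
    OCR output saved beside it as root/<department>/<name>.txt.
    """
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for filename in sorted(filenames):
            base, ext = os.path.splitext(filename)
            if ext.lower() != ".pdf":
                continue
            ocr_path = os.path.join(dirpath, base + ".txt")
            ocr_text = ""
            if os.path.exists(ocr_path):
                with open(ocr_path, encoding="utf-8", errors="replace") as fh:
                    ocr_text = fh.read()
            records.append({
                "id": base.lower().replace(" ", "-"),
                "path": os.path.join(dirpath, filename),
                # The containing folder name doubles as the department.
                "department": os.path.basename(dirpath),
                "ocr_text": ocr_text,
            })
    return records


def import_records(records, create):
    """Hand every record to create(record) and return the number imported.

    create is supplied by the caller. Inside a CMF/Plone site it would
    typically wrap folder.invokeFactory(...) for the custom content type
    and then set and reindex the fields; that part is not shown here.
    """
    count = 0
    for record in records:
        create(record)
        count += 1
    return count
```

Keeping the site-specific step behind the create callback means the pairing
logic can be tried against a plain directory before any Zope code is involved.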

-----Original Message-----
From: Kelley, Sean [mailto:SKelley at ci.santa-rosa.ca.us]
Sent: Thursday, October 30, 2003 10:12 AM
To: 'scott.meilicke at intp.com'
Cc: 'zope-cmf at zope.org'
Subject: [Zope-CMF] Using CMF/Plone for Document Management?


I am in the process of selecting scanners and OCR software and will be saving
the documents to the file system.  The question is how to easily get
everything into CMF/Plone.  That's why I mentioned the use of TextIndexNG
and CMFExternalFile.  I am thinking that if I can use something like
CMFExternalFile, I can manage volumes of files and add them, but I need to be
able to search them as well as automatically set metadata on import, such as
department.

Date: Wed, 29 Oct 2003 12:22:14 -0800
From: "Meilicke, Scott" <scott.meilicke at intp.com>
Subject: RE: [Zope-CMF] Using CMF/Plone for Document Management?
To: "'zope-cmf at zope.org'" <zope-cmf at zope.org>

From what I have (just) read, pdftotext does not do OCR.  PDFs created from
scanners therefore will not have anything to index, unless you put an OCR
program in the mix.
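
One way to spot such image-only PDFs before indexing is to run pdftotext and
check whether it returned any visible text. The sketch below is only an
illustration in present-day Python: the function names extract_text and
needs_ocr and the 20-character threshold are assumptions, and it presumes
pdftotext is installed and on the PATH.

```python
import subprocess


def extract_text(pdf_path):
    """Return the text layer of a PDF using the pdftotext command-line tool.

    Passing "-" as the output file makes pdftotext write to stdout.
    Assumes pdftotext is installed and on the PATH.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def needs_ocr(text, min_chars=20):
    """Guess whether a PDF is image-only from its extracted text.

    min_chars is an arbitrary threshold (an assumption; tune as needed):
    fewer non-whitespace characters than this counts as "no text layer".
    """
    visible = "".join(text.split())  # drop spaces, newlines, form feeds
    return len(visible) < min_chars
```

A document for which needs_ocr(extract_text(path)) is true would be a
candidate for the OCR step rather than for direct indexing.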

Scott

I want to set up PDF scanners and use CMF/Plone for document management.  My
thinking is that I already have PDF docs searchable with TextIndexNG for
people who manually add documents, but how do I automate the process of
adding several documents at once?  I have tested CMFExternalFile (which is
not allowing full-text searches on PDF docs for some reason).  What if I
want to allow users to upload batches of documents with preset fields like
"department" and other custom searchable preset fields?  Is CMFExternalFile
the answer?  Is there a better way?  Does CMFExternalFile just need to be
customized?

I am currently running...
CMFExternalFile 0.5
Zope 2.6.1
pdftotext (latest Windows version)
TextIndexNG2 (2.0.1)
Plone 1.0.3

Sean
