[Zope-Checkins] CVS: Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer - CHANGELOG:1.1.2.1 LICENSE.TXT:1.1.2.1 MANIFEST.in:1.1.2.1 Makefile.pre.in:1.1.2.1 README:1.1.2.1 Setup:1.1.2.1 setup.py:1.1.2.1

Andreas Jung andreas@digicool.com
Wed, 13 Feb 2002 11:26:15 -0500


Update of /cvs-repository/Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer
In directory cvs.zope.org:/tmp/cvs-serv30556/PyStemmer

Added Files:
      Tag: ajung-textindexng-branch
	CHANGELOG LICENSE.TXT MANIFEST.in Makefile.pre.in README Setup 
	setup.py 
Log Message:
added PyStemmer


=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/CHANGELOG ===
0.01 - 12-17-2001

  - first public release


0.1 - 01-10-2002

  - destructor frees all memory now
  - changed license from BSD to MIT
  - added internal caching
  - improved cleanup
  - updated README




=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/LICENSE.TXT ===
The MIT License

Copyright (c) 2001 Andreas Jung 

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.



=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/MANIFEST.in ===
recursive-include * *.txt *.sbl *.html *.c *.h
include README
include CHANGELOG


=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/Makefile.pre.in ===
# Universal Unix Makefile for Python extensions
# =============================================

# Short Instructions
# ------------------

# 1. Build and install Python (1.5 or newer).
# 2. "make -f Makefile.pre.in boot"
# 3. "make"
# You should now have a shared library.

# Long Instructions
# -----------------

# Build *and install* the basic Python 1.5 distribution.  See the
# Python README for instructions.  (This version of Makefile.pre.in
# only works with Python 1.5, alpha 3 or newer.)

# Create a file Setup.in for your extension.  This file follows the
# format of the Modules/Setup.dist file; see the instructions there.
# For a simple module called "spam" on file "spammodule.c", it can
# contain a single line:
#   spam spammodule.c
# You can build as many modules as you want in the same directory --
# just have a separate line for each of them in the Setup.in file.

# If you want to build your extension as a shared library, insert a
# line containing just the string
#   *shared*
# at the top of your Setup.in file.

# Note that the build process copies Setup.in to Setup, and then works
# with Setup.  It doesn't overwrite Setup when Setup.in is changed, so
# while you're in the process of debugging your Setup.in file, you may
# want to edit Setup instead, and copy it back to Setup.in later.
# (All this is done so you can distribute your extension easily and
# someone else can select the modules they actually want to build by
# commenting out lines in the Setup file, without editing the
# original.  Editing Setup is also used to specify nonstandard
# locations for include or library files.)

# Copy this file (Misc/Makefile.pre.in) to the directory containing
# your extension.

# Run "make -f Makefile.pre.in boot".  This creates Makefile
# (producing Makefile.pre and sedscript as intermediate files) and
# config.c, incorporating the values for sys.prefix, sys.exec_prefix
# and sys.version from the installed Python binary.  For this to work,
# the python binary must be on your path.  If this fails, try
#   make -f Makefile.pre.in Makefile VERSION=1.5 installdir=<prefix>
# where <prefix> is the prefix used to install Python for installdir
# (and possibly similar for exec_installdir=<exec_prefix>).

# Note: "make boot" implies "make clobber" -- it assumes that when you
# bootstrap you may have changed platforms so it removes all previous
# output files.

# If you are building your extension as a shared library (your
# Setup.in file starts with *shared*), run "make" or "make sharedmods"
# to build the shared library files.  If you are building a statically
# linked Python binary (the only solution if your platform doesn't
# support shared libraries, and sometimes handy if you want to
# distribute or install the resulting Python binary), run "make
# python".

# Note: Each time you edit Makefile.pre.in or Setup, you must run
# "make Makefile" before running "make".

# Hint: if you want to use VPATH, you can start in an empty
# subdirectory and say (e.g.):
#   make -f ../Makefile.pre.in boot srcdir=.. VPATH=..


# === Bootstrap variables (edited through "make boot") ===

# The prefix used by "make inclinstall libainstall" of core python
installdir=	/usr/local

# The exec_prefix used by the same
exec_installdir=$(installdir)

# Source directory and VPATH in case you want to use VPATH.
# (You will have to edit these two lines yourself -- there is no
# automatic support as the Makefile is not generated by
# config.status.)
srcdir=		.
VPATH=		.

# === Variables that you may want to customize (rarely) ===

# (Static) build target
TARGET=		python

# Installed python binary (used only by boot target)
PYTHON=		python

# Add more -I and -D options here
CFLAGS=		$(OPT) -I$(INCLUDEPY) -I$(EXECINCLUDEPY) $(DEFS)

# These two variables can be set in Setup to merge extensions.
# See example[23].
BASELIB=	
BASESETUP=	

# === Variables set by makesetup ===

MODOBJS=	_MODOBJS_
MODLIBS=	_MODLIBS_

# === Definitions added by makesetup ===

# === Variables from configure (through sedscript) ===

VERSION=	@VERSION@
CC=		@CC@
LINKCC=		@LINKCC@
SGI_ABI=	@SGI_ABI@
OPT=		@OPT@
LDFLAGS=	@LDFLAGS@
LDLAST=		@LDLAST@
DEFS=		@DEFS@
LIBS=		@LIBS@
LIBM=		@LIBM@
LIBC=		@LIBC@
RANLIB=		@RANLIB@
MACHDEP=	@MACHDEP@
SO=		@SO@
LDSHARED=	@LDSHARED@
CCSHARED=	@CCSHARED@
LINKFORSHARED=	@LINKFORSHARED@
CXX=		@CXX@

# Install prefix for architecture-independent files
prefix=		/usr/local

# Install prefix for architecture-dependent files
exec_prefix=	$(prefix)

# Uncomment the following two lines for AIX
#LINKCC= 	$(LIBPL)/makexp_aix $(LIBPL)/python.exp "" $(LIBRARY); $(PURIFY) $(CC)
#LDSHARED=	$(LIBPL)/ld_so_aix $(CC) -bI:$(LIBPL)/python.exp

# === Fixed definitions ===

# Shell used by make (some versions default to the login shell, which is bad)
SHELL=		/bin/sh

# Expanded directories
BINDIR=		$(exec_installdir)/bin
LIBDIR=		$(exec_prefix)/lib
MANDIR=		$(installdir)/man
INCLUDEDIR=	$(installdir)/include
SCRIPTDIR=	$(prefix)/lib

# Detailed destination directories
BINLIBDEST=	$(LIBDIR)/python$(VERSION)
LIBDEST=	$(SCRIPTDIR)/python$(VERSION)
INCLUDEPY=	$(INCLUDEDIR)/python$(VERSION)
EXECINCLUDEPY=	$(exec_installdir)/include/python$(VERSION)
LIBP=		$(exec_installdir)/lib/python$(VERSION)
DESTSHARED=	$(BINLIBDEST)/site-packages

LIBPL=		$(LIBP)/config

PYTHONLIBS=	$(LIBPL)/libpython$(VERSION).a

MAKESETUP=	$(LIBPL)/makesetup
MAKEFILE=	$(LIBPL)/Makefile
CONFIGC=	$(LIBPL)/config.c
CONFIGCIN=	$(LIBPL)/config.c.in
SETUP=		$(LIBPL)/Setup.config $(LIBPL)/Setup.local $(LIBPL)/Setup

SYSLIBS=	$(LIBM) $(LIBC)

ADDOBJS=	$(LIBPL)/python.o config.o

# Portable install script (configure doesn't always guess right)
INSTALL=	$(LIBPL)/install-sh -c
# Shared libraries must be installed with executable mode on some systems;
# rather than figuring out exactly which, we always give them executable mode.
# Also, making them read-only seems to be a good idea...
INSTALL_SHARED=	${INSTALL} -m 555

# === Fixed rules ===

# Default target.  This builds shared libraries only
default:	sharedmods

# Build everything
all:		static sharedmods

# Build shared libraries from our extension modules
sharedmods:	$(SHAREDMODS)

# Build a static Python binary containing our extension modules
static:		$(TARGET)
$(TARGET):	$(ADDOBJS) lib.a $(PYTHONLIBS) Makefile $(BASELIB)
		$(LINKCC) $(LDFLAGS) $(LINKFORSHARED) \
		 $(ADDOBJS) lib.a $(PYTHONLIBS) \
		 $(LINKPATH) $(BASELIB) $(MODLIBS) $(LIBS) $(SYSLIBS) \
		 -o $(TARGET) $(LDLAST)

install:	sharedmods
		if test ! -d $(DESTSHARED) ; then \
			mkdir $(DESTSHARED) ; else true ; fi
		-for i in X $(SHAREDMODS); do \
			if test $$i != X; \
			then $(INSTALL_SHARED) $$i $(DESTSHARED)/$$i; \
			fi; \
		done

# Build the library containing our extension modules
lib.a:		$(MODOBJS)
		-rm -f lib.a
		ar cr lib.a $(MODOBJS)
		-$(RANLIB) lib.a 

# This runs makesetup *twice* to use the BASESETUP definition from Setup
config.c Makefile:	Makefile.pre Setup $(BASESETUP) $(MAKESETUP)
		$(MAKESETUP) \
		 -m Makefile.pre -c $(CONFIGCIN) Setup -n $(BASESETUP) $(SETUP)
		$(MAKE) -f Makefile do-it-again

# Internal target to run makesetup for the second time
do-it-again:
		$(MAKESETUP) \
		 -m Makefile.pre -c $(CONFIGCIN) Setup -n $(BASESETUP) $(SETUP)

# Make config.o from the config.c created by makesetup
config.o:	config.c
		$(CC) $(CFLAGS) -c config.c

# Setup is copied from Setup.in *only* if it doesn't yet exist
Setup:
		cp $(srcdir)/Setup.in Setup

# Make the intermediate Makefile.pre from Makefile.pre.in
Makefile.pre: Makefile.pre.in sedscript
		sed -f sedscript $(srcdir)/Makefile.pre.in >Makefile.pre

# Shortcuts to make the sed arguments on one line
P=prefix
E=exec_prefix
H=Generated automatically from Makefile.pre.in by sedscript.
L=LINKFORSHARED

# Make the sed script used to create Makefile.pre from Makefile.pre.in
sedscript:	$(MAKEFILE)
	sed -n \
	 -e '1s/.*/1i\\/p' \
	 -e '2s%.*%# $H%p' \
	 -e '/^VERSION=/s/^VERSION=[ 	]*\(.*\)/s%@VERSION[@]%\1%/p' \
	 -e '/^CC=/s/^CC=[ 	]*\(.*\)/s%@CC[@]%\1%/p' \
	 -e '/^CXX=/s/^CXX=[ 	]*\(.*\)/s%@CXX[@]%\1%/p' \
	 -e '/^LINKCC=/s/^LINKCC=[ 	]*\(.*\)/s%@LINKCC[@]%\1%/p' \
	 -e '/^OPT=/s/^OPT=[ 	]*\(.*\)/s%@OPT[@]%\1%/p' \
	 -e '/^LDFLAGS=/s/^LDFLAGS=[ 	]*\(.*\)/s%@LDFLAGS[@]%\1%/p' \
	 -e '/^LDLAST=/s/^LDLAST=[ 	]*\(.*\)/s%@LDLAST[@]%\1%/p' \
	 -e '/^DEFS=/s/^DEFS=[ 	]*\(.*\)/s%@DEFS[@]%\1%/p' \
	 -e '/^LIBS=/s/^LIBS=[ 	]*\(.*\)/s%@LIBS[@]%\1%/p' \
	 -e '/^LIBM=/s/^LIBM=[ 	]*\(.*\)/s%@LIBM[@]%\1%/p' \
	 -e '/^LIBC=/s/^LIBC=[ 	]*\(.*\)/s%@LIBC[@]%\1%/p' \
	 -e '/^RANLIB=/s/^RANLIB=[ 	]*\(.*\)/s%@RANLIB[@]%\1%/p' \
	 -e '/^MACHDEP=/s/^MACHDEP=[ 	]*\(.*\)/s%@MACHDEP[@]%\1%/p' \
	 -e '/^SO=/s/^SO=[ 	]*\(.*\)/s%@SO[@]%\1%/p' \
	 -e '/^LDSHARED=/s/^LDSHARED=[ 	]*\(.*\)/s%@LDSHARED[@]%\1%/p' \
	 -e '/^CCSHARED=/s/^CCSHARED=[ 	]*\(.*\)/s%@CCSHARED[@]%\1%/p' \
	 -e '/^SGI_ABI=/s/^SGI_ABI=[ 	]*\(.*\)/s%@SGI_ABI[@]%\1%/p' \
	 -e '/^$L=/s/^$L=[ 	]*\(.*\)/s%@$L[@]%\1%/p' \
	 -e '/^$P=/s/^$P=\(.*\)/s%^$P=.*%$P=\1%/p' \
	 -e '/^$E=/s/^$E=\(.*\)/s%^$E=.*%$E=\1%/p' \
	 $(MAKEFILE) >sedscript
	echo "/^installdir=/s%=.*%=	$(installdir)%" >>sedscript
	echo "/^exec_installdir=/s%=.*%=$(exec_installdir)%" >>sedscript
	echo "/^srcdir=/s%=.*%=		$(srcdir)%" >>sedscript
	echo "/^VPATH=/s%=.*%=		$(VPATH)%" >>sedscript
	echo "/^LINKPATH=/s%=.*%=	$(LINKPATH)%" >>sedscript
	echo "/^BASELIB=/s%=.*%=	$(BASELIB)%" >>sedscript
	echo "/^BASESETUP=/s%=.*%=	$(BASESETUP)%" >>sedscript

# Bootstrap target
boot:	clobber
	VERSION=`$(PYTHON) -c "import sys; print sys.version[:3]"`; \
	installdir=`$(PYTHON) -c "import sys; print sys.prefix"`; \
	exec_installdir=`$(PYTHON) -c "import sys; print sys.exec_prefix"`; \
	$(MAKE) -f $(srcdir)/Makefile.pre.in VPATH=$(VPATH) srcdir=$(srcdir) \
		VERSION=$$VERSION \
		installdir=$$installdir \
		exec_installdir=$$exec_installdir \
		Makefile

# Handy target to remove intermediate files and backups
clean:
		-rm -f *.o *~

# Handy target to remove everything that is easily regenerated
clobber:	clean
		-rm -f *.a tags TAGS config.c Makefile.pre $(TARGET) sedscript
		-rm -f *.so *.sl so_locations


# Handy target to remove everything you don't want to distribute
distclean:	clobber
		-rm -f Makefile Setup


=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/README ===
PyStemmer 
---------

What is PyStemmer?

  PyStemmer provides a uniform interface to the Snowball stemmers
  (snowball.sourceforge.net).  A stemming algorithm (or stemmer) is a process
  for removing the more common morphological and inflexional endings from
  words.  Its main use is as part of a term normalisation process that is
  usually done when setting up Information Retrieval systems.  A stemmer
  reduces a given word to its linguistic base form.  Stemmers are language
  dependent and are often used in text indexing environments.  Stemmers can be
  used to make searches more complete.  E.g. searching for 'cars' will also
  find all documents that contain only 'car', because both words share the
  same linguistic base form 'car'.
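
  The 'cars' / 'car' example can be pictured with a deliberately naive
  sketch.  It is NOT the Snowball algorithm and does not use the Stemmer
  module; the names toy_stem and matches are hypothetical and exist only
  for this illustration.  It shows how two surface forms end up matching
  once both are reduced to the same base form.

```python
def toy_stem(word):
    """Strip a plural 's' from a word (toy rule, not a real stemmer)."""
    word = word.lower()
    # Leave short words and words ending in 'ss' (e.g. 'glass') untouched.
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def matches(query, document_words):
    """Return the document words whose toy stem equals the query's stem."""
    target = toy_stem(query)
    return [w for w in document_words if toy_stem(w) == target]


# A search for 'cars' also hits documents that only contain 'car'.
hits = matches("cars", ["car", "Cars", "cart"])
```

  A real stemmer applies many more rules, and different rules for each
  language, which is why one stemmer object is created per language.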

  Snowball (http://snowball.sourceforge.net) is a small string processing
  language designed for creating stemming algorithms for use in Information
  Retrieval.   
 

Requirements

  Python 2.1 or higher  (tested with 2.0 - 2.2)


Installation

  via Distutils:

    python setup.py [build|install]


  via Makefile.pre.in:

    make -f Makefile.pre.in boot

    make

    make install


API

  import Stemmer
  print Stemmer.availableStemmers()    # returns a list of all supported languages

  ST = Stemmer.Stemmer('german')       # create a German Stemmer object
  print ST.stem('blabla')              # stem one word

  print ST.stem(['wort1','wort2'])     # stem a list of words
  print ST.language()                  # returns the language of the stemmer object

  ST.setCacheSize(10000)               # cache up to 10000 stemmed words
  print ST.getCacheSize()              # return size of internal stemmer cache
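
  The idea behind the cache calls can be sketched in pure Python.  This is
  only an illustration under stated assumptions: the class CachingStemmer
  and its methods are hypothetical and not part of the Stemmer module, and
  emptying the cache once it is full is an assumed policy, since the
  eviction strategy of the C extension is not described here.

```python
class CachingStemmer(object):
    """Wrap a word -> stem function with a bounded dictionary cache."""

    def __init__(self, stem_func, cache_size=1000):
        self._stem_func = stem_func    # any callable mapping word -> stem
        self._cache_size = cache_size  # maximum number of cached words
        self._cache = {}

    def set_cache_size(self, size):
        """Change the limit; drop the cache if it no longer fits."""
        self._cache_size = size
        if len(self._cache) > size:
            self._cache.clear()

    def get_cache_size(self):
        """Return the configured limit (assumption: not the current fill)."""
        return self._cache_size

    def stem(self, word):
        """Return the stem of word, computing it only on a cache miss."""
        try:
            return self._cache[word]
        except KeyError:
            pass
        result = self._stem_func(word)
        if self._cache_size > 0:
            # Assumed policy: empty the whole cache once the limit is hit.
            if len(self._cache) >= self._cache_size:
                self._cache.clear()
            self._cache[word] = result
        return result
```

  Usage would look like CachingStemmer(some_stem_function, cache_size=10000),
  after which repeated calls to stem() with the same word skip the
  underlying stemming function.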


License

  This software is covered by the MIT license,
  (C) 2001 Andreas Jung (see LICENSE.TXT).

  Snowball is published under the BSD license, (C) 2001 Dr. Martin Porter.


Author

  Andreas Jung (andreas@andreas-jung.com)



=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/Setup ===
*shared*
Stemmer -DDEBUG -Iq -I.\
        src/Stemmer.c   \
        french/frenchstem.c \
        porter/porterstem.c \
        german/germanstem.c \
        dutch/dutchstem.c \
        english/englishstem.c \
        spanish/spanishstem.c \
        italian/italianstem.c \
        swedish/swedishstem.c \
        portuguese/portuguesestem.c \
        russian/russianstem.c \
        danish/danishstem.c \
        norwegian/norwegianstem.c \
        q/api.c q/utilities.c


=== Added File Zope/lib/python/Products/PluginIndexes/TextIndexNG/src/PyStemmer/setup.py ===
#!/usr/bin/env python

from distutils.core import setup, Extension

setup(name = "PyStemmer",
      version = "0.10",
      author = "Andreas Jung",
      author_email = "andreas@andreas-jung.com",
      ext_modules = [Extension("Stemmer",
                               [
                                "src/Stemmer.c",
                                "french/frenchstem.c",
                                "porter/porterstem.c",
                                "german/germanstem.c",
                                "dutch/dutchstem.c",
                                "english/englishstem.c",
                                "spanish/spanishstem.c",
                                "italian/italianstem.c",
                                "swedish/swedishstem.c",
                                "portuguese/portuguesestem.c",
                                "russian/russianstem.c",
                                "danish/danishstem.c",
                                "norwegian/norwegianstem.c",
                                "q/api.c",
                                "q/utilities.c"
                               ],
                               include_dirs = ['q', '.'],
                               )]
      )