# [Zope-CVS] CVS: Products/ZCTextIndex - Index.py:1.1.2.13

Tim Peters tim.one@comcast.net
Fri, 3 May 2002 01:16:11 -0400

```
Update of /cvs-repository/Products/ZCTextIndex
In directory cvs.zope.org:/tmp/cvs-serv15170

Modified Files:
Tag: TextIndexDS9-branch
Index.py
Log Message:
doc_term_weight() and its use in _get_frequencies():  The former
returned a scaled int, roughly 256 * true_value.  The latter then squared
it, giving roughly 65536 * true_value**2.  It won't take an implausible
number of those before the sum overflows a signed 32-bit int (each int
is an artificial factor of 2**16 times "too big").  So changed the former
to return plain old true_value as a float, scaling only for storing and
*after* the sqrt of the sum is taken.

=== Products/ZCTextIndex/Index.py 1.1.2.12 => 1.1.2.13 ===
         for wid in wids:
             d[wid] = d.get(wid, 0) + 1
-        Wsquares = 0
+        Wsquares = 0.
         freqs = []
         for wid, count in d.items():
             f = doc_term_weight(count)
-            Wsquares += f ** 2
-            freqs.append((wid, f))
-        return freqs, int(math.sqrt(Wsquares))
+            Wsquares += f * f
+            freqs.append((wid, scaled_int(f)))
+        return freqs, scaled_int(math.sqrt(Wsquares))

try:
@@ -164,7 +164,7 @@
 def doc_term_weight(count):
     """Return the doc-term weight for a term that appears count times."""
     # implements w(d, t) = 1 + log f(d, t)
-    return scaled_int(1 + math.log(count))
+    return 1. + math.log(count)

 def query_term_weight(term_count, num_items):
     """Return the query-term weight for a term,

```
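The arithmetic in the log message can be checked with a small, self-contained sketch. It is not the code from Index.py: `scaled_int` here is a hypothetical stand-in that assumes a scale factor of 256 (the log says only "roughly 256 * true_value"), and `old_sum_of_squares` and `new_doc_weight` are illustrative names. Modern Python integers do not overflow, so the sketch compares against the signed 32-bit limit instead of reproducing an overflow.

```python
import math

SCALE_FACTOR = 256        # assumption, taken from "roughly 256 * true_value"
INT32_MAX = 2**31 - 1     # largest signed 32-bit int


def scaled_int(f, scale=SCALE_FACTOR):
    """Hypothetical stand-in for the module's scaled_int(): round f * scale."""
    return int(round(f * scale))


def doc_term_weight(count):
    """w(d, t) = 1 + log f(d, t), returned as a plain float (new behavior)."""
    return 1.0 + math.log(count)


def old_sum_of_squares(counts):
    """Old scheme: scale each weight to an int first, then square and sum."""
    return sum(scaled_int(doc_term_weight(c)) ** 2 for c in counts)


def new_doc_weight(counts):
    """New scheme: sum the squares as floats, scale only after the sqrt."""
    total = 0.0
    for c in counts:
        w = doc_term_weight(c)
        total += w * w
    return scaled_int(math.sqrt(total))


# A document with 40000 distinct terms, each appearing once (weight 1.0 each).
counts = [1] * 40000
print(old_sum_of_squares(counts) > INT32_MAX)  # old running sum exceeds 32 bits
print(new_doc_weight(counts))                  # new result stays small
```

Every weight is at least 1.0, so under the assumed factor of 256 each squared term contributes at least 65536, and 32768 distinct terms are enough to push the old integer sum past the 32-bit limit. Scaling after the square root keeps the stored value proportional to the true document weight instead.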