[ZCM] [ZC] 1452/ 8 Comment "manage_catalogIndexes still very very slow with ZCTextIndex"

Collector: Zope Bugs, Features, and Patches ... zope-coders-admin at zope.org
Thu Aug 5 07:13:31 EDT 2004


Issue #1452 Update (Comment) "manage_catalogIndexes still very very slow with ZCTextIndex"
 Status Pending, Zope/bug medium
To followup, visit:
  http://zope.org/Collectors/Zope/1452

==============================================================
= Comment - Entry #8 by chrisw on Aug 5, 2004 7:13 am

Interesting. I re-indexed this index last night and it seems to have cleared the problem.

Here's what I found:
>>> app.Catalog.Indexes['SearchableText'].numObjects
<bound method ZCTextIndex.numObjects of <ZCTextIndex at SearchableText>>
>>> app.Catalog.Indexes['SearchableText'].index     
<OkapiIndex instance at 4159c2f0>
>>> app.Catalog.Indexes['SearchableText'].index.length
<Length instance at 4159c890>

>>> print time.asctime(); app.Catalog.Indexes['SearchableText'].index.length() ; print time.asctime();
Thu Aug  5 12:09:18 2004
94284
Thu Aug  5 12:09:19 2004
>>> print time.asctime(); app.Catalog.Indexes['SearchableText'].numObjects() ; print time.asctime();
Thu Aug  5 12:09:47 2004
94284
Thu Aug  5 12:09:47 2004
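
time.asctime() only has one-second resolution, so the two calls above are hard to compare precisely. A minimal sketch of a finer-grained timer follows. It assumes a Python that provides time.perf_counter (newer than the interpreter in this session, where time.time() would be the closest equivalent). The helper name timed is made up for illustration.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds).

    Hypothetical helper, not part of Zope or ZCTextIndex.
    """
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; in a debug session the callable would be the
    # index's numObjects method instead.
    total, seconds = timed(sum, range(1000000))
    print("result=%d elapsed=%.6fs" % (total, seconds))
```

In a "zopectl debug" session, timed(app.Catalog.Indexes['SearchableText'].numObjects) would then return the count together with the elapsed seconds.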

I suspect ZCTextIndex didn't use to keep this Length counter, so if you upgrade and don't re-index, you end up with the suckage still being there.

Lemme see if I can find another upgraded Zope instance and check if I can reproduce.
________________________________________
= Comment - Entry #7 by chrisw on Aug 5, 2004 7:05 am

Finally having the joy of being able to wield "zopectl debug", I can tell you:

>>> app.Catalog.Indexes['SearchableText']
<ZCTextIndex at SearchableText>
>>> app.Catalog.Indexes['SearchableText'].length
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: length

...where should I be looking for this length attribute?
________________________________________
= Comment - Entry #6 by Caseman on Aug 3, 2004 2:54 pm

Please note the following code in BaseIndex.py

> class BaseIndex(Persistent):
> 
>     __implements__ = IIndex
> 
>     def __init__(self, lexicon):
>         ...
>         # Use a BTree length for efficient length computation w/o conflicts
>         self.length = Length()
>         self.document_count = Length()
> 
>     ...

Note how the length attribute is overridden in the index instance. It would be interesting to see if this is somehow not working as intended in Chris' case. That was why I recommended changing the length() method to raise an exception. It should never be exercised in practice.
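
As a rough illustration of the shadowing described above, and of how an instance created before the attribute existed would fall through to the slow method, here is a self-contained sketch. It is not the real BaseIndex code: the Length class is a simplified stand-in for the BTrees one, and BaseIndexSketch, _docs and index_doc are hypothetical names.

```python
class Length:
    """Simplified stand-in for a BTrees Length: a callable counter."""

    def __init__(self, value=0):
        self.value = value

    def change(self, delta):
        self.value += delta

    def __call__(self):
        return self.value


class BaseIndexSketch:
    """Hypothetical index showing an instance attribute shadowing a method."""

    def __init__(self):
        self._docs = {}
        # The instance attribute hides the class-level length() method.
        self.length = Length()

    def length(self):
        # Placeholder: only reachable when the instance has no 'length'
        # attribute of its own, e.g. an object stored before the attribute
        # was introduced. Counts the hard way.
        return len(self._docs)

    def index_doc(self, docid, text):
        if docid not in self._docs:
            self.length.change(1)
        self._docs[docid] = text


if __name__ == "__main__":
    fresh = BaseIndexSketch()
    fresh.index_doc(1, "some text")
    print(type(fresh.length).__name__, fresh.length())

    # Simulate an old stored instance: __init__ never set self.length.
    old = BaseIndexSketch.__new__(BaseIndexSketch)
    old._docs = {1: "a", 2: "b"}
    print(type(old.length).__name__, old.length())
```

On the fresh instance, length is the counter object. On the simulated old instance, the same name resolves to the bound method and the slow path runs, which would be consistent with a re-index (creating a new index object) making the problem go away.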
________________________________________
= Comment - Entry #5 by ajung on Aug 3, 2004 2:38 pm

The ZMI calls numObjects() which returns self.index.length(). The only length() implementation in ZCTextIndex is the one in BaseIndex.py.
________________________________________
= Comment - Entry #4 by Caseman on Aug 3, 2004 1:45 pm

Notice the comment above this line and line 86 of BaseIndex.py in the __init__() method. The length() method in the class is replaced by a callable Length object in the index instance.

IMO this tries too hard to be clever. A reasonable change that might help understanding would be to have the length method raise NotImplementedError, since it is just a placeholder. Eliminating the method might be ok also, implicit acquisition notwithstanding.
________________________________________
= Comment - Entry #3 by ajung on Aug 3, 2004 1:12 pm

BaseIndex.length() calls len() on a BTree instead of using the value of the existing counter self.document_count.
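
To show why len() on a large tree costs more than reading a counter, here is a toy sketch. BucketChain is a made-up stand-in, not the BTrees implementation. It only mimics the property that counting means visiting every bucket, while a separately maintained counter is a single read.

```python
class BucketChain:
    """Toy container: keys live in fixed-size buckets, len() walks them all."""

    def __init__(self, bucket_size=4):
        self.bucket_size = bucket_size
        self.buckets = [[]]
        self.buckets_visited = 0  # instrumentation for the demo
        self.count = 0            # maintained counter, updated on insert

    def add(self, key):
        if len(self.buckets[-1]) >= self.bucket_size:
            self.buckets.append([])
        self.buckets[-1].append(key)
        self.count += 1

    def __len__(self):
        # Work grows with the number of buckets, i.e. with the data size.
        total = 0
        for bucket in self.buckets:
            self.buckets_visited += 1
            total += len(bucket)
        return total


if __name__ == "__main__":
    chain = BucketChain()
    for key in range(100):
        chain.add(key)
    print("len():", len(chain), "buckets visited:", chain.buckets_visited)
    print("counter:", chain.count, "(no traversal needed)")
```

In a persistent store each visited bucket may also have to be loaded from disk, so the gap is likely larger in practice than this in-memory toy suggests.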
________________________________________
= Comment - Entry #2 by chrisw on Aug 3, 2004 12:33 pm

Just to note that ZCTextIndex is definitely the culprit here.

I deleted the index (which is all I wanted to do anyway ;-) and suddenly the indexes tab was back to being lightning quick.
________________________________________
= Request - Entry #1 by chrisw on Aug 3, 2004 12:29 pm

The Indexes tab of a ZCatalog will craaaawl if there's a ZCTextIndex in there which has a decent number of words indexed in it.

I think something in the indexes tab ends up doing a len of a very big BTree.

Commenting out the "# objects" column makes it lightning quick again.

What does that column actually show?
Why are ZCTextIndexes so slow in it?

This really needs fixing...
==============================================================


