[Zope] Efficient Processing Of Large ZCatalog Queries

VanL vlindberg@verio.net
Thu, 17 Oct 2002 15:51:28 -0600


>
>
>That makes sense if you are only calling RESPONSE.write once the pseudocode 
>loop has finished. I think you need:
>
>for x in catalog.search(myquery):
>    ob = myFunction(getObject(x)) 
>    RESPONSE.write(formatting_function(ob))
>

I am trying to call RESPONSE.write inside myFunction, above.  What I 
actually have is:

<dtml-in expr="Query(REQUEST.form)">
[formatting code for responses]
</dtml-in>

It is a bit more complicated than I originally described:  I can't 
hardcode myFunction(x) into the response page because I don't know in 
advance which function (if any) will be called on the query results. 
I bind the name of the script to the actual script object dynamically, 
at run time.
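
To make the run-time binding concrete, here is a minimal sketch in plain 
Python rather than my actual Zope scripts. The names SCRIPTS, bind_script, 
process, to_upper and identity are all hypothetical, chosen only for 
illustration; the point is that the script is looked up by name, and a 
pass-through is used when no script is bound.

```python
# Hypothetical illustration of binding a script name to a callable at run time.

def to_upper(obj):
    """Example script: upper-case the string form of an object."""
    return str(obj).upper()


def identity(obj):
    """Pass-through used when no script is bound to the given name."""
    return obj


# Hypothetical registry mapping script names to script objects.
SCRIPTS = {"to_upper": to_upper}


def bind_script(name, registry=SCRIPTS):
    """Return the callable registered under `name`, or `identity` if none."""
    return registry.get(name, identity)


def process(objects, script_name):
    """Resolve the script once, then call it on each object in turn."""
    script = bind_script(script_name)
    return [script(ob) for ob in objects]
```

Resolving the name once, outside the loop, keeps the per-object cost down 
to the call itself.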

The calling chain looks like this:

Form
    Response Page
        Query Control Script
            Query Preparsing (separate out control and query fields)
        Query Control
            Catalog Query (We actually query the catalog at this point,
                return result.getObject)
        Query Control
            Association Manager (given a list of objects, returns a
                list of related objects)
        Query Control
            Object Processor
                NameBinding (Associates the name of a script with
                    the actual script object)
            Object Processor (Calls the script on each object in turn)
            ** This is where I try to call RESPONSE.write and it
               doesn't seem to work **
        Query Control
    Response Page (displays result)
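
What I want the Object Processor step to do can be sketched as follows. 
This is plain Python under stated assumptions, not the real code: 
FakeResponse is a hypothetical stand-in that only records what is written 
(it is not Zope's RESPONSE), and object_processor is an illustrative name. 
The sketch shows output being written once per object inside the loop, 
instead of after the whole result list has been built.

```python
class FakeResponse:
    """Hypothetical stand-in for RESPONSE; it only records written chunks."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


def object_processor(objects, script, response):
    """Call `script` on each object and write its output immediately.

    `objects` may be any iterable, so nothing forces the full result
    set to be materialised before the first write happens.
    """
    count = 0
    for ob in objects:
        response.write(str(script(ob)) + "\n")
        count += 1
    return count


# Usage: three objects in, three chunks written, one per object.
response = FakeResponse()
written = object_processor(iter(range(3)), lambda x: x * 2, response)
```

Whether the real RESPONSE.write behaves this way when it is reached 
through a dtml-in expression is exactly what I have not been able to 
confirm.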

I realize this is rather complex, but it gives me a simple API to write 
against, one that lets me run an arbitrary script on an arbitrary set of 
input objects.  Performance seems reasonable for smaller input sets: for 
result sets of up to about 500 objects, going through this calling chain 
does not appear to take significantly longer than a straight query 
(unless, of course, the called script itself does something that takes a 
lot of time).

For larger queries, though, it seems to take an extraordinarily long 
time to return.  I'm trying to figure out where the problem is and fix it.

You said that perhaps I could do:

>for x in catalog.search(myquery)[start:end]
>

That is reasonable, but would I process only the objects between 
[start:end], or would I process *all* the objects and only truncate the 
displayed result?
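
To spell out the two readings I have in mind, here is a small sketch in 
plain Python. A list stands in for the catalog results, and expensive() is 
a hypothetical placeholder for whatever script gets called; I am not 
claiming anything here about how the catalog itself evaluates its results.

```python
calls = []


def expensive(ob):
    """Hypothetical per-object script; records each call it receives."""
    calls.append(ob)
    return ob * 10


results = list(range(1000))  # stand-in for catalog.search(myquery)
start, end = 20, 30

# Reading 1: slice first, so only objects in [start:end] are processed.
calls.clear()
page = [expensive(ob) for ob in results[start:end]]
sliced_first_calls = len(calls)

# Reading 2: process every object, then truncate only what is displayed.
calls.clear()
everything = [expensive(ob) for ob in results]
page_after = everything[start:end]
process_all_calls = len(calls)
```

Both readings display the same ten items, but the first calls the script 
10 times and the second 1000 times, which is the difference I care about 
for large result sets.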

Thanks,

Van