[Zope3-dev] Moving from custom indexed collection to Zope3...

Mike C. Fletcher mcfletch@rogers.com
Wed, 08 Jan 2003 07:53:35 -0500


[For those in a hurry, skip to the end for the conclusion.  I'm going 
ahead and posting in case someone else is interested in what ObjectHub 
does, or has input on other systems for IndexedProperty querying 
included with Zope3]

I have just finished the preliminary work for moving ConflictSolver to 
ZODB4.  The system appears to be basically functional, but I'm having 
problems with my custom-coded collection classes.  I'm wondering what 
the Zope3 equivalent of these classes is, and whether it would be 
practical to move to that equivalent for a non-Zope ZODB4 application, 
to pick up some of the "nice features" for "free" (minus the switching 
effort).

Since I'm pretty sure no one has actually seen the zcollection package 
(save some masochists who may have downloaded the ConflictSolver CVS 
tree), I will describe what it does:

    * Collection -- an unordered set, constructed from a persistent
      dictionary, a monotonic counter for object IDs, and a sequence of
      CollectionIndex objects.  
          o does incremental re-indexing whenever objects are
            added/removed, 
          o provides hooks for the objects to re-index themselves
            manually (which is normally done by the application, or the
            object's properties)
          o may define default indices it will create on initialization.
    * CollectionIndex -- as you might imagine, an individual index
      (currently the concrete classes are "natural order" (order
      maintained by a list object), and "bisect" (sorted list maintains
      order among values)).  
          o has all the hooks required to add/remove/calculate an
            individual object's values, 
          o search (key-lookup)
          o iteration (full and slice)
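
To make that description concrete, here is a minimal, non-persistent
sketch of the two pieces in plain Python.  It is not the zcollection
code: the class and method names (Collection, BisectIndex, add, remove,
reindex, search) are made up for illustration, and a real version would
use persistent containers rather than plain dicts and lists.

```python
import bisect
import itertools


class BisectIndex:
    """Hypothetical "bisect" index: a sorted list of (value, oid) pairs."""

    def __init__(self, key):
        self.key = key        # callable computing the indexed value
        self.entries = []     # sorted list of (value, oid)
        self.values = {}      # oid -> value recorded at index time

    def add(self, oid, obj):
        value = self.key(obj)
        self.values[oid] = value
        bisect.insort(self.entries, (value, oid))

    def remove(self, oid):
        # Use the recorded value, since the object may have changed since.
        entry = (self.values.pop(oid), oid)
        del self.entries[bisect.bisect_left(self.entries, entry)]

    def search(self, value):
        """Key lookup: oids of every object whose indexed value == value."""
        pos = bisect.bisect_left(self.entries, (value,))
        oids = []
        while pos < len(self.entries) and self.entries[pos][0] == value:
            oids.append(self.entries[pos][1])
            pos += 1
        return oids


class Collection:
    """Hypothetical unordered collection: dict + ID counter + indices."""

    def __init__(self, indices=()):
        self.objects = {}                  # oid -> object
        self.next_id = itertools.count(1)  # monotonic object-ID counter
        self.indices = list(indices)

    def add(self, obj):
        oid = next(self.next_id)
        self.objects[oid] = obj
        for index in self.indices:         # incremental indexing on add
            index.add(oid, obj)
        return oid

    def remove(self, oid):
        del self.objects[oid]
        for index in self.indices:         # incremental un-indexing
            index.remove(oid)

    def reindex(self, oid):
        """Hook for an object (or the application) to re-index manually."""
        obj = self.objects[oid]
        for index in self.indices:
            index.remove(oid)
            index.add(oid, obj)


by_name = BisectIndex(key=lambda task: task["name"])
tasks = Collection(indices=[by_name])
a = tasks.add({"name": "review"})
b = tasks.add({"name": "deploy"})
c = tasks.add({"name": "review"})   # duplicate values are allowed
print(by_name.search("review"))     # [1, 3]
```

The index records the value it stored for each object ID, so removal
and re-indexing still work after the object itself has been modified.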

Now for what I need to accomplish:

    * Support "set" semantics for objects (i.e. objects are not required
      to have unique names or other properties)
          o I had originally assumed this was the point of the object
            hub, i.e. holding pointers to objects which do not require
            other "positions" within the hierarchy... that appears to be
            wrong
    * Support (fast) indexed access to property values (these indices
      are the primary focus of the applications, and are accessed even
      in the GUI drawing code).  
          o These aren't once-per-minute queries with large result sets
            to be presented as segmented/batched documents to the user. 
          o They are multiple-times-per-second queries (five or six per
            GUI interaction) which need to return object collections in
            far less than a second.  
          o Queries are normally coded directly in the application; they
            don't need to be parsed from user input.
    * Support multiple indices per collection
    * Properly operate across deletions (my current code loses the
      back-pointer from the Index to the Collection when an undo
      trashes the Index object; I can fix that fairly easily, but I'd
      rather take the opportunity to move to a more general solution)
    * Support iteration, and range queries on indices ("all objects with
      startDate between x and y", or "all objects")
    * Would like to be able to support full-text indexing of given
      string attributes (the other reason to want to move to "standard"
      approaches)
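
As an illustration of the range-query requirement, here is a rough
sketch of "all objects with startDate between x and y" over a sorted
list of (value, object-ID) pairs, using only the standard bisect
module.  The function name, the sample data, and the inclusive bounds
are assumptions made for the example, not part of any existing package.

```python
import bisect
from datetime import date


def range_query(entries, low, high):
    """Return object IDs whose value satisfies low <= value <= high.

    `entries` must be a sorted list of (value, oid) tuples.
    """
    # (low,) sorts before every (low, oid), so this finds the first match.
    pos = bisect.bisect_left(entries, (low,))
    oids = []
    while pos < len(entries) and entries[pos][0] <= high:
        oids.append(entries[pos][1])
        pos += 1
    return oids


# Hypothetical index on a startDate attribute.
by_start = sorted([
    (date(2003, 1, 6), 1),
    (date(2003, 1, 8), 2),
    (date(2003, 1, 8), 3),
    (date(2003, 2, 1), 4),
])

in_january = range_query(by_start, date(2003, 1, 7), date(2003, 1, 31))
everything = [oid for _, oid in by_start]   # the "all objects" case
```

Locating the start of the range is a binary search, and the remaining
cost is proportional to the number of results, which is the behaviour
wanted for several queries per GUI interaction.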

Using ObjectHub:

    * I would need to include the ObjectHub object, and its supporting
      infrastructure
          o This infrastructure appears to be... extensive
          o At least the "services" architecture, the "interfaces"
            architecture...
    * I would need to modify the ObjectHub object itself to use
      something other than "location" for storing pointers to objects...
      seems like I must be misunderstanding:
          o Does ObjectHub actually just store paths, rather than ZODB
            object identifiers?  
          o The object seems to be just a persistent version of a
            multi-producer multi-consumer event dispatcher, is this a
            good characterization?
          o It doesn't seem to have any "collection" semantics at all,
            which leaves me wondering what object provides the semantics
            (does anything, or is everything "namespace"-based (i.e.
            Folder or similar dictionary-like collections with
            requirements for unique indices))?
    * I would need to support the "app/index" package, though this
      appears to be entirely Web-focused, so possibly not
    * Add the textindex to the ObjectHub as a subscriber (looking at the
      textindex code, it doesn't appear to have anything to do with the
      ObjectHub as of yet? I gather it is supposed to be readily
      reusable outside of Zope.)
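
If the "event dispatcher" characterization is right, the general
pattern would look something like the sketch below.  To be clear, this
is not the ObjectHub or textindex API: Dispatcher, WordIndex, and the
(kind, key, text) event tuples are all invented here to show an index
subscribing to add/remove events.

```python
class Dispatcher:
    """Hypothetical multi-producer, multi-consumer event dispatcher."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def notify(self, event):
        # Copy the list so subscribers may (un)subscribe during dispatch.
        for callback in list(self.subscribers):
            callback(event)


class WordIndex:
    """Toy text index that keeps itself current by listening for events."""

    def __init__(self):
        self.words = {}   # word -> set of object keys

    def __call__(self, event):
        kind, key, text = event   # hypothetical event shape
        if kind == "added":
            for word in text.lower().split():
                self.words.setdefault(word, set()).add(key)
        elif kind == "removed":
            for keys in self.words.values():
                keys.discard(key)


hub = Dispatcher()
index = WordIndex()
hub.subscribe(index)

hub.notify(("added", 1, "Resolve merge conflict"))
hub.notify(("added", 2, "Merge index code"))
merge_hits = set(index.words["merge"])

hub.notify(("removed", 1, ""))
after_removal = set(index.words["merge"])
```

Note that the dispatcher only relays events; the subscriber holds the
collection-like state, which matches the observation above that the
hub itself has no "collection" semantics.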


Having spent a few hours researching now, I'm coming to the conclusion 
that trying to use ObjectHub for non-Zope indexed-attribute support just 
isn't appropriate... my assumptions about the nature of ObjectHub were 
incorrect.  I still haven't come up with what it is in Zope3 that does 
provide indexed-attribute support.  I know about IndexedCatalog by Async 
Open Source, but I had been under the impression that Zope included a 
similar system.  That impression may have been wrong...

Enjoy yourselves,
Mike

_______________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://members.rogers.com/mcfletch/