[Checkins] SVN: zc.relation/trunk/src/zc/relation/ Try to make README cover all necessary bases. Add static values for transitive query factory and transitive search index. Multiple small changes.

Gary Poster gary at zope.com
Tue Apr 22 16:34:38 EDT 2008


Log message for revision 85617:
  Try to make README cover all necessary bases.  Add static values for transitive query factory and transitive search index.  Multiple small changes.

Changed:
  U   zc.relation/trunk/src/zc/relation/README.txt
  U   zc.relation/trunk/src/zc/relation/catalog.py
  U   zc.relation/trunk/src/zc/relation/interfaces.py
  U   zc.relation/trunk/src/zc/relation/queryfactory.py
  U   zc.relation/trunk/src/zc/relation/searchindex.py
  U   zc.relation/trunk/src/zc/relation/searchindex.txt
  U   zc.relation/trunk/src/zc/relation/tokens.txt

-=-
Modified: zc.relation/trunk/src/zc/relation/README.txt
===================================================================
--- zc.relation/trunk/src/zc/relation/README.txt	2008-04-22 19:49:08 UTC (rev 85616)
+++ zc.relation/trunk/src/zc/relation/README.txt	2008-04-22 20:34:37 UTC (rev 85617)
@@ -9,30 +9,49 @@
 ========
 
 The relation catalog can be used to optimize intransitive and transitive
-searches for N-ary relations of finite, preset dimensions; can be used
-in the ZODB or standalone (though it uses ZODB classes--BTrees and
-persistent.Persistent--so the ZODB software must be present); and can be
-used with variable definitions of transitive behavior. It is a generic,
-relatively policy-free tool.  It is expected to be used primarily as an
-engine for more specialized and constrained tools and APIs.  Three such
-tools are zc.relationship containers, plone.relations containers, and
-zc.vault.  The documents in the package, including this one, describe
-other possible uses.
+searches for N-ary relations of finite, preset dimensions.
 
-The relation catalog uses the model that relations are full-fledged
-objects that are indexed for optimized searches.  It takes a very
-precise view of the world: instantiation requires multiple arguments
-specifying the configuration; and using the catalog requires that you
-acknowledge that the relations and their associated indexed values are
-usually tokenized within the catalog, via tokenizers and resolvers you
-specify.  This precision trades some ease-of-use for the possibility of
-flexibility, power, and efficiency.  That said, the catalog's API is
-intended to be consistent, and to largely adhere to "there's only one
-way to do it".
+For example, you can index simple two-way relations, like employee to
+supervisor; RDF-style triples of subject-predicate-object; and more complex
+relations such as subject-predicate-object with context and state.  These
+can be searched with variable definitions of transitive behavior.
 
+The catalog can be used in the ZODB or standalone. It is a generic, relatively
+policy-free tool.
+
+It is expected to be used primarily as an engine for more specialized and
+constrained tools and APIs. Three such tools are zc.relationship containers,
+plone.relations containers, and zc.vault. The documents in the package,
+including this one, describe other possible uses.
+
 Setting Up a Relation Catalog
 =============================
 
+In this section, we will be introducing the following ideas.
+
+- Relations are objects with indexed values.
+
+- You add value indexes to relation catalogs to be able to search.  Values
+  can be identified to the catalog with callables or interface elements. The
+  indexed value must be specified to the catalog as a single value or a
+  collection.
+
+- Relations and their values are stored in the catalog as tokens: unique
+  identifiers that you can resolve back to the original value. Integers are the
+  most efficient tokens, but others can work fine too.
+
+- Token type determines the BTree module needed.
+
+- You must define your own functions for tokenizing and resolving tokens. These
+  functions are registered with the catalog for the relations and for each of
+  their value indexes.
+
+- Relations are indexed with ``index``.
+
+We will use a simple two-way relationship as our example here. A brief
+introduction to a more complex RDF-style subject-predicate-object setup
+can be found later in the document.
+
 Creating the Catalog
 --------------------
 
@@ -43,15 +62,26 @@
 the supervisor, and the employee object itself represents the relation.
 
 Let's say further, for simplicity, that employee names are unique and
-can be used to represent employees.  We can use names as our "tokens". 
-Tokens are similar to the primary key in a relational database, or in
-intid or keyreference in Zope 3--some way to uniquely identify an
-object, which sorts reliably and can be resolved to the object given the
-right context.  For speed, integers make the best tokens; followed by other
+can be used to represent employees.  We can use names as our "tokens".
+
+Tokens are similar to the primary key in a relational database. A token is a
+way to identify an object. It must sort reliably and you must be able to write
+a callable that reliably resolves to the object given the right context. In
+Zope 3, intids (zope.app.intid) and keyreferences (zope.app.keyreference) are
+good examples of reasonable tokens.
+
+As we'll see below, you provide a way to convert objects to tokens, and resolve
+tokens to objects, for the relations, and for each value index individually.
+They can all be the same functions or completely different, depending on
+your needs.
+
+For speed, integers make the best tokens; followed by other
 immutables like strings; followed by non-persistent objects; followed by
-persistent objects.
+persistent objects.  The choice also determines a choice of BTree module, as
+we'll see below.
 
-Here is our toy `Employee` example class.
+Here is our toy ``Employee`` example class.  Again, we will use the employee
+name as the token.
 
     >>> employees = {} # we'll use this to resolve the "name" tokens
     >>> class Employee(object):
@@ -69,13 +99,15 @@
     ...         return cmp(self.name, other.name)
     ...
 
-So, we need to define how to turn employees into their tokens.
+So, we need to define how to turn employees into their tokens.  We call the
+tokenizing function a "dump" function. Conversely, the function to resolve
+tokens into objects is called a "load" function.
 
-The function to tokenize, or "dump", gets the object to be tokenized;
-and, because it helps sometimes to provide context, the catalog; and a
-dictionary that will be shared for a given search.  The dictionary can
-be used as a cache for optimizations (for instance, to stash a utility
-that you looked up).
+Functions to dump relations and values get several arguments. The first
+argument is the object to be tokenized. Next, because it helps sometimes to
+provide context, is the catalog. The last argument is a dictionary that will be
+shared for a given search. The dictionary can be ignored, or used as a cache
+for optimizations (for instance, to stash a utility that you looked up).
 
 For this example, our function is trivial: we said the token would be
 the employee's name.
@@ -94,7 +126,7 @@
 context; and a dict cache, for optimizations of subsequent calls.
 
 You might have noticed in our Employee __init__ that we keep a mapping
-of name to object in the `employees` global dict (defined right above
+of name to object in the ``employees`` global dict (defined right above
 the class definition).  We'll use that for resolving the tokens.  
 
     >>> def loadEmployees(token, catalog, cache):
@@ -105,36 +137,57 @@
 by specifying how to tokenize relations, and what kind of BTree modules
 should be used to hold the tokens.
 
+How do you pick BTree modules?
+
+- If the tokens are 32-bit ints, choose BTrees.family32.II, BTrees.family32.IF
+  or BTrees.family32.IO.
+
+- If the tokens are 64-bit ints, choose BTrees.family64.II, BTrees.family64.IF
+  or BTrees.family64.IO.
+
+- If they are anything else, choose BTrees.family32.OI, BTrees.family64.OI, or
+  BTrees.family32.OO (or BTrees.family64.OO--they are the same).
+
+Within these rules, the choice is somewhat arbitrary unless you plan to merge
+these results with those of another source that is using a particular BTree
+module.  BTree set operations only work within the same module, so you must
+match module to module.
+
+In this example, our tokens are strings, so we want OO or an OI variant.  We'll
+choose BTrees.family32.OI, arbitrarily.
+
     >>> import zc.relation.catalog
     >>> import BTrees
     >>> catalog = zc.relation.catalog.Catalog(dumpEmployees, loadEmployees,
-    ...                               relFamily=BTrees.family32.OI)
+    ...                                       btree=BTrees.family32.OI)
 
-[#verifyObjectICatalog]_ [#legacy]_ We can't do very much searching with it so
-far though, because the catalog doesn't have any indexes. In this example, the
-relationship itself represents the employee, so we won't need to index that
-separately. But we do need a way to tell the catalog how to find the other end
-of the relationship, the supervisor.
+[#verifyObjectICatalog]_ [#legacy]_ Look!  A relation catalog!
 
-You can specify this to the catalog with a zope.interface attribute or
-method, or with a callable.  We'll use a callable for now.  It takes the
-indexed relationship and the catalog for context.
+We can't do very much searching with it so far though, because the catalog
+doesn't have any indexes. In this example, the relationship itself represents
+the employee, so we won't need to index that separately.
 
+But we do need a way to tell the catalog how to find the other end of the
+relationship, the supervisor. You can specify this to the catalog with an
+attribute or method specified in a zope.interface Interface, or with a
+callable. We'll use a callable for now. The callable will receive the indexed
+relation and the catalog for context.
+
     >>> def supervisor(emp, catalog):
     ...     return emp.supervisor # None or another employee
     ...
 
-Then we'll use that to tell the catalog to add an index for `supervisor`.
+We'll also need to specify how to tokenize (dump and load) those
+values. In this case, we're able to use the same functions as the relations
+themselves. However, do note that we can specify a completely different way to
+dump and load for each "value index," or relation element.
 
-We'll also specify how to tokenize (dump and load) those values. In this case,
-we're able to use the same functions as the relations themselves. However, do
-note that we can specify a completely different way to dump and load for each
-"value index," or relation element.
-
 We could also specify the name to call the index, but it will default to the
 __name__ of the function (or interface element), which will work just fine for
 us now.
 
+Now we can add the "supervisor" value index.
+
     >>> catalog.addValueIndex(supervisor, dumpEmployees, loadEmployees,
     ...                       btree=BTrees.family32.OI)
 
@@ -172,20 +225,75 @@
       |
     Howie
 
-Let's tell the catalog about the relations.
+Let's tell the catalog about the relations, using the ``index`` method.
 
     >>> for emp in (a,b,c,d,e,f,g,h):
     ...     catalog.index(emp)
     ...
 
+We've now created the relation catalog and added relations to it.  We're
+ready to search!
+
 Searching
 =========
 
-Queries, `findRelations`, and special query values
---------------------------------------------------
+In this section, we will introduce the following ideas.
 
+- Queries to the relation catalog are formed with dicts.
+
+- Query keys are the names of the indexes you want to search, or, for the
+  special case of precise relations, the zc.relation.RELATION constant.
+
+- Query values are the tokens of the results you want to match; or None,
+  indicating relations that have None as a value (or an empty collection, if it
+  is a multiple). Search values can use zc.relation.catalog.any(\*args) or
+  zc.relation.catalog.Any(args) to specify multiple (non-None) results to match
+  for a given key.
+
+- The catalog has a variety of methods to help you work with tokens.
+  ``tokenizeQuery`` is typically the most used, though others are available.
+
+- To find relations that match a query, use ``findRelations`` or
+  ``findRelationTokens``.
+
+- To find values that match a query, use ``findValues`` or ``findValueTokens``.
+
+- You search transitively by using a query factory. The
+  zc.relation.queryfactory.TransposingTransitive is a good common case factory
+  that lets you walk up and down a hierarchy. A query factory can be passed in
+  as an argument to search methods as a ``queryFactory``, or installed as a
+  default behavior using ``addDefaultQueryFactory``.
+
+- To find how a query is related, use ``findRelationChains`` or
+  ``findRelationTokenChains``.
+
+- To find out if a query is related, use ``canFind``.
+
+- Circular transitive relations are handled to prevent infinite loops. They
+  are identified in ``findRelationChains`` and ``findRelationTokenChains`` with
+  a ``zc.relation.interfaces.ICircularRelationPath`` marker interface.
+
+- Search methods share the following arguments:
+
+  * maxDepth, limiting the transitive depth for searches;
+  
+  * filter, allowing code to filter transitive paths;
+  
+  * targetQuery, allowing a query to filter transitive paths on the basis of
+    the endpoint;
+  
+  * targetFilter, allowing code to filter transitive paths on the basis of the
+    endpoint; and
+
+  * queryFactory, mentioned above.
+
+- You can set up search indexes to speed up specific transitive searches.
+
+Queries, ``findRelations``, and special query values
+----------------------------------------------------
+
 So who works for Alice?  That means we want to get the relations--the
-employees--with a `supervisor` of Alice.
+employees--with a ``supervisor`` of Alice.
 
 The heart of a question to the catalog is a query.  A query is spelled
 as a dictionary.  The main idea is simply that keys in a dictionary
@@ -209,7 +317,7 @@
 Alice is the only employee who doesn't report to anyone.
 
 What if you want to ask "who reports to Diane or Chuck?"  Then you use the
-zc.relation `Any` class or `any` function to pass the multiple values.
+zc.relation ``Any`` class or ``any`` function to pass the multiple values.
 
     >>> sorted(catalog.findRelations(
     ...     {'supervisor': zc.relation.catalog.any('Diane', 'Chuck')}))
@@ -217,27 +325,11 @@
     [<Employee instance "Frank">, <Employee instance "Galyn">,
      <Employee instance "Howie">]
 
-Frank, Galyn, and Howie each report to either Diane or Chuck.[#any]_
+Frank, Galyn, and Howie each report to either Diane or Chuck. [#any]_
 
-It is worth a quick mention here that the catalog always has parallel
-search methods, one for finding objects, as seen above, and one for
-finding tokens.  Finding tokens can be much more efficient, especially if
-the result from the relation catalog is just one step along the
-path of finding your desired result.  But finding objects is simpler for
-some common cases.  Here's a quick example of the same query, getting
-tokens rather than objects.
+``findValues`` and the ``RELATION`` query key
+---------------------------------------------
 
-    >>> sorted(catalog.findRelationTokens({'supervisor': 'Alice'}))
-    ['Betty', 'Chuck']
-    >>> sorted(catalog.findRelationTokens({'supervisor': None}))
-    ['Alice']
-    >>> sorted(catalog.findRelationTokens(
-    ...     {'supervisor': zc.relation.catalog.any('Diane', 'Chuck')}))
-    ['Frank', 'Galyn', 'Howie']
-
-`findValues` and the `RELATION` query key
------------------------------------------
-
 So how do we find who an employee's supervisor is?  Well, in this case,
 look at the attribute on the employee!  If you can use an attribute that
 will usually be a win in the ZODB.  
@@ -257,10 +349,10 @@
 important pair of search methods, and because it is a stepping stone
 to our first transitive search.
 
-So, o relaton catalog, who is Howie's supervisor?  
+So, o relation catalog, who is Howie's supervisor?  
 
 To ask this question we want to get the indexed values off of the
-relations: `findValues`. In its simplest form, the arguments are the
+relations: ``findValues``. In its simplest form, the arguments are the
 index name of the values you want, and a query to find the relationships
 that have the desired values.
 
@@ -269,9 +361,9 @@
 search one or more indexes for matching relationships, as usual, but
 actually specify a relationship: Howie.
 
-We do not have a value index name: we are looking for a relation. The key is
-the constant `zc.relation.RELATION`. For our current example, that would mean
-the query is `{zc.relation.RELATION: 'Howie'}`.
+We do not have a value index name: we are looking for a relation. The query
+key, then, should be the constant ``zc.relation.RELATION``. For our current
+example, that would mean the query is ``{zc.relation.RELATION: 'Howie'}``.
 
     >>> import zc.relation
     >>> list(catalog.findValues(
@@ -279,26 +371,109 @@
     <Employee instance "Diane">
 
 Congratulations, you just found an obfuscated and comparatively
-inefficient way to write `howie.supervisor`! [#intrinsic_search]_
+inefficient way to write ``howie.supervisor``! [#intrinsic_search]_
 [#findValuesExceptions]_
 
 Slightly more usefully, you can use other query keys along with
 zc.relation.RELATION. This asks, "Of Betty, Alice, and Frank, who are
 supervised by Alice?"
     
-    >>> sorted(catalog.findRelationTokens(
+    >>> sorted(catalog.findRelations(
     ...     {zc.relation.RELATION: zc.relation.catalog.any(
     ...         'Betty', 'Alice', 'Frank'),
     ...      'supervisor': 'Alice'}))
-    ['Betty']
+    [<Employee instance "Betty">]
 
 Only Betty is.
 
-Transitive Searching, Query Factories, and `maxDepth`
-----------------------------------------------------------------
+Tokens
+------
 
-What about transitive searching?  Well, you need to tell the catalog how to
-walk the tree.  In simple (and very common) cases like this, the
+As mentioned above, the catalog provides several helpers to work with tokens.
+The most frequently used is ``tokenizeQuery``, which takes a query with
+object values and converts them to tokens using the "dump" functions registered
+for the relationships and indexed values.  Here are alternate spellings of some
+of the queries we've encountered above.
+
+    >>> catalog.tokenizeQuery({'supervisor': a})
+    {'supervisor': 'Alice'}
+    >>> catalog.tokenizeQuery({'supervisor': None})
+    {'supervisor': None}
+    >>> import pprint
+    >>> catalog.tokenizeQuery(
+    ...     {zc.relation.RELATION: zc.relation.catalog.any(a, b, f),
+    ...     'supervisor': a}) # doctest: +NORMALIZE_WHITESPACE
+    {None: <zc.relation.catalog.Any instance ('Alice', 'Betty', 'Frank')>,
+    'supervisor': 'Alice'}
+
+(If you are wondering about that ``None`` in the last result, yes,
+``zc.relation.RELATION`` is just readability sugar for ``None``.)
+
+So, here's a real search using ``tokenizeQuery``.  We'll make an alias for
+``catalog.tokenizeQuery`` just to shorten things up a bit.
+
+    >>> query = catalog.tokenizeQuery
+    >>> sorted(catalog.findRelations(query(
+    ...     {zc.relation.RELATION: zc.relation.catalog.any(a, b, f),
+    ...      'supervisor': a})))
+    [<Employee instance "Betty">]
+
+The catalog always has parallel search methods, one for finding objects, as
+seen above, and one for finding tokens (the only exception is ``canFind``,
+described below). Finding tokens can be much more efficient, especially if the
+result from the relation catalog is just one step along the path of finding
+your desired result. But finding objects is simpler for some common cases.
+Here's a quick example of some queries above, getting tokens rather than
+objects.
+
+You can also spell a query in tokenizeQuery with keyword arguments.  This
+won't work if your key is zc.relation.RELATION, but otherwise it can improve
+readability.  We'll see some examples of this below as well.
+
+    >>> sorted(catalog.findRelationTokens(query(supervisor=a)))
+    ['Betty', 'Chuck']
+
+    >>> sorted(catalog.findRelationTokens({'supervisor': None}))
+    ['Alice']
+
+    >>> sorted(catalog.findRelationTokens(
+    ...     query(supervisor=zc.relation.catalog.any(c, d))))
+    ['Frank', 'Galyn', 'Howie']
+
+    >>> sorted(catalog.findRelationTokens(
+    ...     query({zc.relation.RELATION: zc.relation.catalog.any(a, b, f),
+    ...            'supervisor': a})))
+    ['Betty']
+
+The catalog provides several other methods just for working with tokens.
+
+- ``resolveQuery``: the inverse of ``tokenizeQuery``, converting a
+  tokenized query to a query with objects.
+
+- ``tokenizeValues``: returns an iterable of tokens for the values of the given
+  index name.
+
+- ``resolveValueTokens``: returns an iterable of values for the tokens of the
+  given index name.
+
+- ``tokenizeRelation``: returns a token for the given relation.
+
+- ``resolveRelationToken``: returns a relation for the given token.
+
+- ``tokenizeRelations``: returns an iterable of tokens for the relations given.
+
+- ``resolveRelationTokens``: returns an iterable of relations for the tokens
+  given.
+
+These methods are used less often, and are described in more technical
+documents in this package.
+
+Transitive Searching, Query Factories, and ``maxDepth``
+-------------------------------------------------------
+
+So, we've seen a lot of one-level, intransitive searching. What about
+transitive searching? Well, you need to tell the catalog how to walk the tree.
+In simple (and very common) cases like this, the
 zc.relation.queryfactory.TransposingTransitive will do the trick.
 
 A transitive query factory is just a callable that the catalog uses to
@@ -320,7 +495,7 @@
     >>> factory = zc.relation.queryfactory.TransposingTransitive(
     ...     zc.relation.RELATION, 'supervisor')
 
-Now `factory` is just a callable.  Let's let it help answer a couple of
+Now ``factory`` is just a callable.  Let's let it help answer a couple of
 questions.
 
 Who are all of Howie's supervisors transitively (this looks up in the
@@ -348,7 +523,7 @@
 This transitive factory is really the only transitive factory you would
 want for this particular catalog, so it probably is safe to wire it in
 as a default.  You can add multiple query factories to match different
-queries using `addDefaultQueryFactory`.
+queries using ``addDefaultQueryFactory``.
 
     >>> catalog.addDefaultQueryFactory(factory)
 
@@ -376,22 +551,22 @@
 [#maxDepthExceptions]_ We'll introduce some other available search
 arguments later in this document and in other documents.  It's important
 to note that *all search methods share the same arguments as
-`findRelations`*.  `findValues` and `findValueTokens` only add the
+``findRelations``*.  ``findValues`` and ``findValueTokens`` only add the
 initial argument of specifying the desired value.
 
-We've looked at two search methods so far: the `findValues` and
-`findRelations` methods help you ask what is related.  But what if you
+We've looked at two search methods so far: the ``findValues`` and
+``findRelations`` methods help you ask what is related.  But what if you
 want to know *how* things are transitively related?
 
-`findRelationshipChains` and `targetQuery`
-------------------------------------------
+``findRelationChains`` and ``targetQuery``
+------------------------------------------
 
-Another search method, `findRelationChains`, helps you discover how
+Another search method, ``findRelationChains``, helps you discover how
 things are transitively related.  
 
 The method name says "find relation chains".  But what is a "relation
 chain"?  In this API, it is a transitive path of relations.  For
-instance, what's the chain of command above Howie?  `findRelationChains`
+instance, what's the chain of command above Howie?  ``findRelationChains``
 will return each unique path.
 
     >>> list(catalog.findRelationChains({zc.relation.RELATION: 'Howie'}))
@@ -428,7 +603,7 @@
 
 But what if we wanted to just find the paths from one query result to
 another query result--say, we wanted to know the chain of command from Alice
-down to Howie?  Then we can specify a `targetQuery` that specifies the
+down to Howie?  Then we can specify a ``targetQuery`` that specifies the
 characteristics of our desired end point (or points).
 
     >>> list(catalog.findRelationChains(
@@ -440,19 +615,19 @@
 
 So, Betty supervises Diane, who supervises Howie.
 
-Note that `targetQuery` now joins `maxDepth` in our collection of shared
+Note that ``targetQuery`` now joins ``maxDepth`` in our collection of shared
 search arguments that we have introduced.
 
-`filter` and `targetFilter`
----------------------------
+``filter`` and ``targetFilter``
+-------------------------------
 
 We can take a quick look now at the last two of the shared search arguments:
 filter and targetFilter.  These two are similar in that they both are
 callables that can approve or reject given relations in a search based on
-whatever logic you can code.  They differ in that `filter` stops any further
-transitive searches from the relation, while `targetFilter` merely omits the
-given result but allows further search from it.  Like `targetQuery`, then,
-`targetFilter` is good when you want to specify the other end of a path.
+whatever logic you can code.  They differ in that ``filter`` stops any further
+transitive searches from the relation, while ``targetFilter`` merely omits the
+given result but allows further search from it.  Like ``targetQuery``, then,
+``targetFilter`` is good when you want to specify the other end of a path.
 
 As an example, let's say we only want to return female employees.
 
@@ -462,7 +637,7 @@
     ...
 
 Here are all the female employees supervised by Alice transitively, using
-`targetFilter`.
+``targetFilter``.
 
     >>> list(catalog.findRelations({'supervisor': 'Alice'},
     ...                            targetFilter=female_filter))
@@ -490,10 +665,10 @@
 --------------
 
 Without setting up any additional indexes, the transitive behavior of
-the `findRelations` and `findValues` methods essentially relies on the
-brute force searches of `findRelationChains`.  Results are iterables
+the ``findRelations`` and ``findValues`` methods essentially relies on the
+brute force searches of ``findRelationChains``.  Results are iterables
 that are gradually computed.  For instance, let's repeat the question
-"Whom does Betty supervise?".  Notice that `res` first populates a list
+"Whom does Betty supervise?".  Notice that ``res`` first populates a list
 with three members, but then does not populate a second list.  The
 iterator has been exhausted.
 
@@ -508,9 +683,9 @@
 sometimes speed for these searches is critical.  In these cases, you can
 add a "search index".  A search index speeds up the result of one or
 more precise searches by indexing the results.  Search indexes can
-affect the results of searches with a queryFactory in `findRelations`,
-`findValues`, and the soon-to-be-introduced `canFind`, but they do not
-affect `findRelationChains`.
+affect the results of searches with a queryFactory in ``findRelations``,
+``findValues``, and the soon-to-be-introduced ``canFind``, but they do not
+affect ``findRelationChains``.
 
 The zc.relation package currently includes two kinds of search indexes,
 one for indexing relation searches and one for indexing value searches.
@@ -523,7 +698,7 @@
 
     >>> import zc.relation.searchindex
     >>> catalog.addSearchIndex(
-    ...     zc.relation.searchindex.TransposingTransitive(
+    ...     zc.relation.searchindex.TransposingTransitiveMembership(
     ...         'supervisor', zc.relation.RELATION))
 
 The ``zc.relation.RELATION`` describes how to walk back up the chain. Search
@@ -571,7 +746,7 @@
     >>> a.supervisor = z
 
 Now we have a cycle.  Of course, we have not yet told the catalog about it.
-`index` can be used both to reindex Alice and index Zane.
+``index`` can be used both to reindex Alice and index Zane.
 
     >>> catalog.index(a)
     >>> catalog.index(z)
@@ -590,7 +765,7 @@
     ...     'supervisor', {zc.relation.RELATION: 'Frank'}))
     ['Chuck', 'Alice', 'Zane', 'Betty']
 
-Paths returned by `findRelationChains` are marked with special interfaces, and
+Paths returned by ``findRelationChains`` are marked with special interfaces, and
 special metadata, to show the chain.
 
     >>> res = list(catalog.findRelationChains({zc.relation.RELATION: 'Frank'}))
@@ -636,16 +811,16 @@
     >>> sorted(catalog.findRelationTokens({'supervisor': 'Betty'}))
     ['Diane', 'Edgar', 'Howie']
 
-`canFind`
----------
+``canFind``
+-----------
 
+We've reached the last search method: ``canFind``.  We've gotten values and
+We're to the last search method: ``canFind``.  We've gotten values and
 relations, but what if you simply want to know if there is any
 connection at all?  For instance, is Alice a supervisor of Howie? Is
-Chuck?  To answer these questions, you can use the `canFind` method
-combined with the `targetQuery` search argument.
+Chuck?  To answer these questions, you can use the ``canFind`` method
+combined with the ``targetQuery`` search argument.
 
-The `canFind` method takes the same arguments as findRelations.  However,
+The ``canFind`` method takes the same arguments as findRelations.  However,
 it simply returns a boolean about whether the search has any results.  This
 is a convenience that also allows some extra optimizations.
 
@@ -659,14 +834,14 @@
     >>> catalog.canFind({'supervisor': 'Howie'})
     False
 
-What about...Zane (not an employee)?
+What about...Zane (no longer an employee)?
 
     >>> catalog.canFind({'supervisor': 'Zane'})
     False
 
 If we want to know if Alice or Chuck supervise Howie, then we want to specify
 characteristics of two points on a path.  To ask a question about the other
-end of a path, use `targetQuery`.
+end of a path, use ``targetQuery``.
 
 Is Alice a supervisor of Howie?
 
@@ -692,10 +867,599 @@
     ...                 targetQuery={'supervisor': 'Chuck'})
     False
 
-(Note that, if your relations describe a hierarchy, searching up a
-hierarchy is usually more efficient, so the second pair of questions is
+(Note that, if your relations describe a hierarchy, searching up a hierarchy is
+usually more efficient than searching down, so the second pair of questions is
 generally preferable to the first in that case.)
 
+Working with More Complex Relations
+===================================
+
+So far, our examples have used a simple relation, in which the indexed object
+is one end of the relation, and the indexed value on the object is the other.
+This example has let us look at all of the basic zc.relation catalog
+functionality.
+
+As mentioned in the introduction, though, the catalog supports, and was
+designed for, more complex relations.  This section will quickly examine a
+few examples of other uses.
+
+In this section, we will see several examples of ideas mentioned above but not
+yet demonstrated.
+
+- We can use interface attributes (values or callables) to define value
+  indexes.
+
+- Using interface attributes will cause an attempt to adapt the relation if it
+  does not already provide the interface.
+
+- We can use the ``multiple`` argument when defining a value index to indicate
+  that the indexed value is a collection.
+
+- We can use the ``name`` argument when defining a value index to specify the
+  name to be used in queries, rather than relying on the name of the interface
+  attribute or callable.
+
+- The ``family`` argument in instantiating the catalog lets you change the
+  default btree family for relations and value indexes from BTrees.family32.IF
+  to BTrees.family64.IF.
+
+Extrinsic Two-Way Relationship
+------------------------------
+
+A simple variation of our current story is this: what if the indexed relation
+were between two other objects--that is, what if the relation were extrinsic to
+both participants?
+
+Let's imagine we have relations that show biological parentage.  We'll want
+a "Person" and a "Parentage" relation. We'll define an interface for IParentage
+so we can see how using an interface to define a value index works.
+
+    >>> class Person(object):
+    ...     def __init__(self, name):
+    ...         self.name = name
+    ...     def __repr__(self):
+    ...         return '<Person %r>' % (self.name,)
+    ...
+    >>> import zope.interface
+    >>> class IParentage(zope.interface.Interface):
+    ...     child = zope.interface.Attribute('the child')
+    ...     parents = zope.interface.Attribute('the parents')
+    ...
+    >>> class Parentage(object):
+    ...     zope.interface.implements(IParentage)
+    ...     def __init__(self, child, parent1, parent2):
+    ...         self.child = child
+    ...         self.parents = (parent1, parent2)
+    ...
+
+Now we'll define the dumpers and loaders and then the catalog.  Notice that
+we are relying on a pattern: the dump must be called before the load.
+
+    >>> _people = {}
+    >>> _relations = {}
+    >>> def dumpPeople(obj, catalog, cache):
+    ...     if _people.setdefault(obj.name, obj) is not obj:
+    ...         raise ValueError('we are assuming names are unique')
+    ...     return obj.name
+    ...
+    >>> def loadPeople(token, catalog, cache):
+    ...     return _people[token]
+    ...
+    >>> def dumpRelations(obj, catalog, cache):
+    ...     if _relations.setdefault(id(obj), obj) is not obj:
+    ...         raise ValueError('huh?')
+    ...     return id(obj)
+    ...
+    >>> def loadRelations(token, catalog, cache):
+    ...     return _relations[token]
+    ...
+    >>> catalog = zc.relation.catalog.Catalog(dumpRelations, loadRelations)
+    >>> catalog.addValueIndex(IParentage['child'], dumpPeople, loadPeople,
+    ...                       btree=BTrees.family32.OO)
+    >>> catalog.addValueIndex(IParentage['parents'], dumpPeople, loadPeople,
+    ...                       btree=BTrees.family32.OO, multiple=True,
+    ...                       name='parent')
+    >>> catalog.addDefaultQueryFactory(
+    ...     zc.relation.queryfactory.TransposingTransitive(
+    ...         'child', 'parent'))
+
+Now we have a catalog fully set up.  Let's add some relations.
+
+    >>> a = Person('Alice')
+    >>> b = Person('Betty')
+    >>> c = Person('Charles')
+    >>> d = Person('Donald')
+    >>> e = Person('Eugenia')
+    >>> f = Person('Fred')
+    >>> g = Person('Gertrude')
+    >>> h = Person('Harry')
+    >>> i = Person('Iphigenia')
+    >>> j = Person('Jacob')
+    >>> k = Person('Karyn')
+    >>> l = Person('Lee')
+
+    >>> r1 = Parentage(child=j, parent1=k, parent2=l)
+    >>> r2 = Parentage(child=g, parent1=i, parent2=j)
+    >>> r3 = Parentage(child=f, parent1=g, parent2=h)
+    >>> r4 = Parentage(child=e, parent1=g, parent2=h)
+    >>> r5 = Parentage(child=b, parent1=e, parent2=d)
+    >>> r6 = Parentage(child=a, parent1=e, parent2=c)
+
+Here's that in one of our hierarchy diagrams.
+
+::
+
+    Karyn   Lee
+         \ /
+        Jacob   Iphigenia
+             \ /
+            Gertrude    Harry
+                    \  /
+                 /-------\
+             Fred        Eugenia
+               Donald   /     \    Charles
+                     \ /       \  /
+                    Betty      Alice
+
+Now we can index the relations, and ask some questions.
+
+    >>> for r in (r1, r2, r3, r4, r5, r6):
+    ...     catalog.index(r)
+    >>> query = catalog.tokenizeQuery
+    >>> sorted(catalog.findValueTokens(
+    ...     'parent', query(child=a), maxDepth=1))
+    ['Charles', 'Eugenia']
+    >>> sorted(catalog.findValueTokens('parent', query(child=g)))
+    ['Iphigenia', 'Jacob', 'Karyn', 'Lee']
+    >>> sorted(catalog.findValueTokens(
+    ...     'child', query(parent=h), maxDepth=1))
+    ['Eugenia', 'Fred']
+    >>> sorted(catalog.findValueTokens('child', query(parent=h)))
+    ['Alice', 'Betty', 'Eugenia', 'Fred']
+    >>> catalog.canFind(query(parent=h), targetQuery=query(child=d))
+    False
+    >>> catalog.canFind(query(parent=l), targetQuery=query(child=b))
+    True
+
+Multi-Way Relations
+-------------------
+
+The previous example quickly showed how to set the catalog up for a completely
+extrinsic two-way relation.  The same pattern can be extended for N-way
+relations.  For example, consider a four way relation in the form of
+SUBJECTS PREDICATE OBJECTS [in CONTEXT].  For instance, we might
+want to say "(joe,) SELLS (doughnuts, coffee) in corner_store", where "(joe,)"
+is the collection of subjects, "SELLS" is the predicate, "(doughnuts, coffee)"
+is the collection of objects, and "corner_store" is the optional context.
+
+For this last example, we'll integrate two components we haven't seen examples
+of here before: the ZODB and adaptation.
+
+Our example ZODB approach uses OIDs as the tokens. This might be OK in some
+cases, if you will never support multiple databases and you don't need an
+abstraction layer so that a different object can have the same identifier.
+
+    >>> import persistent
+    >>> import struct
+    >>> class Demo(persistent.Persistent):
+    ...     def __init__(self, name):
+    ...         self.name = name
+    ...     def __repr__(self):
+    ...         return '<Demo instance %r>' % (self.name,)
+    ...
+    >>> class IRelation(zope.interface.Interface):
+    ...     subjects = zope.interface.Attribute('subjects')
+    ...     predicate = zope.interface.Attribute('predicate')
+    ...     objects = zope.interface.Attribute('objects')
+    ...
+    >>> class IContextual(zope.interface.Interface):
+    ...     def getContext():
+    ...         'return context'
+    ...     def setContext(value):
+    ...         'set context'
+    ...
+    >>> class Contextual(object):
+    ...     zope.interface.implements(IContextual)
+    ...     _context = None
+    ...     def getContext(self):
+    ...         return self._context
+    ...     def setContext(self, value):
+    ...         self._context = value
+    ...
+    >>> class Relation(persistent.Persistent):
+    ...     zope.interface.implements(IRelation)
+    ...     def __init__(self, subjects, predicate, objects):
+    ...         self.subjects = subjects
+    ...         self.predicate = predicate
+    ...         self.objects = objects
+    ...         self._contextual = Contextual()
+    ...         
+    ...     def __conform__(self, iface):
+    ...         if iface is IContextual:
+    ...             return self._contextual
+    ...
+
+(When using zope.component, the __conform__ would normally be unnecessary;
+however, this package does not depend on zope.component.)
+
+    >>> def dumpPersistent(obj, catalog, cache):
+    ...     if obj._p_jar is None:
+    ...         catalog._p_jar.add(obj) # assumes something else places it
+    ...     return struct.unpack('<q', obj._p_oid)[0]
+    ...
+    >>> def loadPersistent(token, catalog, cache):
+    ...     return catalog._p_jar.get(struct.pack('<q', token))
+    ...
+
+    >>> from ZODB.tests.util import DB
+    >>> db = DB()
+    >>> conn = db.open()
+    >>> root = conn.root()
+    >>> catalog = root['catalog'] = zc.relation.catalog.Catalog(
+    ...     dumpPersistent, loadPersistent, family=BTrees.family64)
+    >>> catalog.addValueIndex(IRelation['subjects'],
+    ...     dumpPersistent, loadPersistent, multiple=True, name='subject')
+    >>> catalog.addValueIndex(IRelation['objects'],
+    ...     dumpPersistent, loadPersistent, multiple=True, name='object')
+    >>> catalog.addValueIndex(IRelation['predicate'], btree=BTrees.family32.OO)
+    >>> catalog.addValueIndex(IContextual['getContext'],
+    ...     dumpPersistent, loadPersistent, name='context')
+    >>> import transaction
+    >>> transaction.commit()
+
+The dumpPersistent and loadPersistent functions are a bit of a toy, as warned above.
+Also, while our predicate will be stored as a string, some programmers may
+prefer to have a dump in such a case verify that the string has been explicitly
+registered in some way, to prevent typos.  Obviously, we are not bothering
+with this for our example.
+
+We make some objects, and then we make some relations with those objects and
+index them.
+
+    >>> joe = root['joe'] = Demo('joe')
+    >>> sara = root['sara'] = Demo('sara')
+    >>> jack = root['jack'] = Demo('jack')
+    >>> ann = root['ann'] = Demo('ann')
+    >>> doughnuts = root['doughnuts'] = Demo('doughnuts')
+    >>> coffee = root['coffee'] = Demo('coffee')
+    >>> muffins = root['muffins'] = Demo('muffins')
+    >>> cookies = root['cookies'] = Demo('cookies')
+    >>> newspaper = root['newspaper'] = Demo('newspaper')
+    >>> corner_store = root['corner_store'] = Demo('corner_store')
+    >>> bistro = root['bistro'] = Demo('bistro')
+    >>> bakery = root['bakery'] = Demo('bakery')
+
+    >>> SELLS = 'SELLS'
+    >>> BUYS = 'BUYS'
+    >>> OBSERVES = 'OBSERVES'
+
+    >>> rel1 = root['rel1'] = Relation((joe,), SELLS, (doughnuts, coffee))
+    >>> IContextual(rel1).setContext(corner_store)
+    >>> rel2 = root['rel2'] = Relation((sara, jack), SELLS,
+    ...                                (muffins, doughnuts, cookies))
+    >>> IContextual(rel2).setContext(bakery)
+    >>> rel3 = root['rel3'] = Relation((ann,), BUYS, (doughnuts,))
+    >>> rel4 = root['rel4'] = Relation((sara,), BUYS, (bistro,))
+    
+    >>> for r in (rel1, rel2, rel3, rel4):
+    ...     catalog.index(r)
+    ...
+
+Now we can ask a simple question.  Where do they sell doughnuts?
+
+    >>> query = catalog.tokenizeQuery
+    >>> sorted(catalog.findValues(
+    ...     'context',
+    ...     (query(predicate=SELLS, object=doughnuts))),
+    ...     key=lambda ob: ob.name)
+    [<Demo instance 'bakery'>, <Demo instance 'corner_store'>]
+
+Hopefully these examples give you further ideas on how you can use this tool.
+
+Additional Functionality
+========================
+
+This section introduces peripheral functionality.  We will learn the following.
+
+- Listeners can be registered in the catalog.  They are alerted when a relation
+  is added, modified, or removed; and when the catalog is cleared or copied
+  (see below).
+
+- The ``clear`` method clears the relations in the catalog.
+
+- The ``copy`` method makes a copy of the current catalog by copying internal
+  data structures, rather than reindexing the relations, which can be a
+  significant optimization opportunity.  This copies value indexes and search
+  indexes; and gives listeners an opportunity to specify what, if anything,
+  should be included in the new copy.
+
+- The ``ignoreSearchIndex`` argument to the five pertinent search methods
+  causes the search to ignore search indexes, even if there is an appropriate
+  one.
+
+Listeners
+---------
+
+A variety of potential clients may want to be alerted when the catalog changes.
+zc.relation does not depend on zope.event, so listeners may be registered for
+various changes.  Let's make a quick demo listener.  The ``additions`` and
+``removals`` arguments are dictionaries of {value name: iterable of added or
+removed value tokens}.
+
+    >>> def pchange(d):
+    ...     pprint.pprint(dict(
+    ...         (k, v is not None and set(v) or v) for k, v in d.items()))
+    >>> class DemoListener(persistent.Persistent):
+    ...     zope.interface.implements(zc.relation.interfaces.IListener)
+    ...     def relationAdded(self, token, catalog, additions):
+    ...         print ('a relation (token %r) was added to %r '
+    ...                'with these values:' % (token, catalog))
+    ...         pchange(additions)
+    ...     def relationModified(self, token, catalog, additions, removals):
+    ...         print ('a relation (token %r) in %r was modified '
+    ...                'with these additions:' % (token, catalog))
+    ...         pchange(additions)
+    ...         print 'and these removals:'
+    ...         pchange(removals)
+    ...     def relationRemoved(self, token, catalog, removals):
+    ...         print ('a relation (token %r) was removed from %r '
+    ...                'with these values:' % (token, catalog))
+    ...         pchange(removals)
+    ...     def sourceCleared(self, catalog):
+    ...         print 'catalog %r had all relations unindexed' % (catalog,)
+    ...     def sourceAdded(self, catalog):
+    ...         print 'now listening to catalog %r' % (catalog,)
+    ...     def sourceRemoved(self, catalog):
+    ...         print 'no longer listening to catalog %r' % (catalog,)
+    ...     def sourceCopied(self, original, copy):
+    ...         print 'catalog %r made a copy %r' % (original, copy)
+    ...         copy.addListener(self)
+    ...
+
+Listeners can be installed multiple times.
+
+Listeners can be added as persistent weak references, so that, if they are
+deleted elsewhere, a ZODB pack will not consider the reference in the catalog
+to be something preventing garbage collection.
+
+We'll install one of these demo listeners into our new catalog as a
+normal reference, the default behavior.  Then we'll show some example messages
+sent to the demo listener.
+
+    >>> listener = DemoListener()
+    >>> catalog.addListener(listener) # doctest: +ELLIPSIS
+    now listening to catalog <zc.relation.catalog.Catalog object at ...>
+    >>> rel5 = root['rel5'] = Relation((ann,), OBSERVES, (newspaper,))
+    >>> catalog.index(rel5) # doctest: +ELLIPSIS
+    a relation (token ...) was added to <...Catalog...> with these values:
+    {'context': None,
+     'object': set([...]),
+     'predicate': set(['OBSERVES']),
+     'subject': set([...])}
+    >>> rel5.subjects = (jack,)
+    >>> IContextual(rel5).setContext(bistro)
+    >>> catalog.index(rel5) # doctest: +ELLIPSIS
+    a relation (token ...) in ...Catalog... was modified with these additions:
+    {'context': set([...]),
+     'subject': set([...])}
+    and these removals:
+    {'subject': set([...])}
+    >>> catalog.unindex(rel5) # doctest: +ELLIPSIS
+    a relation (token ...) was removed from <...Catalog...> with these values:
+    {'context': set([...]),
+     'object': set([...]),
+     'predicate': set(['OBSERVES']),
+     'subject': set([...])}
+
+    >>> catalog.removeListener(listener) # doctest: +ELLIPSIS
+    no longer listening to catalog <...Catalog...>
+    >>> catalog.index(rel5) # doctest: +ELLIPSIS
+
+The only two methods not shown by those examples are ``sourceCleared`` and
+``sourceCopied``.  We'll get to those very soon below.
+
+The ``clear`` Method
+--------------------
+
+The ``clear`` method simply unindexes all relations from a catalog.  Installed
+listeners have ``sourceCleared`` called.
+
+    >>> len(catalog)
+    5
+
+    >>> catalog.addListener(listener) # doctest: +ELLIPSIS
+    now listening to catalog <zc.relation.catalog.Catalog object at ...>
+
+    >>> catalog.clear() # doctest: +ELLIPSIS
+    catalog <...Catalog...> had all relations unindexed
+
+    >>> len(catalog)
+    0
+    >>> sorted(catalog.findValues(
+    ...     'context',
+    ...     (query(predicate=SELLS, object=doughnuts))),
+    ...     key=lambda ob: ob.name)
+    []
+
+The ``copy`` Method
+-------------------
+
+Sometimes you may want to copy a relation catalog.  One way of doing this is
+to create a new catalog, set it up like the current one, and then reindex
+all the same relations.  This is unnecessarily slow for programmer and
+computer.  The ``copy`` method makes a new catalog with the same corpus of
+indexed relations by copying internal data structures.
+
+Search indexes are requested to make new copies of themselves for the new
+catalog; and listeners are given an opportunity to react as desired to the new
+copy, including installing themselves, and/or another object of their choosing
+as a listener.
+
+Let's make a copy of a populated index with a search index and a listener.
+Notice in our listener that ``sourceCopied`` adds itself as a listener to the
+new copy. This is done at the very end of the ``copy`` process.
+
+    >>> for r in (rel1, rel2, rel3, rel4, rel5):
+    ...     catalog.index(r)
+    ... # doctest: +ELLIPSIS
+    a relation ... was added...
+    a relation ... was added...
+    a relation ... was added...
+    a relation ... was added...
+    a relation ... was added...
+    >>> BEGAT = 'BEGAT'
+    >>> rel6 = root['rel6'] = Relation((jack, ann), BEGAT, (sara,))
+    >>> henry = root['henry'] = Demo('henry')
+    >>> rel7 = root['rel7'] = Relation((sara, joe), BEGAT, (henry,))
+    >>> catalog.index(rel6) # doctest: +ELLIPSIS
+    a relation (token ...) was added to <...Catalog...> with these values:
+    {'context': None,
+     'object': set([...]),
+     'predicate': set(['BEGAT']),
+     'subject': set([..., ...])}
+    >>> catalog.index(rel7) # doctest: +ELLIPSIS
+    a relation (token ...) was added to <...Catalog...> with these values:
+    {'context': None,
+     'object': set([...]),
+     'predicate': set(['BEGAT']),
+     'subject': set([..., ...])}
+    >>> catalog.addDefaultQueryFactory(
+    ...     zc.relation.queryfactory.TransposingTransitive(
+    ...         'subject', 'object', {'predicate': BEGAT}))
+    ...
+    >>> list(catalog.findValues(
+    ...     'object', query(subject=jack, predicate=BEGAT)))
+    [<Demo instance 'sara'>, <Demo instance 'henry'>]
+    >>> catalog.addSearchIndex(
+    ...     zc.relation.searchindex.TransposingTransitiveMembership(
+    ...         'subject', 'object', static={'predicate': BEGAT}))
+    >>> sorted(
+    ...     catalog.findValues(
+    ...         'object', query(subject=jack, predicate=BEGAT)),
+    ...     key=lambda o: o.name)
+    [<Demo instance 'henry'>, <Demo instance 'sara'>]
+
+    >>> newcat = root['newcat'] = catalog.copy() # doctest: +ELLIPSIS
+    catalog <...Catalog...> made a copy <...Catalog...>
+    now listening to catalog <...Catalog...>
+    >>> transaction.commit()
+
+Now the copy has its own copies of internal data structures and of the
+search index.  For example, let's modify the relations and add a new one to the
+copy.
+
+    >>> mary = root['mary'] = Demo('mary')
+    >>> buffy = root['buffy'] = Demo('buffy')
+    >>> zack = root['zack'] = Demo('zack')
+    >>> rel7.objects += (mary,)
+    >>> rel8 = root['rel8'] = Relation((henry, buffy), BEGAT, (zack,))
+    >>> newcat.index(rel7) # doctest: +ELLIPSIS
+    a relation (token ...) in ...Catalog... was modified with these additions:
+    {'object': set([...])}
+    and these removals:
+    {}
+    >>> newcat.index(rel8) # doctest: +ELLIPSIS
+    a relation (token ...) was added to ...Catalog... with these values:
+    {'context': None,
+     'object': set([...]),
+     'predicate': set(['BEGAT']),
+     'subject': set([..., ...])}
+    >>> len(newcat)
+    8
+    >>> sorted(
+    ...     newcat.findValues(
+    ...         'object', query(subject=jack, predicate=BEGAT)),
+    ...     key=lambda o: o.name) # doctest: +NORMALIZE_WHITESPACE
+    [<Demo instance 'henry'>, <Demo instance 'mary'>, <Demo instance 'sara'>,
+     <Demo instance 'zack'>]
+    >>> sorted(
+    ...     newcat.findValues(
+    ...         'object', query(subject=sara)),
+    ...     key=lambda o: o.name) # doctest: +NORMALIZE_WHITESPACE
+    [<Demo instance 'bistro'>, <Demo instance 'cookies'>,
+    <Demo instance 'doughnuts'>, <Demo instance 'henry'>,
+    <Demo instance 'mary'>, <Demo instance 'muffins'>]
+
+The original catalog is not modified.
+
+    >>> len(catalog)
+    7
+    >>> sorted(
+    ...     catalog.findValues(
+    ...         'object', query(subject=jack, predicate=BEGAT)),
+    ...     key=lambda o: o.name)
+    [<Demo instance 'henry'>, <Demo instance 'sara'>]
+    >>> sorted(
+    ...     catalog.findValues(
+    ...         'object', query(subject=sara)),
+    ...     key=lambda o: o.name) # doctest: +NORMALIZE_WHITESPACE
+    [<Demo instance 'bistro'>, <Demo instance 'cookies'>,
+     <Demo instance 'doughnuts'>, <Demo instance 'henry'>,
+     <Demo instance 'muffins'>]
+
+The ``ignoreSearchIndex`` argument
+----------------------------------
+
+The five methods that can use search indexes, ``findValues``,
+``findValueTokens``, ``findRelations``, ``findRelationTokens``, and
+``canFind``, can be explicitly requested to ignore any pertinent search index
+using the ``ignoreSearchIndex`` argument.
+
+We can see this easily with the token-related methods: the search index result
+will be a BTree set, while without the search index the result will be a
+generator.
+
+    >>> res1 = newcat.findValueTokens(
+    ...     'object', query(subject=jack, predicate=BEGAT))
+    >>> res1 # doctest: +ELLIPSIS
+    LFSet([..., ..., ..., ...])
+    >>> res2 = newcat.findValueTokens(
+    ...     'object', query(subject=jack, predicate=BEGAT),
+    ...     ignoreSearchIndex=True)
+    >>> res2 # doctest: +ELLIPSIS
+    <generator object at 0x...>
+    >>> sorted(res2) == list(res1)
+    True
+
+    >>> res1 = newcat.findRelationTokens(
+    ...     query(subject=jack, predicate=BEGAT))
+    >>> res1 # doctest: +ELLIPSIS
+    LFSet([..., ..., ...])
+    >>> res2 = newcat.findRelationTokens(
+    ...     query(subject=jack, predicate=BEGAT), ignoreSearchIndex=True)
+    >>> res2 # doctest: +ELLIPSIS
+    <generator object at 0x...>
+    >>> sorted(res2) == list(res1)
+    True
+
+We can see that the other methods take the argument, but the results look the
+same as usual.
+
+    >>> res = newcat.findValues(
+    ...     'object', query(subject=jack, predicate=BEGAT),
+    ...     ignoreSearchIndex=True)
+    >>> res # doctest: +ELLIPSIS
+    <generator object at 0x...>
+    >>> list(res) == list(newcat.resolveValueTokens(newcat.findValueTokens(
+    ...     'object', query(subject=jack, predicate=BEGAT),
+    ...     ignoreSearchIndex=True), 'object'))
+    True
+
+    >>> res = newcat.findRelations(
+    ...     query(subject=jack, predicate=BEGAT),
+    ...     ignoreSearchIndex=True)
+    >>> res # doctest: +ELLIPSIS
+    <generator object at 0x...>
+    >>> list(res) == list(newcat.resolveRelationTokens(
+    ...     newcat.findRelationTokens(
+    ...         query(subject=jack, predicate=BEGAT),
+    ...         ignoreSearchIndex=True)))
+    True
+
+    >>> newcat.canFind(
+    ...     query(subject=jack, predicate=BEGAT), ignoreSearchIndex=True)
+    True
+
 Conclusion
 ==========
 
@@ -705,62 +1469,132 @@
 That brings us to the end of our introductory example.  Let's review, and
 then look at where you can go from here.
 
-- The relation catalog indexes relations.  The relations can be one-way,
-  as we've seen here, with the employee relation pointing to the supervisor.
-  They can also be two-way, three-way, or N-way, as long as you tell the
-  catalog to index the different values.
+* Relations are objects with indexed values.
 
-- Relations and their values are stored as tokens.  Integers are the most
-  efficient tokens, but others can work find too.  The index has methods to
-  help you work with tokens, but we did not explore them here.
+* The relation catalog indexes relations. The relations can be one-way,
+  two-way, three-way, or N-way, as long as you tell the catalog to index the
+  different values.
 
-- Relations are indexed with `index`.  We didn't look at this, but relations
-  do not have to have all indexed values, which means they can be a
-  heterogeneous set of relations, allowing indexing of interesting data
-  structures.
+* Creating a catalog:
+    
+    - Relations and their values are stored in the catalog as tokens: unique
+      identifiers that you can resolve back to the original value. Integers are
+      the most efficient tokens, but others can work fine too.
+    
+    - Token type determines the BTree module needed.
+    
+        - If the tokens are 32-bit ints, choose BTrees.family32.II,
+          BTrees.family32.IF or BTrees.family32.IO.
+        
+        - If the tokens are 64-bit ints, choose BTrees.family64.II,
+          BTrees.family64.IF or BTrees.family64.IO.
+        
+        - If they are anything else, choose BTrees.family32.OI,
+          BTrees.family64.OI, or BTrees.family32.OO (or
+          BTrees.family64.OO--they are the same).
+        
+      Within these rules, the choice is somewhat arbitrary unless you plan to
+      merge these results with those of another source that is using a
+      particular BTree module. BTree set operations only work within the same
+      module, so you must match module to module.
 
-- You add value indexes to relation catalogs to be able to search.  Values
-  can be identified with callables (which we saw) or interface elements
-  (which we did not see).
+    - The ``family`` argument in instantiating the catalog lets you change the
+      default btree family for relations and value indexes from
+      BTrees.family32.IF to BTrees.family64.IF.
 
-- As we've seen here, relations are assumed to be between single values.
-  However, they do not have to be, as can be seen elsewhere (hint: use
-  the `multiple` argument in `addValueIndex`).
+    - You must define your own functions for tokenizing and resolving tokens.
+      These functions are registered with the catalog for the relations and for
+      each of their value indexes.
 
-- You search transitively by using a query factory.  The
-  zc.relation.queryfactory.TransposingTransitive is a good common case
-  factory that lets you walk up and down a hierarchy.  Query factories can
-  do other tricks too, which we did not see.
+    - You add value indexes to relation catalogs to be able to search.  Values
+      can be identified to the catalog with callables or interface elements.
+    
+        - Using interface attributes will cause an attempt to adapt the
+          relation if it does not already provide the interface.
+        
+        - We can use the ``multiple`` argument when defining a value index to
+          indicate that the indexed value is a collection.  This defaults to
+          False.
+        
+        - We can use the ``name`` argument when defining a value index to
+          specify the name to be used in queries, rather than relying on the
+          name of the interface attribute or callable.
 
-- You can set up searches indexes to speed up specific transitive searches.
+    - You can set up search indexes to speed up specific searches, usually
+      transitive.
 
-- We looked at the primary search methods that return objects as opposed to
-  tokens.
-  
-    * `findRelations` returns relations that match the search.
+    - Listeners can be registered in the catalog. They are alerted when a
+      relation is added, modified, or removed; and when the catalog is cleared
+      or copied.
+
+* Catalog Management:
+
+    - Relations are indexed with ``index(relation)``, and removed from the catalog with
+      ``unindex(relation)``. ``index_doc(relation_token, relation)`` and
+      ``unindex_doc(relation_token)`` also work.
+
+    - The ``clear`` method clears the relations in the catalog.
     
-    * `findValues` returns values for the relations that match the search.
+    - The ``copy`` method makes a copy of the current catalog by copying internal
+      data structures, rather than reindexing the relations, which can be a
+      significant optimization opportunity.  This copies value indexes and search
+      indexes; and gives listeners an opportunity to specify what, if anything,
+      should be included in the new copy.
+
+* Searching a catalog:
+
+    - Queries to the relation catalog are formed with dicts.
     
-    * `findRelationChains` returns the transitive paths that match the search.
+    - Query keys are the names of the indexes you want to search, or, for the
+      special case of precise relations, the zc.relation.RELATION constant.
     
-    * `canFind` returns a boolean about whether anything matches the search.
-  
-  We also discussed the fact that users who want to get tokens back from
-  searches can.  We did not give much of an example of this.  The parallel
-  methods are `findRelationTokens`, `findValueTokens`, and
-  `findRelationTokenChains`.
+    - Query values are the tokens of the results you want to match; or None,
+      indicating relations that have None as a value (or an empty collection, if it
+      is a multiple). Search values can use zc.relation.catalog.any(\*args) or
+      zc.relation.catalog.Any(args) to specify multiple (non-None) results to match
+      for a given key.
 
-- Queries are formed with dicts. The keys are the names of the indexes you want
-  to search, or, for the special case of precise relations,
-  zc.relation.RELATION. The values are the tokens of the results you want to
-  match; or None, indicating relations that have None as a value (or no values,
-  if it is a multiple). Search values can use zc.relation.catalog.any or
-  zc.relation.catalog.Any to specify multiple (non-None) results to match for a
-  given key.
+    - The index has a variety of methods to help you work with tokens.
+      ``tokenizeQuery`` is typically the most used, though others are available.
+    
+    - To find relations that match a query, use ``findRelations`` or
+      ``findRelationTokens``.
+    
+    - To find values that match a query, use ``findValues`` or ``findValueTokens``.
+    
+    - You search transitively by using a query factory. The
+      zc.relation.queryfactory.TransposingTransitive is a good common case factory
+      that lets you walk up and down a hierarchy. A query factory can be passed in
+      as an argument to search methods as a ``queryFactory``, or installed as a
+      default behavior using ``addDefaultQueryFactory``.
+    
+    - To find how a query is related, use ``findRelationChains`` or
+      ``findRelationTokenChains``.
+    
+    - To find out if a query is related, use ``canFind``.
+    
+    - Circular transitive relations are handled to prevent infinite loops. They
+      are identified in ``findRelationChains`` and ``findRelationTokenChains`` with
+      a ``zc.relation.interfaces.ICircularRelationPath`` marker interface.
 
-As you can tell by the holes we mentioned in the overview, there's more
-to cover.  Hopefully, this will be enough to get your feet wet, though,
-and maybe start to use the catalog.
+    - Search methods share the following arguments:
+    
+      * maxDepth, limiting the transitive depth for searches;
+      
+      * filter, allowing code to filter transitive paths;
+      
+      * targetQuery, allowing a query to filter transitive paths on the basis of
+        the endpoint;
+      
+      * targetFilter, allowing code to filter transitive paths on the basis of the
+        endpoint; and
+    
+      * queryFactory, mentioned above.
+      
+      In addition, the ``ignoreSearchIndex`` argument to ``findRelations``,
+      ``findRelationTokens``, ``findValues``, ``findValueTokens``, and ``canFind``
+      causes the search to ignore search indexes, even if there is an appropriate
+      one.
 
 Next Steps
 ----------
@@ -787,7 +1621,7 @@
     The contract, for nuts and bolts.
 
 Finally, the truly die-hard might also be interested in the timeit
-directory, which holds scripts I ran to test assumptions and learn.
+directory, which holds scripts used to test assumptions and learn.
 
 .. ......... ..
 .. FOOTNOTES ..
@@ -801,7 +1635,7 @@
     True
 
 .. [#legacy] Old instances of zc.relationship indexes, which in the newest
-    version subclass a zc.relationship Catalog, used to have a dict in an
+    version subclass a zc.relation Catalog, used to have a dict in an
     internal data structure.  We specify that here so that the code that
     converts the dict to an OOBTree can have a chance to run.
 
@@ -976,7 +1810,7 @@
        |
      Howie
 
-     to Galyn
+    to Galyn
 
     ::
 

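Note on the README changes above: the following is a minimal sketch, in plain
Python and without zc.relation, of the transitive parentage search that the
"Extrinsic Two-Way Relationship" example demonstrates.  The names PARENTS,
find_ancestors, and can_find are hypothetical and are not part of the
zc.relation API; the real catalog works on tokens stored in BTrees rather than
a dict, so this only illustrates the idea of walking "child" to "parent" with
an optional depth limit.

```python
from collections import deque

# child -> parents, mirroring relations r1..r6 in the README example.
# This dict is an illustrative stand-in for the catalog's value indexes.
PARENTS = {
    'Jacob': ('Karyn', 'Lee'),
    'Gertrude': ('Iphigenia', 'Jacob'),
    'Fred': ('Gertrude', 'Harry'),
    'Eugenia': ('Gertrude', 'Harry'),
    'Betty': ('Eugenia', 'Donald'),
    'Alice': ('Eugenia', 'Charles'),
}


def find_ancestors(child, max_depth=None):
    """Breadth-first walk up the parent mapping (hypothetical helper).

    max_depth=1 returns only direct parents; None means no limit.
    """
    found = set()
    queue = deque([(child, 0)])
    while queue:
        name, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand beyond the requested depth
        for parent in PARENTS.get(name, ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                queue.append((parent, depth + 1))
    return found


def can_find(ancestor, descendant):
    """True if ``ancestor`` is reachable by walking up from ``descendant``."""
    return ancestor in find_ancestors(descendant)
```

Calling ``sorted(find_ancestors('Alice', max_depth=1))`` corresponds to the
README's ``findValueTokens('parent', query(child=a), maxDepth=1)`` query, and
``can_find('Lee', 'Betty')`` corresponds to the ``canFind`` call with a
``targetQuery``.
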
Modified: zc.relation/trunk/src/zc/relation/catalog.py
===================================================================
--- zc.relation/trunk/src/zc/relation/catalog.py	2008-04-22 19:49:08 UTC (rev 85616)
+++ zc.relation/trunk/src/zc/relation/catalog.py	2008-04-22 20:34:37 UTC (rev 85617)
@@ -117,7 +117,12 @@
     _listeners = _queryFactories = _searchIndexes = ()
     _searchIndexMatches = None
 
-    def __init__(self, dumpRel, loadRel, relFamily=None, family=None):
+    def __init__(self, dump, load, btree=None, family=None):
+        # Instantiate with instructions on how to dump and load relations,
+        # and optionally the btree module for relations.  ``family`` should
+        # be either BTrees.family32 or BTrees.family64; it is ignored when
+        # ``btree`` is given for relations and for every value index.
+        # (Otherwise, btree defaults to family.IF wherever it is omitted.)
         if family is not None:
             self.family = family
         else:
@@ -126,11 +131,11 @@
         # held mappings are objtoken to (relcount, relset)
         self._EMPTY_name_TO_relcount_relset = family.OO.BTree()
         self._reltoken_name_TO_objtokenset = family.OO.BTree()
-        if relFamily is None:
-            relFamily = family.IF
-        self._relTools = getModuleTools(relFamily)
-        self._relTools['load'] = loadRel
-        self._relTools['dump'] = dumpRel
+        if btree is None:
+            btree = family.IF
+        self._relTools = getModuleTools(btree)
+        self._relTools['load'] = load
+        self._relTools['dump'] = dump
         self._relLength = BTrees.Length.Length()
         self._relTokens = self._relTools['TreeSet']()
         # private; only mutate via indexValue and unindexValue
@@ -200,11 +205,6 @@
         res._attrs = self.family.OO.Bucket(
             [(k, self.family.OO.Bucket(v)) for k, v in self._attrs.items()])
         res._relTools = dict(self._relTools)
-        res._listeners = () # TODO document that listeners are not copied
-        # (rationale: they need to have ``sourceAdded`` called with the
-        # catalog, and we don't know the semantics of any given particular
-        # listener to know if it should be the same object or a copy or nothing
-        # at all. Therefore, we do not include listeners in a copy)
         res._queryFactories = self._queryFactories # it's a tuple
         res._relLength = BTrees.Length.Length()
         res._relLength.set(self._relLength.value)
@@ -212,7 +212,7 @@
             indexes = []
             res._searchIndexMatches = self.family.OO.Bucket()
             for ix, keys in self._searchIndexes:
-                cix = ix.copy(self)
+                cix = ix.copy(res)
                 indexes.append((cix, keys))
                 for key in keys:
                     dest = res._searchIndexMatches.get(key)
@@ -223,6 +223,8 @@
                         if info[3] is ix:
                             dest.append(info[:3] + (cix,))
             res._searchIndexes = tuple(indexes)
+        for listener in self._listeners:
+            listener.sourceCopied(self, res)
         return res
     
     # Value Indexes
@@ -386,13 +388,20 @@
                 rel_bool = True
             else:
                 rel_bool = False
-            query_names_res = []
+            query_names_res = BTrees.family32.OO.Set() # sorts
             relation_query = False
             for nm in query_names:
                 if nm is RELATION:
                     relation_query = True
                 else:
-                    query_names_res.append(nm)
+                    query_names_res.insert(nm)
+            if getattr(static_values, 'items', None) is not None:
+                static_values = static_values.items()
+            for k, v in static_values:
+                if k is RELATION:
+                    raise ValueError(
+                        'may not register static value for RELATION (None)')
+                query_names_res.insert(k)
             if maxDepth is None:
                 maxDepth = 0
             k = (rel_bool, name, relation_query,
@@ -401,8 +410,6 @@
             if a is None:
                 self._searchIndexMatches[
                     k] = a = persistent.list.PersistentList()
-            if getattr(static_values, 'items', None) is not None:
-                static_values = static_values.items()
             a.append((filter, queryFactory, tuple(static_values), ix))
             keys.add(k)
         keys = frozenset(keys)
@@ -609,7 +616,14 @@
     # Tokenization
     # ============
 
-    def tokenizeQuery(self, query):
+    def tokenizeQuery(self, *args, **kwargs):
+        if args:
+            if kwargs or len(args) > 1:
+                raise TypeError(
+                    'supply a query dictionary or keyword arguments, not both')
+            query = args[0]
+        else:
+            query = kwargs
         res = {}
         for k, v in query.items():
             if k is RELATION:
@@ -626,7 +640,14 @@
             res[k] = v
         return res
 
-    def resolveQuery(self, query):
+    def resolveQuery(self, *args, **kwargs):
+        if args:
+            if kwargs or len(args) > 1:
+                raise TypeError(
+                    'supply a query dictionary or keyword arguments, not both')
+            query = args[0]
+        else:
+            query = kwargs
         res = {}
         for k, v in query.items():
             if k is RELATION:
@@ -738,7 +759,7 @@
                 c_queryFactory != queryFactory):
                 continue
             for k, v in c_static_values:
-                if query[k] != v:
+                if query[k] != v: # we want a precise match here
                     continue
             res = ix.getResults(
                 None, query, maxDepth, filter, queryFactory)
@@ -891,11 +912,11 @@
 
     def findValueTokens(self, name, query=(), maxDepth=None,
                         filter=None, targetQuery=(), targetFilter=None,
-                        queryFactory=None):
+                        queryFactory=None, ignoreSearchIndex=False):
         data = self._attrs.get(name)
         if data is None:
             raise ValueError('name not indexed', name)
-        query = BTrees.family32.OO.Bucket(query)
+        query = BTrees.family32.OO.Bucket(query) # sorts on key
         getQueries = None
         if queryFactory is None:
             queryFactory, getQueries = self._getQueryFactory(
@@ -925,7 +946,7 @@
                 return multiunion(
                     (self._reltoken_name_TO_objtokenset.get((r, name))
                      for r in rels), data)
-        if self._searchIndexMatches is not None:
+        if not ignoreSearchIndex and self._searchIndexMatches is not None:
             if RELATION in query:
                 relation_query = True
                 query_names = tuple(nm for nm in query if nm is not RELATION)
@@ -965,10 +986,10 @@
 
     def findValues(self, name, query=(), maxDepth=None, filter=None,
                    targetQuery=(), targetFilter=None,
-                   queryFactory=None):
+                   queryFactory=None, ignoreSearchIndex=False):
         res = self.findValueTokens(name, query, maxDepth, filter,
                                    targetQuery, targetFilter,
-                                   queryFactory)
+                                   queryFactory, ignoreSearchIndex)
         resolve = self._attrs[name]['load']
         if resolve is None:
             return res
@@ -1002,8 +1023,8 @@
 
     def findRelationTokens(self, query=(), maxDepth=None, filter=None,
                            targetQuery=(), targetFilter=None,
-                           queryFactory=None):
-        query = BTrees.family32.OO.Bucket(query)
+                           queryFactory=None, ignoreSearchIndex=False):
+        query = BTrees.family32.OO.Bucket(query) # sorts on key
         getQueries = None
         if queryFactory is None:
             queryFactory, getQueries = self._getQueryFactory(
@@ -1016,7 +1037,7 @@
             if res is None:
                 res = self._relTools['Set']()
             return res
-        if self._searchIndexMatches is not None:
+        if not ignoreSearchIndex and self._searchIndexMatches is not None:
             if RELATION in query:
                 relation_query = True
                 query_names = tuple(nm for nm in query if nm is not None)
@@ -1042,16 +1063,16 @@
 
     def findRelations(self, query=(), maxDepth=None, filter=None,
                           targetQuery=(), targetFilter=None,
-                          queryFactory=None):
+                          queryFactory=None, ignoreSearchIndex=False):
         return self.resolveRelationTokens(
             self.findRelationTokens(
                 query, maxDepth, filter, targetQuery, targetFilter,
-                queryFactory))
+                queryFactory, ignoreSearchIndex))
 
     def findRelationChains(self, query, maxDepth=None, filter=None,
                                targetQuery=(), targetFilter=None,
                                queryFactory=None):
-        query = BTrees.family32.OO.Bucket(query)
+        query = BTrees.family32.OO.Bucket(query) # sorts on key
         queryFactory, getQueries = self._getQueryFactory(
             query, queryFactory)
         return self._yieldRelationChains(*self._parse(
@@ -1078,7 +1099,7 @@
     def findRelationTokenChains(self, query, maxDepth=None, filter=None,
                                     targetQuery=(), targetFilter=None,
                                     queryFactory=None):
-        query = BTrees.family32.OO.Bucket(query)
+        query = BTrees.family32.OO.Bucket(query) # sorts on key
         queryFactory, getQueries = self._getQueryFactory(
             query, queryFactory)
         return self.yieldRelationTokenChains(*self._parse(
@@ -1087,14 +1108,14 @@
 
     def canFind(self, query, maxDepth=None, filter=None,
                  targetQuery=(), targetFilter=None,
-                 queryFactory=None):
-        query = BTrees.family32.OO.Bucket(query)
+                 queryFactory=None, ignoreSearchIndex=False):
+        query = BTrees.family32.OO.Bucket(query) # sorts on key
         getQueries = None
         if queryFactory is None:
             queryFactory, getQueries = self._getQueryFactory(
                 query, queryFactory)
         targetQuery = BTrees.family32.OO.Bucket(targetQuery)
-        if self._searchIndexMatches is not None:
+        if not ignoreSearchIndex and self._searchIndexMatches is not None:
             if RELATION in query:
                 relation_query = True
                 query_names = tuple(nm for nm in query if nm is not None)
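
The new ``tokenizeQuery`` and ``resolveQuery`` signatures above accept either a single query dictionary or keyword arguments. The argument handling they share can be sketched on its own as follows. ``normalize_query`` is a hypothetical name used only for illustration; the catalog inlines this logic in each method.

```python
def normalize_query(*args, **kwargs):
    """Return the query mapping from one positional dict or from keywords.

    Hypothetical helper mirroring the argument check added to
    tokenizeQuery and resolveQuery in the diff above.
    """
    if args:
        if kwargs or len(args) > 1:
            raise TypeError(
                'supply a query dictionary or keyword arguments, not both')
        return args[0]
    return kwargs


from_dict = normalize_query({'part': 1})
from_keywords = normalize_query(part=1)
```

Because the error message in the diff identifies ``RELATION`` as ``None``, a query on the relation itself presumably still needs the dictionary form: ``None`` cannot be spelled as a keyword argument name.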

Modified: zc.relation/trunk/src/zc/relation/interfaces.py
===================================================================
--- zc.relation/trunk/src/zc/relation/interfaces.py	2008-04-22 19:49:08 UTC (rev 85616)
+++ zc.relation/trunk/src/zc/relation/interfaces.py	2008-04-22 20:34:37 UTC (rev 85617)
@@ -82,10 +82,16 @@
         """message: you've been removed as a listener from the given catalog.
         """
 
+    def sourceCopied(original, copy):
+        """message: the given original is making a copy.
+        
+        Can install listeners in the copy, if desired.
+        """
+
 class ISearchIndex(IMessageListener):
 
-    def copy(catalog=None):
-        """return a copy of this index, bound to provided catalog if given"""
+    def copy(catalog):
+        """return a copy of this index, bound to provided catalog."""
 
     def setCatalog(catalog):
         """set the search index to be using the given catalog, return matches.
@@ -249,9 +255,9 @@
         
         TODO: explain. :-/"""
 
-    def findValueTokens(name, query=None, maxDepth=None, filter=None,
-                        targetQuery=None, targetFilter=None,
-                        transitiveQueriesFactory=None):
+    def findValueTokens(
+        name, query=None, maxDepth=None, filter=None, targetQuery=None,
+        targetFilter=None, queryFactory=None, ignoreSearchIndex=False):
         """find token results for searchTerms.
         - name is the index name wanted for results.
         - if query is None (or evaluates to boolean False), returns the
@@ -260,24 +266,32 @@
         Otherwise, same arguments as findRelationChains.
         """
 
-    def findValues(name, query=None, maxDepth=None, filter=None,
-                   targetQuery=None, targetFilter=None,
-                   transitiveQueriesFactory=None):
+    def findValues(
+        name, query=None, maxDepth=None, filter=None, targetQuery=None,
+        targetFilter=None, queryFactory=None, ignoreSearchIndex=False):
         """Like findValueTokens, but resolves value tokens"""
 
-    def findRelations(query): # XXX 
+    def findRelations(
+        query=(), maxDepth=None, filter=None, targetQuery=None,
+        targetFilter=None, queryFactory=None, ignoreSearchIndex=False):
         """Given a single dictionary of {indexName: token}, return an iterable
-        of relations that match the query intransitively"""
+        of relations that match the query"""
 
-    def findRelationTokenChains(query, maxDepth=None, filter=None,
-                                    targetQuery=None, targetFilter=None,
-                                    transitiveQueriesFactory=None):
+    def findRelationTokens(
+        query=(), maxDepth=None, filter=None, targetQuery=None,
+        targetFilter=None, queryFactory=None, ignoreSearchIndex=False):
+        """Given a single dictionary of {indexName: token}, return an iterable
+        of relation tokens that match the query"""
+
+    def findRelationTokenChains(
+        query, maxDepth=None, filter=None, targetQuery=None, targetFilter=None,
+        queryFactory=None):
         """find tuples of relation tokens for searchTerms.
         - query is a dictionary of {indexName: token}
         - maxDepth is None or a positive integer that specifies maximum depth
           for transitive results.  None means that the transitiveMap will be
           followed until a cycle is detected.  It is a ValueError to provide a
-          non-None depth if transitiveQueriesFactory is None and
+          non-None depth if queryFactory is None and
           index.defaultTransitiveQueriesFactory is None.
         - filter is an optional callable providing IFilter that determines
           whether relations will be traversed at all.
@@ -288,17 +302,17 @@
         - targetFilter is an optional callable providing IFilter that
           determines whether a given path will be included in results (it will
           still be traversed)
-        - optional transitiveQueriesFactory takes the place of the index's
-          defaultTransitiveQueriesFactory
+        - optional queryFactory takes the place of the index's
+          matching registered queryFactory, if any.
         """
 
-    def findRelationChains(query, maxDepth=None, filter=None,
-                               targetQuery=None, targetFilter=None,
-                               transitiveQueriesFactory=None):
+    def findRelationChains(
+        query, maxDepth=None, filter=None, targetQuery=None, targetFilter=None,
+        queryFactory=None):
         "Like findRelationTokenChains, but resolves relation tokens"
 
     def canFind(query, maxDepth=None, filter=None, targetQuery=None,
-                 targetFilter=None, transitiveQueriesFactory=None):
+                 targetFilter=None, queryFactory=None, ignoreSearchIndex=False):
         """boolean if there is any result for the given search.
         
         Same arguments as findRelationChains.
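
The ``sourceCopied`` message added to the listener interface above lets each listener decide for itself whether it belongs in a catalog copy. A minimal sketch of that protocol follows. ``ToyCatalog``, ``FollowingListener``, and ``IgnoringListener`` are hypothetical classes, and appending to a plain list is an assumption standing in for however a real listener would register itself with the copy.

```python
class ToyCatalog:
    """Hypothetical stand-in for a catalog that notifies listeners on copy."""

    def __init__(self):
        self.listeners = []

    def copy(self):
        res = ToyCatalog()
        # The copy starts with no listeners; each listener of the original
        # is told about the copy and may install itself (or something else).
        for listener in self.listeners:
            listener.sourceCopied(self, res)
        return res


class FollowingListener:
    """Chooses to listen to the copy as well."""

    def sourceCopied(self, original, copy):
        copy.listeners.append(self)


class IgnoringListener:
    """Stays with the original only."""

    def sourceCopied(self, original, copy):
        pass


original = ToyCatalog()
following = FollowingListener()
ignoring = IgnoringListener()
original.listeners.extend([following, ignoring])
duplicate = original.copy()
```

This keeps the copy policy with the listener, since the catalog cannot know whether a given listener should be shared, copied, or dropped.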

Modified: zc.relation/trunk/src/zc/relation/queryfactory.py
===================================================================
--- zc.relation/trunk/src/zc/relation/queryfactory.py	2008-04-22 19:49:08 UTC (rev 85616)
+++ zc.relation/trunk/src/zc/relation/queryfactory.py	2008-04-22 20:34:37 UTC (rev 85617)
@@ -13,10 +13,27 @@
 class TransposingTransitive(persistent.Persistent):
     zope.interface.implements(zc.relation.interfaces.IQueryFactory)
 
-    def __init__(self, name1, name2):
+    def __init__(self, name1, name2, static=()):
         self.names = [name1, name2] # a list so we can use index
+        if getattr(static, 'items', None) is not None:
+            static = static.items()
+        self.static = tuple(sorted(static))
 
     def __call__(self, query, catalog):
+        # Check static values, if any.  We want to be permissive here (as
+        # opposed to the precise match used when finding search indexes in
+        # the catalog's _getSearchIndexResults method).
+        for k, v in self.static:
+            if k in query:
+                if isinstance(v, zc.relation.catalog.Any):
+                    if isinstance(query[k], zc.relation.catalog.Any):
+                        if query[k].source.issubset(v.source):
+                            continue
+                    elif query[k] in v:
+                        continue
+                elif v == query[k]:
+                    continue
+            return None
         static = []
         name = other = _marker
         for nm, val in query.items():
@@ -50,7 +67,8 @@
 
     def __eq__(self, other):
         return (isinstance(other, self.__class__) and
-                set(self.names) == set(other.names))
+                set(self.names) == set(other.names)
+                and self.static == other.static)
 
     def __ne__(self, other):
         return not self.__eq__(other)
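
The permissive static-value check added to ``TransposingTransitive.__call__`` above can be read in isolation as the sketch below. The ``Any`` class here is a minimal hypothetical stand-in for ``zc.relation.catalog.Any``, assumed only to expose a ``source`` set and membership testing. ``static_values_match`` is likewise a hypothetical name, and it returns a boolean where the factory returns ``None`` on a mismatch.

```python
class Any:
    """Hypothetical stand-in for zc.relation.catalog.Any."""

    def __init__(self, source):
        self.source = frozenset(source)

    def __contains__(self, item):
        return item in self.source


def static_values_match(static, query):
    """Return True if ``query`` satisfies every (key, value) in ``static``.

    Mirrors the loop in the diff: a query value may be narrower than the
    static value, but a missing key or a differing value is a mismatch.
    """
    for k, v in static:
        if k in query:
            if isinstance(v, Any):
                if isinstance(query[k], Any):
                    if query[k].source.issubset(v.source):
                        continue
                elif query[k] in v:
                    continue
            elif v == query[k]:
                continue
        return False
    return True


exact = (('kind', 'org'),)
either = (('kind', Any(['org', 'team'])),)
```

For example, ``static_values_match(either, {'kind': 'team'})`` succeeds because ``'team'`` is one of the permitted values, while a query that omits ``kind`` altogether does not match.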

Modified: zc.relation/trunk/src/zc/relation/searchindex.py
===================================================================
--- zc.relation/trunk/src/zc/relation/searchindex.py	2008-04-22 19:49:08 UTC (rev 85616)
+++ zc.relation/trunk/src/zc/relation/searchindex.py	2008-04-22 20:34:37 UTC (rev 85617)
@@ -15,12 +15,26 @@
 
 _marker = object()
 
-class TransposingTransitive(persistent.Persistent):
+class TransposingTransitiveMembership(persistent.Persistent):
     """for searches using zc.relation.queryfactory.TransposingTransitive.
-    
+
     Only indexes one direction.  Only indexes with maxDepth=None.
     Does not support filters.
-    
+
+    This search index's algorithm is intended for transposing transitive
+    searches that look *downward* in a top-down hierarchy. It could be
+    described as indexing transitive membership in a hierarchy--indexing the
+    children of a given node.
+
+    This index can significantly speed transitive membership tests,
+    demonstrating a factor-of-ten speed increase even in a small example.  See
+    timeit/transitive_search_index.py for nitty-gritty details.
+
+    Using it to index the parents in a hierarchy (looking upward) would
+    technically work, but it would result in many writes when a top-level node
+    changed, and would probably not provide enough read advantage to account
+    for the write cost.
+
     This approach could be used for other query factories that only look
     at the final element in the relchain.  If that were desired, I'd abstract
     some of this code.
@@ -29,21 +43,29 @@
     the target filter can look at the last element in the relchain, but not
     at the true relchain itself.  That is: the relchain lies, except for the
     last element.
+    
+    The basic index is for relations.  By providing ``names`` to the
+    initialization, the named value indexes will also be included in the
+    transitive search index.
     """
     zope.interface.implements(zc.relation.interfaces.ISearchIndex)
 
     name = index = catalog = None
 
-    def __init__(self, forward, reverse, names=()):
+    def __init__(self, forward, reverse, names=(), static=()):
         # normalize
         self.names = BTrees.family32.OO.Bucket([(nm, None) for nm in names])
         self.forward = forward
         self.reverse = reverse
         self.update = frozenset((forward, reverse))
         self.factory = zc.relation.queryfactory.TransposingTransitive(
-            forward, reverse)
+            forward, reverse, static)
+        for k, v in self.factory.static:
+            if isinstance(v, zc.relation.catalog.Any):
+                raise NotImplementedError(
+                    '``Any`` static values are not supported at this time')
 
-    def copy(self, catalog=None):
+    def copy(self, catalog):
         new = self.__class__.__new__(self.__class__)
         new.names = BTrees.family32.OO.Bucket()
         for nm, val in self.names.items():
@@ -56,10 +78,9 @@
             new.names[nm] = val
         new.forward = self.forward
         new.reverse = self.reverse
+        new.update = self.update
         new.factory = self.factory
         if self.index is not None:
-            if catalog is None:
-                catalog = self.catalog
             new.catalog = catalog
             new.index = zc.relation.catalog.getMapping(
                 self.catalog.getRelationModuleTools())()
@@ -84,10 +105,12 @@
             if token not in self.index:
                 self._index(token)
         # name, query_names, static_values, maxDepth, filter, queryFactory
-        res = [(None, (self.forward,), (), None, None, self.factory)]
+        res = [(None, (self.forward,), self.factory.static, None, None,
+                self.factory)]
         for nm in self.names:
             res.append(
-                (nm, (self.forward,), (), None, None, self.factory))
+                (nm, (self.forward,), self.factory.static, None, None,
+                 self.factory))
         return res
 
     def _index(self, token, removals=None, remove=False):
@@ -95,7 +118,8 @@
         if removals and self.forward in removals:
             starts.update(t for t in removals[self.forward] if t is not None)
         tokens = set()
-        reverseQuery = BTrees.family32.OO.Bucket(((self.reverse, None),))
+        reverseQuery = BTrees.family32.OO.Bucket(
+            ((self.reverse, None),) + self.factory.static)
         for token in starts:
             getQueries = self.factory(dict(reverseQuery), self.catalog)
             tokens.update(chain[-1] for chain in
@@ -114,7 +138,8 @@
         # now we go back and try to fill them back in again.  If there had
         # been a cycle, we can see now that we have to work down.
         relTools = self.catalog.getRelationModuleTools()
-        query = BTrees.family32.OO.Bucket(((self.forward, None),))
+        query = BTrees.family32.OO.Bucket(
+            ((self.forward, None),) + self.factory.static)
         getQueries = self.factory(query, self.catalog)
         for token in tokens:
             if token in self.index: # must have filled it in during a cycle
@@ -232,6 +257,7 @@
     Could be used for transitive searches, but writes would be much more
     expensive than the TransposingTransitive approach.
     
+    See tokens.txt for an example.
     """
     # XXX Rename to Direct?
     zope.interface.implements(
@@ -259,12 +285,9 @@
             depths = (1,)
         self.depths = tuple(depths)
 
-
-    def copy(self, catalog=None):
+    def copy(self, catalog):
         res = self.__class__.__new__(self.__class__)
         if self.index is not None:
-            if catalog is None:
-                catalog = self.catalog
             res.catalog = catalog
             res.index = BTrees.family32.OO.BTree()
             for k, v in self.index.items():
@@ -362,6 +385,9 @@
             self.setCatalog(None)
             self.setCatalog(catalog)
 
+    def sourceCopied(self, original, copy):
+        pass
+
     def getQueries(self, token, catalog, additions, removals, removed):
         source = {}
         for name in self.names:

Modified: zc.relation/trunk/src/zc/relation/searchindex.txt
===================================================================
--- zc.relation/trunk/src/zc/relation/searchindex.txt	2008-04-22 19:49:08 UTC (rev 85616)
+++ zc.relation/trunk/src/zc/relation/searchindex.txt	2008-04-22 20:34:37 UTC (rev 85617)
@@ -191,7 +191,7 @@
 
     >>> import zc.relation.searchindex
     >>> catalog.addSearchIndex(
-    ...     zc.relation.searchindex.TransposingTransitive(
+    ...     zc.relation.searchindex.TransposingTransitiveMembership(
     ...         'token', 'children', names=('children',)))
 
 Now we should have a search index installed.

Modified: zc.relation/trunk/src/zc/relation/tokens.txt
===================================================================
--- zc.relation/trunk/src/zc/relation/tokens.txt	2008-04-22 19:49:08 UTC (rev 85616)
+++ zc.relation/trunk/src/zc/relation/tokens.txt	2008-04-22 20:34:37 UTC (rev 85617)
@@ -159,17 +159,17 @@
 
     >>> import zc.relation.searchindex
     >>> catalog.addSearchIndex(
-    ...     zc.relation.searchindex.TransposingTransitive(
+    ...     zc.relation.searchindex.TransposingTransitiveMembership(
     ...         'part', None))
 
 ...and down.
 
     >>> catalog.addSearchIndex(
-    ...     zc.relation.searchindex.TransposingTransitive(
+    ...     zc.relation.searchindex.TransposingTransitiveMembership(
     ...         None, 'part'))
 
-(These search indexes are a bit gratuitous, especially looking up, but they
-are examples [#verifyObjectTransitive]_.)
+PLEASE NOTE: using this search index to look up is not a good idea in
+practice.  The index is designed for looking down [#verifyObjectTransitive]_.
 
 Let's create and add a few organizations.
 
@@ -1040,8 +1040,8 @@
     >>> import transaction
     >>> transaction.commit()
 
-.. [#verifyObjectTransitive] The TransposingTransitive indexes provide
-    ISearchIndex.
+.. [#verifyObjectTransitive] The TransposingTransitiveMembership indexes
+    provide ISearchIndex.
     
     >>> from zope.interface.verify import verifyObject
     >>> import zc.relation.interfaces


