[Zope3-dev] Yet Another Relations (aka Reference) Engine...
Jean-Marc Orliaguet
jmo at ita.chalmers.se
Fri Nov 11 12:00:57 EST 2005
Helmut Merz wrote:
>On Friday, 11 November 2005 16:11, Jean-Marc Orliaguet wrote:
>
>
>
>>Hi Helmut!
>>
>>
>
>Hi Jean-Marc,
>
>thanks for your remarks,
>
>just before going into more detail: My primary concern was the
>API - it would be really fine if there could be a simple (as simple
>as possible but not simpler) standard set of (low-level)
>interfaces on which to build (defining semantically richer
>interfaces) and for which to provide implementations (depending
>on the needs of the application).
>
>The implementation with the catalog should serve as a (again
>fairly simple but working) example and a proof of concept; I
>think I would be just lucky if it would show up as really
>useful ;-) (but maybe...)
>
>
>
Hi,
a common interface could be useful indeed.
>>I can tell you about the design decisions made in the case of
>>the relation tool included in CPSSkins. They don't necessarily
>>appear in the code itself in an obvious way.
>>
>>1) separate storage from storage policy
>>
>>the relation storage stores what it is told to store, as long
>>as the objects are Relatable they can be stored. The storage
>>policy (using unique ids or not, etc..) is the responsibility
>>of the application itself. To impose a unique id policy when
>>storing elements would be a mistake in my opinion (in the case
>>of cpsskins it wouldn't work either).
>>
>>
>
>The only prerequisite for using the IntIds utility is that the
>objects are persistent (provide IPersistent). If one wants to
>relate objects that are not persistent or have relations that
>for some reason can't be persistent you can't use the catalog
>approach because the catalog depends on IntIds.
>
>So the catalog-based implementation won't be usable for relations
>between in-memory objects (like views, adapters and related
>stuff), that's true.
>
>
I was thinking more about the policy of assigning unique ids to objects
in a relation. It's the application that really should decide about that
policy.
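
To make point 1 concrete, here is a minimal sketch of a storage that knows nothing about id policies. All class names (Relatable, Relation, RelationStorage, IdRef, InMemoryView) are made up for illustration; this is not the CPSSkins code, only an assumption about how the separation could look:

```python
class Relatable:
    """Hypothetical marker base class for anything that may appear in a relation."""


class Relation:
    """An explicit relation object: a predicate plus the elements it relates."""

    def __init__(self, predicate, *relates):
        for item in relates:
            if not isinstance(item, Relatable):
                raise TypeError("%r is not Relatable" % (item,))
        self.predicate = predicate
        self.relates = tuple(relates)


class RelationStorage:
    """Stores what it is told to store; it knows nothing about id policies."""

    def __init__(self):
        self._relations = []

    def add(self, relation):
        self._relations.append(relation)
        return relation

    def __len__(self):
        return len(self._relations)


class IdRef(Relatable):
    """One application-side policy: relate objects through unique ids."""

    def __init__(self, uid):
        self.uid = uid


class InMemoryView(Relatable):
    """Another application-side policy: relate a non-persistent object directly."""

    def __init__(self, name):
        self.name = name


storage = RelationStorage()
storage.add(Relation("is-subtask-of", IdRef(42), IdRef(7)))
storage.add(Relation("renders", InMemoryView("page"), IdRef(7)))
```

The storage accepts both kinds of elements; whether to go through unique ids is decided by whoever builds the Relation, not by the storage.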
>>2) keep the relation storage index as small as possible.
>>
>>Do not index predicates, the same predicates are used in too
>>many relations, the size of the index would just increase
>>dramatically. Instead only index the elements that are inside
>>the relation, the chances that the same elements are related
>>in many different ways are very low.
>>
>> cf.
>>
>>
>>
>http://www.z3lab.org/sections/blogs/jean-marc-orliaguet/2005_08_27_triadic-relations/
>
>
>>http://svn.nuxeo.org/trac/pub/file/z3lab/cpsskins/branches/jmo-perspectives/storage/relations.py
>>
>>
>
>I read this, and it indeed gave me the impression that it might
>be a not so bad idea to use a catalog ;-)
>
>
>
well, you haven't written the catalog indexes yet :-)
And Lennart wrote a piece about the kinds of problems you'll run into if
you don't optimize them for relations. You'll end up with intersections
of huge sets:
http://blogs.nuxeo.com/sections/blogs/lennart_regebro/2005_08_29_indexing-events
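
A rough sketch of what I mean by indexing only the elements (point 2). The ElementIndex class and its methods are hypothetical and use plain dicts and sets rather than real catalog indexes; the point is only that the lookup starts from the small per-element set and the predicate is checked afterwards, so no huge per-predicate set is ever built or intersected:

```python
from collections import defaultdict


class ElementIndex:
    """Hypothetical relation index keyed by elements only, never by predicate."""

    def __init__(self):
        self._by_element = defaultdict(set)  # element -> set of relation ids
        self._relations = {}                 # relation id -> (predicate, elements)
        self._next_id = 0

    def add(self, predicate, *elements):
        rid = self._next_id
        self._next_id += 1
        self._relations[rid] = (predicate, elements)
        for element in elements:
            self._by_element[element].add(rid)
        return rid

    def search(self, element, predicate=None):
        """Start from the small per-element set, then filter by predicate."""
        results = []
        for rid in sorted(self._by_element.get(element, ())):
            pred, elements = self._relations[rid]
            if predicate is None or pred == predicate:
                results.append((pred, elements))
        return results


index = ElementIndex()
index.add("is-subtask-of", "task-2", "task-1")
index.add("is-subtask-of", "task-3", "task-1")
index.add("is-allocated-to", "alice", "task-1")
index.add("is-subtask-of", "task-9", "task-8")

subtasks_of_task_1 = index.search("task-1", "is-subtask-of")
```

However many relations share the "is-subtask-of" predicate, the search for "task-1" only ever touches the three relations that contain that element.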
>> I don't know about using the zc.catalog for indexing
>>relations, you could end up with huge indexes and very slow
>>queries.
>>
>>
>
>This is one of my concerns, too, but I'm fairly optimistic: the
>catalog indexes store a common string to be indexed only once,
>so having identical ; I'm working with the Archetypes reference
>engine (that uses - at least in this respect - the same kind of
>catalog indexes) in situations with many thousands of objects
>and didn't get problems of this kind.
>
>
>
Looking at the code, Archetypes does not store relations; it stores
'references' (and backward references), each consisting of a target object
and a predicate ('relationship'), in the objects themselves. I guess
that the objects are indexed in the catalog. So the relation is stored
implicitly, but there is no explicit relation object to start with, and
the model is a bit different.
>>3) don't make the API for querying the storage be too
>>intelligent,
>>
>>
>
>The query() method using the catalog's searchResults() / apply()
>methods was the dumbest one I could think of ;-)
>
>
>
>> to create complex queries, create complex predicates instead,
>>i.e.
>>
>> - predicates that combine several predicates
>> - proxy predicates (when the predicate is evaluated at
>>runtime and a method is specified instead)
>>
>> cf
>>
>>
>>
>http://svn.nuxeo.org/trac/pub/file/z3lab/cpsskins/branches/jmo-perspectives/relations/__init__.py
>
>
>> if you need to do really complex queries, do several
>>queries and filter out the results afterwards in your
>>application unless you're fine with ending up with a huge
>>catalog index.
>>
>>
>
>To be honest, I never thought about complex queries as I just
>want to find e.g. the subtasks of a task and the resources
>allocated to it - maybe my use case is just somewhat simple.
>
>OTOH: An advantage of using a catalog are the - as I think -
>fairly efficient set operations on search results for the
>indexes...
>
>Helmut
>
>
This opens the door to a combinatorial explosion. The number of
relations between objects literally explodes unless you carefully choose
the relation predicates. The catalog won't help unless you have very
carefully designed indexes I guess.
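
For what it's worth, here is a small sketch of point 3: a deliberately dumb query, predicates that carry the complexity, and the combining done in the application. The names (Predicate, CompoundPredicate, ProxyPredicate, query, related) and the "match any" meaning of the compound predicate are assumptions made for this example, not the actual CPSSkins API:

```python
class Predicate:
    """Hypothetical simple predicate: matches one relation name."""

    def __init__(self, name):
        self.name = name

    def matches(self, name):
        return name == self.name


class CompoundPredicate:
    """Combines several predicates; here it matches if any of them matches."""

    def __init__(self, *predicates):
        self.predicates = predicates

    def matches(self, name):
        return any(p.matches(name) for p in self.predicates)


class ProxyPredicate:
    """Evaluated at runtime: a method is specified instead of a fixed name."""

    def __init__(self, method):
        self.method = method

    def matches(self, name):
        return bool(self.method(name))


RELATIONS = [
    ("is-subtask-of", "task-2", "task-1"),
    ("is-subtask-of", "task-3", "task-1"),
    ("is-allocated-to", "alice", "task-2"),
    ("is-allocated-to", "bob", "task-3"),
    ("is-allocated-to", "carol", "task-9"),
]


def query(element):
    """Deliberately dumb query: every relation that contains the element."""
    return [r for r in RELATIONS if element in r[1:]]


def related(element, predicate):
    """Filter the dumb query's results in the application."""
    return [r for r in query(element) if predicate.matches(r[0])]


# "Resources allocated to the subtasks of task-1": two simple queries,
# combined in the application instead of one clever catalog query.
is_subtask = Predicate("is-subtask-of")
is_allocated = Predicate("is-allocated-to")
subtasks = [r[1] for r in related("task-1", is_subtask) if r[2] == "task-1"]
resources = [r[1] for task in subtasks
             for r in related(task, is_allocated) if r[2] == task]

either = CompoundPredicate(is_subtask, is_allocated)
dynamic = ProxyPredicate(lambda name: name.startswith("is-allocated"))
```

The storage only ever answers "which relations contain this element"; everything else lives in the predicates and in a few lines of application code.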
/JM