[Zope3-dev] Re: RDFLib and Zope 3

Tue Aug 30 11:46:21 EDT 2005

On Mon, 2005-08-29 at 23:24 -0400, Gary Poster wrote:

> >> Right.  Well in this case we would provide just a very simple  
> >> interface
> >> facade that had no effect when run in an environment with no
> >> zope.interface (ie, catch the ImportError, null-out the facade) or  
> >> hook
> >> into zope.interface if it is available.  This way rdflib can be  
> >> still be
> >> used with or without zope.interface.
> >
> > Sounds good.
> 
> OK, cool.

Stub interfaces (from Zemantic) are now checked into the rdflib trunk.
Dan and I also had a discussion on what changed to the lib would be made
if we actually depended on zope.interface.  Definately the plugin
framework would go, but there were some other unanswered questions we
had.

> > Yep, feel free to stop by anytime.
> 
> OK, cool, I plan to again. :-)

Please do, we need some help sketching out what kinds of changes would
be required moving over to a component model.

> I'm interested in contemplating RDF as a full catalog solution for  
> Zope, at least as a thought experiment.  The SPARQL work seems  
> interesting, in regards to some of the recent discussion on the Zope  
> 3 list; and the ability to seamlessly and simultaneously query both  
> relationship information between objects and more traditional catalog  
> data seems compelling.

And I think SPARQL is up to the task to query a catalog for the most
part, there might be some holes in the language that don't suite the
catalog exactly (text index querying, for example, is out of spec), but
for field and keyword indexes SPARQL would make a great query language,
one would just need to teach the catalog to assign well know predicates
to index names, possibly via adaptation.  For example, I think the
following queries and many like it are easily possible against a
catalog:

PREFIX  dc: <http://purl.org/dc/elements/1.1/>
SELECT  ?title
WHERE
    { ?book dc:title ?title }

Note that the use of bound variables also removes the need for brains.
Given that SPARQL also has CONSTRUCT (to build a graph of results) and
DESCRIBE (generate results describe one or more resources) clauses, I
think it makes a better object query language than OQL.

Note that the sparql support needs help!  We have query logic and half a
parser, the parser needs completion and to be glued to the logic.

> 
> It seems to me that allowing a back end to index particular  
> relationships--and joins, particularly through blank nodes--would be  
> a big step in letting RDF actually be a front end to a full catalog.   
> Another step would be having a front end that honored rdfs range and  
> domain constraints.
> 
> I plan to get on IRC and bother you all again as soon as I have time  
> to do so. :-)

Sure, there are some notes from me on the z3lab site that might be of
interest for thinking about zemantic and catalog integration.  THis may
be completely the wrong direction, bust most of my experiments so far
have been to keep a strong separation between how zemantic stores
searchable data (without interpretation) and how the catalog stores it
(with interpretation).  

For example, rdflib doesn't interpret a dc:modification_data value as a
date at all.  It's just one element in a graph that rdflib stores, not
interprets.  We (meaning Dan and I, as we've discussed this) want to
keep as strong barrier as possible between a Graph and its
interpretation, ie, we don't want to encapsulate or imply any
interpretation on any graphs, and if we do, we want to make sure that
interpretation is pluggable and that a graph can be transposed between
interpretations easily.  Interpreting dc:modification_date as a date
certainly does make sense, but its an assumption we can't make globally,
although I'm not against sensible defaults.

The catalog, on the other hand, makes these assumptions by design.  It
knows dc:modification_date is a date because the user instructs that
value to go into a date range searchable index.  This is not bad, it's
good, I think that rdflib and a catalog can leverage the distinction
between uninterpreted and interpreted data.

So far most of my thoughts have been that content (or whatever)
describes itself as RDF.  This is stored in a Graph with no
interpretation, the rdflib object only holds the "shape" of the graph.
Events get fired that make decisions on how that graph should be
interpreted, and the data is then cataloged.  I see a one to many
relationship possible, one interpreting catalog can derive its
interpretations on many graphs, and one graph can have many interpreting
catalogs, all queriable via SPARQL.

-Michel