[Zope3-dev] Re: RDFLib and Zope 3

Gary Poster gary at zope.com
Wed Aug 31 10:07:59 EDT 2005


On Aug 30, 2005, at 11:46 AM, Michel Pelletier wrote:

> On Mon, 2005-08-29 at 23:24 -0400, Gary Poster wrote:
>
>
>>>> Right.  Well in this case we would provide just a very simple
>>>> interface
>>>> facade that had no effect when run in an environment with no
>>>> zope.interface (ie, catch the ImportError, null-out the facade) or
>>>> hook
>>>> into zope.interface if it is available.  This way rdflib can be
>>>> still be
>>>> used with or without zope.interface.
>>>>
>>>
>>> Sounds good.
>>>
>>
>> OK, cool.
>>
>
> Stub interfaces (from Zemantic) are now checked into the rdflib trunk.

Awesome.
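The stub-interface facade described in the quoted exchange might be sketched like this. Only `Interface` and `implementer` are real zope.interface names; the `IGraph` interface, the `Graph` class, and the no-op fallbacks are hypothetical stand-ins for illustration:

```python
# Sketch of the facade idea: if zope.interface is unavailable, fall
# back to no-op stand-ins so the library still imports and runs.
try:
    from zope.interface import Interface, implementer
except ImportError:
    class Interface(object):
        """No-op stand-in used when zope.interface is absent."""

    def implementer(*interfaces):
        """Decorator stand-in that returns the class unchanged."""
        def decorate(cls):
            return cls
        return decorate

class IGraph(Interface):
    """Hypothetical marker interface for RDF graphs."""

@implementer(IGraph)
class Graph(object):
    """Toy graph class; unrelated to rdflib's actual Graph."""
```

Either way the import resolves, `Graph` is usable: with zope.interface present the declaration is real, without it the facade is inert.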

> Dan and I also had a discussion on what changes would be made to the
> lib if we actually depended on zope.interface.

That sounds cool too.

> Definitely the plugin
> framework would go, but there were some other unanswered questions we
> had.
>
>>> Yep, feel free to stop by anytime.
>>>
>>
>> OK, cool, I plan to again. :-)
>>
>
> Please do, we need some help sketching out what kinds of changes would
> be required moving over to a component model.

OK, I'm a bit swamped this week, but I really want to come by.  Maybe  
I'll try to sneak in a bit of time today.

>> I'm interested in contemplating RDF as a full catalog solution for
>> Zope, at least as a thought experiment.  The SPARQL work seems
>> interesting, in regards to some of the recent discussion on the Zope
>> 3 list; and the ability to seamlessly and simultaneously query both
>> relationship information between objects and more traditional catalog
>> data seems compelling.
>>
>
> And I think SPARQL is up to the task of querying a catalog for the
> most part; there might be some holes in the language that don't suit
> the catalog exactly (text index querying, for example, is out of spec),

Actually, with extended value testing, I think text index querying  
would work reasonably well.

http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/#TestingExtensions

For instance, this might work (where "..." indicates elements of a  
triple or whatever all else).

PREFIX z:<http://zope.org/foo/bar>
SELECT ?content
WHERE
     ( ... ... ... )
AND
     &z:containsText(?content, '"raining cats" and not dogs')

Or do I misunderstand something?
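One way to picture an extended value test like `&z:containsText` is as a function registered under a URI that the query engine calls while filtering results. Everything below is a hypothetical sketch: it handles only a single quoted phrase, not the `and not` operators a real catalog text index would parse:

```python
# Hypothetical registry of SPARQL extension functions, keyed by URI.
EXTENSIONS = {}

def extension(uri):
    """Register a callable as the extension test for the given URI."""
    def register(fn):
        EXTENSIONS[uri] = fn
        return fn
    return register

@extension("http://zope.org/foo/bar#containsText")
def contains_text(value, query):
    """Toy text test: does the (single) quoted phrase occur in the value?

    A real text index would parse operators like `and not`; this
    sketch deliberately ignores them.
    """
    return query.strip('"').lower() in value.lower()

# The query engine would invoke the registered function during filtering:
ok = EXTENSIONS["http://zope.org/foo/bar#containsText"](
    "It was raining cats all afternoon", '"raining cats"')
```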

Interestingly, according to the EBNF, this is actually valid, and the  
only way to use an extended test by itself:

PREFIX z:<http://zope.org/foo/bar>
SELECT ?content
WHERE
AND
     &z:containsText(?content, '"raining cats" and not dogs')

Um.  Kinda odd. :-)

The nascent quality of SPARQL does give me pause sometimes as I read  
the unfinished spec and see oddities like this.  Admittedly this use  
is on the very edge of the SPARQL intent; moreover, rough edges are  
to be expected at this stage.  The unfinished spec makes me wonder if
it's too early to be exploring with expectations of any nearish-term
win, though.  Not a judgement, just a shared thought.

> but
> for field and keyword indexes SPARQL would make a great query
> language; one would just need to teach the catalog to assign
> well-known predicates
> to index names, possibly via adaptation.  For example, I think the
> following query and many like it are easily possible against a
> catalog:
>
> PREFIX  dc: <http://purl.org/dc/elements/1.1/>
> SELECT  ?title
> WHERE
>     { ?book dc:title ?title }
>
> Note that the use of bound variables also removes the need for brains.

We actually don't have catalog brains in Zope 3 anyway, but yes,  
maybe it would let you only wake up objects when you need to, which  
is what you are getting at.  It might make sorting easier too.  Need  
to think this through.
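The bound-variable point can be sketched without any library at all: matching the pattern `?book dc:title ?title` over plain triples yields the title values directly, with no intermediate brain-like result objects. The data and the `match` helper below are invented for illustration; rdflib would use URIRef/Literal objects where plain strings appear here:

```python
# Hypothetical in-memory triples.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

triples = [
    ("book1", DC_TITLE, "A Pattern Language"),
    ("book2", DC_TITLE, "Design Patterns"),
    ("book1", "http://example.org/terms/pages", "1171"),
]

def match(triples, s=None, p=None, o=None):
    """Yield triples matching the pattern; None plays an unbound variable."""
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple

# SELECT ?title WHERE { ?book dc:title ?title }
titles = [o for (s, p, o) in match(triples, p=DC_TITLE)]
```

Only the bound values come back; nothing heavier than the literals themselves needs to be woken up.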

> Given that SPARQL also has CONSTRUCT (to build a graph of results) and
> DESCRIBE (to generate results describing one or more resources)
> clauses, I think it makes a better object query language than OQL.
>
> Note that the SPARQL support needs help!  We have query logic and
> half a parser; the parser needs completion and to be glued to the
> logic.

Yes, I got that.  :-)

>> It seems to me that allowing a back end to index particular
>> relationships--and joins, particularly through blank nodes--would be
>> a big step in letting RDF actually be a front end to a full catalog.
>> Another step would be having a front end that honored RDFS range and
>> domain constraints.
>>
>> I plan to get on IRC and bother you all again as soon as I have time
>> to do so. :-)
>
> Sure, there are some notes from me on the z3lab site that might be of
> interest for thinking about zemantic and catalog integration.

Could you send a URL?

> This may
> be completely the wrong direction, but most of my experiments so far
> have been to keep a strong separation between how zemantic stores
> searchable data (without interpretation) and how the catalog stores it
> (with interpretation).
>
> For example, rdflib doesn't interpret a dc:modification_date value
> as a date at all.  It's just one element in a graph that rdflib
> stores, not interprets.  We (meaning Dan and I, as we've discussed
> this) want to keep as strong a barrier as possible between a Graph
> and its interpretation, ie, we don't want to encapsulate or imply any
> interpretation on any graphs, and if we do, we want to make sure that
> interpretation is pluggable and that a graph can be transposed between
> interpretations easily.  Interpreting dc:modification_date as a date
> certainly does make sense, but it's an assumption we can't make
> globally, although I'm not against sensible defaults.

AFAICT you can make constraints on the range and domain of a
predicate.  Making this enforceable would be important to a
BTree-based back end that actually stored literals as literals (not
in the xsd form): they need to sort sanely, which usually means being
of the same type.
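The uninterpreted/interpreted split under discussion might be sketched like this: the graph keeps the literal as a plain string, and a hypothetical pluggable interpreter map, owned by the catalog side, coerces it to a date only at indexing time. All names below are invented for illustration:

```python
from datetime import datetime

DC_MODIFIED = "http://purl.org/dc/elements/1.1/modification_date"

# Graph side: the value is just an uninterpreted string literal.
stored = (DC_MODIFIED, "2005-08-31T10:07:59")

# Catalog side: a hypothetical pluggable map from predicate to
# interpretation.  Only the catalog decides this predicate means a date.
interpreters = {
    DC_MODIFIED: lambda v: datetime.strptime(v, "%Y-%m-%dT%H:%M:%S"),
}

def interpret(predicate, value):
    """Apply the registered interpretation, or pass the literal through."""
    return interpreters.get(predicate, lambda v: v)(value)

indexed = interpret(*stored)  # a datetime, ready for a date-range index
raw = interpret("http://example.org/terms/note", "just a string")
```

The graph never changes; swapping the `interpreters` mapping transposes the same data into a different interpretation.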

> The catalog, on the other hand, makes these assumptions by design.  It
> knows dc:modification_date is a date because the user instructs that
> value to go into a date-range-searchable index.  This is not bad;
> it's good.  I think that rdflib and a catalog can leverage the
> distinction between uninterpreted and interpreted data.

I think the RDF spec can be used for "interpretation" too, given the
rich spellings that RDF allows for predicates, and the typing of its
nodes.  I do think that the RDF component should only deal with RDF,
so I agree with your general desire.  But RDF is very, very rich: a
*lot* of functionality could be in a pure RDF library like RDFlib,
including, for instance, support for predicate constraints and join
indexes.  That would be useful whatever your RDF use, if you needed
efficiency for common searches.

The extra RDF features could also be a layer on top of a simpler  
graph implementation, if that were a desired design.  I'm not saying  
that RDFlib should have all the features, but that a pure-RDF library  
*could* have all the features.

> So far most of my thoughts have been that content (or whatever)
> describes itself as RDF.  This is stored in a Graph with no
> interpretation; the rdflib object only holds the "shape" of the graph.
> Events get fired that make decisions on how that graph should be
> interpreted, and the data is then cataloged.  I see a one-to-many
> relationship possible: one interpreting catalog can derive its
> interpretations from many graphs, and one graph can have many
> interpreting catalogs, all queryable via SPARQL.
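That event flow can be caricatured in a few lines: one graph change notifies several subscribed catalogs, each applying its own interpretation to the same uninterpreted triple stream. Every name here is hypothetical:

```python
# One graph, many interpreting catalogs: a toy event channel.
subscribers = []

def subscribe(handler):
    subscribers.append(handler)

def graph_changed(triple):
    """Notify every subscribed catalog of a new or changed triple."""
    for handler in subscribers:
        handler(triple)

# Two hypothetical "catalogs" interpreting the same stream differently.
date_index = []
keyword_index = []

def index_dates(triple):
    if "date" in triple[1]:  # crude predicate sniffing, for show only
        date_index.append(triple)

subscribe(index_dates)
subscribe(keyword_index.append)  # this catalog keeps everything

graph_changed(("doc1", "http://purl.org/dc/elements/1.1/date",
               "2005-08-31"))
```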

I don't have a strong opinion about this atm.

Gary

