Sunday, November 04, 2007

The Poor State of SPARQL Implementations

*Sigh* I had a simple task. Really I did. I am putting the final touches on a journal article and wanted to expand an example to be more interesting. All I wanted to do was demonstrate (in SPARQL) that multiple RDF graphs can be pulled in from URLs and the dynamically-assembled graph queried. I wouldn't have thought that was such a big ask for 2007. Alas, I was wrong.

Here is the query:


prefix sec: <http://www.itee.uq.edu.au/~dwood/ontologies/sec.owl#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?class ?test ?testresults
FROM <http://www.itee.uq.edu.au/~dwood/ontologies/sec-example.owl>
FROM <http://www.itee.uq.edu.au/~dwood/ontologies/sec-testresults.owl>
WHERE {
?class rdf:type sec:OOClass .
?test sec:isTestOf ?class .
?test sec:hasTestResults ?testresults
}


Redland won't do it because it does not support FROM (or FROM NAMED). The same goes for OpenLink Virtuoso SPARQL and JRDF. Sesame 2.0 might do it, but I got tired of looking. I'll have to get back to it tomorrow.

In the meantime, I hacked around the problem by using a little-known feature of JRDF: one can import a series of RDF or OWL files and query the resulting graph. It is annoying, and requires local copies of the documents, but it works (kind of).

The really sad thing is that Tucana had this feature (in the iTQL query language) in 2000 or 2001. Mulgara still does, of course. Paul assures me that SPARQL support in Mulgara is finally close. That is wonderful, but it does make me feel a bit guilty for not contributing to it given its obvious need.

I still (since 2000) think that querying multiple data sources from the WEB makes the SEMANTIC WEB a bit more useful, and interesting. *Sigh* I guess I will have to either contribute more or live with it.

UPDATE: Sesame does not support SPARQL datasets according to this bug, even though a patch has apparently already been contributed.

UPDATE: OpenLink Virtuoso demos at http://demo.openlinksw.com/sparql and http://demo.openlinksw.com/isparql now both return results. However, they return four results where I expect two.

Dave Beckett reports that the latest Redland/Rasqal from svn now supports the query, but that he also gets four results.

Danny's SPARQLer now returns correct results (two).

Thanks to everyone who responded! Having proper FROM and FROM NAMED support opens a floodgate of potential new SemWeb applications.

UPDATE: Changing "SELECT" to "SELECT DISTINCT" returns the correct two results from Virtuoso. I suspect that change may be needed with others, too.
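
Why would DISTINCT matter here? One possible explanation, which nobody in this thread has confirmed, is that some engines combine the two FROM graphs as a bag (keeping a triple once per source document) rather than as a set of triples. The Python sketch below illustrates that idea only. The triples and the ex: names are made up for illustration, since the contents of the two OWL files are not reproduced here, and the nested-loop matcher is a toy, not how any of the engines above work.

```python
# Hypothetical triples (assumption): both documents state the class typing,
# so those triples appear twice when the sources are simply concatenated.
GRAPH_A = [
    ("ex:Foo", "rdf:type", "sec:OOClass"),
    ("ex:Bar", "rdf:type", "sec:OOClass"),
    ("ex:FooTest", "sec:isTestOf", "ex:Foo"),
    ("ex:BarTest", "sec:isTestOf", "ex:Bar"),
]
GRAPH_B = [
    ("ex:Foo", "rdf:type", "sec:OOClass"),
    ("ex:Bar", "rdf:type", "sec:OOClass"),
    ("ex:FooTest", "sec:hasTestResults", "ex:FooResults"),
    ("ex:BarTest", "sec:hasTestResults", "ex:BarResults"),
]


def run_query(triples):
    """Match the query's three triple patterns with a naive nested-loop join."""
    rows = []
    for cls, p1, o1 in triples:
        if p1 != "rdf:type" or o1 != "sec:OOClass":
            continue
        for test, p2, o2 in triples:
            if p2 != "sec:isTestOf" or o2 != cls:
                continue
            for subj, p3, results in triples:
                if p3 == "sec:hasTestResults" and subj == test:
                    rows.append((cls, test, results))
    return rows


# Bag-style combination: repeated triples are kept once per source document.
bag_rows = run_query(GRAPH_A + GRAPH_B)

# Set-style combination: a repeated triple is stored only once.
set_rows = run_query(set(GRAPH_A) | set(GRAPH_B))

# What DISTINCT does to the bag-style answer: drop repeated solution rows.
distinct_rows = sorted(set(bag_rows))

print(len(bag_rows), len(set_rows), len(distinct_rows))
```

With this made-up data, the bag-style combination yields each solution twice, while the set-style combination and the DISTINCT version agree. Whether that is what actually happens inside Virtuoso or Rasqal is a guess on my part.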

11 comments:

  1. It would work easily in AllegroGraph if your OWL ontologies weren't served up as N-Triples (you're using text/plain as the content-type, which is incorrect). You can optionally configure AG to load required graphs in response to SPARQL Protocol requests.

    Having manually imported those files (AllegroGraph could do it automatically if you used the correct content-type!), it runs your query in 3ms on my personal server.

    This stuff ain't hard. I don't know why more implementations don't do it.

  2. Thanks, Rich. Unfortunately, I don't have control over the university's server and so can't change the Content-Type. I can host them elsewhere, though, so I'll try it.

  3. Hi There,

    I just tested your query using our SPARQL endpoints at:

    http://demo.openlinksw.com/sparql and http://demo.openlinksw.com/isparql (our SPARQL Query Builder).

    The Query works fine via /isparql but actually returns an empty result via the /sparql endpoint (i.e. the proper generic access endpoint).

    I suspect a cache-related bug, so please do not write off Virtuoso; this is a very small thing :-)

  4. Thanks! The iSPARQL endpoint does, in fact, work. Any idea why the results are duplicated?

  5. This query worked fine with Redland/Rasqal cut and paste exactly from your comment (using the latest Rasqal SVN). I don't know where you were trying it. There are 4 results.

  6. Appears to work OK in the ARQ online demo (I've been using mostly ARQ/Jena recently and have not noticed anything missing).

    http://www.sparql.org/sparql.html

  7. Hi!

    I just verified this query (cut and paste) against demo.openlinksw.com/sparql and I got 4 results as I believe should be expected.

    Best Regards,

    Yrjänä Rankka
    Developer, Virtuoso Team
    OpenLink Software

  8. Hi There Again,

    The instance on http://demo.openlinksw.com/sparql has been updated, hence the response from Yrjana re. proper results :-)

    We now have /isparql and /sparql behaving properly.

    Kingsley

  9. Thanks to all who responded! I am thrilled to see this problem addressed.

    Yrjänä and Kingsley: http://demo.openlinksw.com/sparql does indeed now return four results, just as http://demo.openlinksw.com/isparql does. Thanks very much for the update. However, I still do not see why there should be four results. I expect two.

    Dave B: Thanks for your update. I did not try the latest Redland/Rasqal from svn, but will now do so. Again, though, I do not understand why you would get four results.

    Danny: http://www.sparql.org/sparql.html returns two results! You win :) I did try SPARQLer, so I am not sure if you updated it, but it does work and returns what I think are the correct results. Thanks again.

  10. Please use DISTINCT in your query. Our SPARQL processor is cognizant of the potential for extremely large graphs in the FROM clauses and the implications of performing UNIONs, etc.

  11. Ah! Changing "select" to "select distinct" does return two results. Thanks.
