I need to test a couple of million SPARQL queries for a certain structural property, and for that I need the structure of the WHERE clause. I'm currently trying to use fyzz to do this, but unfortunately its documentation is not very useful. Parsing queries is easy; the problem is that I haven't been able to recover the structure of the clause. For example:
>>> from fyzz import parse
>>> a=parse("SELECT * WHERE {?x a ?y . {?x a ?z}}")
>>> b=parse("SELECT * WHERE {?x a ?y OPTIONAL {?x a ?z}}")
>>> a.where==b.where
True
>>> a.where
[(SparqlVar('x'), ('', 'a'), SparqlVar('y')), (SparqlVar('x'), ('', 'a'), SparqlVar('y'))]
Is there a way to recover the actual parse tree in fyzz instead of just the triples, or some other tool which would let me do this? RDFLib seems to have had a bison SPARQL parser in the past, but I can't find it in the rdflib or rdfextras.sparql packages.
Thanks
Another tool is roqet, a tool that is packaged with rasqal. It is a command-line tool that returns the parsed tree. For instance:
roqet -i laqrs -d structure -n -e "SELECT * WHERE {?x a ?y OPTIONAL {?x a ?z}}"
would output ..
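With a couple of million queries, you would probably want to drive roqet from a script rather than by hand. The following is only a minimal sketch: the helper names roqet_command and dump_structure are made up for illustration, it assumes the roqet binary is on your PATH, and it reuses exactly the flags from the command above.

```python
import shutil
import subprocess


def roqet_command(query, input_language="laqrs", dump_format="structure"):
    """Build the roqet argument list for one query (hypothetical helper).

    The flags mirror the command line shown above:
    -i selects the query language, -d the dump format,
    -n prepares the query without running it, and
    -e passes the query string itself instead of a file or URI.
    """
    return ["roqet", "-i", input_language, "-d", dump_format, "-n", "-e", query]


def dump_structure(query):
    """Return roqet's structure dump for `query` as text (hypothetical helper).

    Assumption: the roqet executable shipped with rasqal is installed
    and can be found on PATH.
    """
    if shutil.which("roqet") is None:
        raise RuntimeError("roqet not found on PATH")
    result = subprocess.run(
        roqet_command(query),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if roqet rejects the query
    )
    return result.stdout
```

Calling dump_structure("SELECT * WHERE {?x a ?y OPTIONAL {?x a ?z}}") should then give you the same text as the command line, which you can inspect or post-process in Python. Starting one process per query may be slow at that scale, so treat this as a starting point.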
Looking at your comment on the other answer, I don't think this is what you need, and I don't think you will find an answer by looking inside SPARQL parsers. The object (or triple pattern) evaluation in a query happens inside the query engine, which in well-designed systems is isolated from query parsing. For instance, in 4store you could look at the 4s-query command with the option -vvv (very verbose), where you would see output showing how the query was executed and how substitutions were performed for each triple pattern evaluation.
ANTLR has a SPARQL grammar here: http://www.antlr.org/grammar/1200929755392/index.html
ANTLR can generate parsing code for Python to run.
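If the property you are testing only depends on how the group patterns are nested, a much cruder approach may be enough before you commit to a generated parser. The sketch below is not a SPARQL parser and has nothing to do with ANTLR or fyzz: group_tree is a made-up helper that splits the query on whitespace and braces and turns the braces into nested lists. It assumes the first "{" opens the WHERE clause and that no braces occur inside string literals, IRIs or comments, so check it against your data first.

```python
import re

# A token is either a single brace or a run of characters that are
# neither whitespace nor braces (variables, keywords, dots, IRIs, ...).
TOKEN = re.compile(r"[{}]|[^\s{}]+")


def group_tree(query):
    """Return the brace nesting of the first group pattern as nested lists.

    Hypothetical helper, illustration only. Assumptions: the first "{"
    in the query opens the WHERE clause, and braces never appear inside
    literals, IRIs or comments.
    """
    start = query.index("{")  # raises ValueError if there is no group at all
    root = []
    stack = [root]
    for token in TOKEN.findall(query[start:]):
        if token == "{":
            child = []
            stack[-1].append(child)  # attach the new group to its parent
            stack.append(child)
        elif token == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}' in query")
            stack.pop()
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unbalanced '{' in query")
    return root[0]
```

Unlike the fyzz output in the question, this keeps the two example queries apart: the first yields ['?x', 'a', '?y', '.', ['?x', 'a', '?z']] and the second ['?x', 'a', '?y', 'OPTIONAL', ['?x', 'a', '?z']].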