Parsing SPARQL queries

Posted 2020-02-29 01:03

I need to test a couple of million SPARQL queries for a certain structural property, and for that I need the structure of the WHERE clause. I'm currently trying to use fyzz for this, but its documentation is not very helpful. Parsing the queries is easy; the problem is that I haven't been able to recover the structure of the clause. For example:

>>> from fyzz import parse
>>> a=parse("SELECT * WHERE {?x a ?y . {?x a ?z}}")
>>> b=parse("SELECT * WHERE {?x a ?y OPTIONAL {?x a ?z}}")
>>> a.where==b.where
True
>>> a.where
[(SparqlVar('x'), ('', 'a'), SparqlVar('y')), (SparqlVar('x'), ('', 'a'), SparqlVar('y'))]

Is there a way to recover the actual parse tree in fyzz instead of just the triples, or some other tool which would let me do this? RDFLib seems to have had a bison SPARQL parser in the past, but I can't find it in the rdflib or rdfextras.sparql packages.

Thanks

2 answers
可以哭但决不认输i
#2 · 2020-02-29 01:54

Another option is roqet, a command-line tool packaged with rasqal that prints the parsed query tree. For instance:

roqet -i laqrs -d structure -n -e "SELECT * WHERE {?x a ?y OPTIONAL {?x a ?z}}"

would output:

Query:
query verb: SELECT
query bound variables (3): x, y, z
query Group graph pattern[0] {
  sub-graph patterns (2) {
    Basic graph pattern[1] #0 {
      triples {
        triple #0 { triple(variable(x), uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, variable(y)) }
      }
    }
    Optional graph pattern[2] #1 {
      sub-graph patterns (1) {
        Basic graph pattern[3] #0 {
          triples {
            triple #0 { triple(variable(x), uri<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, variable(z)) }
          }
        }
      }
    }
  }
}
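If a dump in this format is enough for the structural test, it can be turned back into a nested tree with a few lines of Python. The following is only a minimal sketch: it assumes the output keeps the indented layout shown above (which may differ between rasqal versions), and the names `run_roqet`, `parse_structure` and `shape` are hypothetical helpers, not part of roqet or rasqal. The command line in `run_roqet` is the one from this answer.

```python
import re
import subprocess

# Matches pattern headers such as "Optional graph pattern[2] #1 {" or
# "query Group graph pattern[0] {" and captures the pattern kind.
HEADER = re.compile(r"^(?:query )?(?P<kind>[A-Za-z ]+? graph pattern)\[\d+\]")
# Captures the contents of "triple(...)" on a "triple #N { ... }" line.
TRIPLE = re.compile(r"triple\((.*)\)")


def run_roqet(query):
    """Run roqet with the options from the answer and return its output.

    Assumes the roqet executable is installed and on PATH.
    """
    cmd = ["roqet", "-i", "laqrs", "-d", "structure", "-n", "-e", query]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_structure(dump):
    """Turn a structure dump into nested dicts, using indentation for nesting."""
    root = {"kind": "root", "triples": [], "children": []}
    stack = [(-1, root)]  # (indentation, node) for each pattern still open
    for line in dump.splitlines():
        text = line.strip()
        indent = len(line) - len(line.lstrip())
        header = HEADER.match(text)
        is_triple = text.startswith("triple #")
        if not (header or is_triple):
            continue
        # Close every pattern that is not an ancestor of this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if header:
            node = {"kind": header.group("kind"), "triples": [], "children": []}
            parent["children"].append(node)
            stack.append((indent, node))
        else:
            match = TRIPLE.search(text)
            if match:
                parent["triples"].append(match.group(1))
    return root["children"][0] if root["children"] else None


def shape(node):
    """Reduce a parsed tree to (kind, triple count, child shapes) for comparison."""
    return (node["kind"], len(node["triples"]),
            tuple(shape(child) for child in node["children"]))
```

With this, `shape(parse_structure(run_roqet(query)))` gives a hashable value per query, so queries with and without OPTIONAL no longer compare equal the way the fyzz triple lists do.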

Looking at your comment on the other answer, I don't think this is what you need, and I don't think you will find an answer by looking inside SPARQL parsers. The evaluation of objects (or triple patterns) in a query happens inside the query engine, which in well-designed systems is isolated from query parsing.

For instance, in 4store you could run the 4s-query command with the -vvv (very verbose) option; its output shows how the query was executed and how substitutions were performed for each triple pattern evaluation.

叛逆
#3 · 2020-02-29 01:54

ANTLR has a SPARQL grammar here: http://www.antlr.org/grammar/1200929755392/index.html

ANTLR can generate a parser in Python from that grammar.
