SPARQL equivalent of a SQL select query?

2019-09-26 10:13发布

问题:

If I have a SQL table that can be queried like this:

Select empname from mytable
Where empid=1;

What would an equivalent SPARQL query be?

回答1:

As a few people have mentioned here and on your similar, previous question, you probably need to do some more reading about the nature of RDF-modeled data.

For Stack Overflow, you definitely need to get into the habit of showing your input data and whatever code you have written so far, even if it isn't working perfectly. In that case, you need to include the error message in your question.

Here's a self contained example in which I

  • guessed what your table looks like
  • modeled it as RDF
  • wrote a SPARQL query that does what you asked

The example is written in R, my strongest language. To be clear: you don't need to know anything about R to be good at RDF, SPARQL and other semantic technologies. If I were better at Java or Python, I would have written the example in one of those languages.

The relational (SQL) data must be converted to RDF before it can be queried with RDF. In the code below, I have done that by hand. See the end of the message for the nuts and bolts.

library(rrdf)
library(sqldf)

mytable.table.string <- 'empid,empname
1,Arthur
2,Kevin
3,Joey'

mytable <- read.csv(textConnection(mytable.table.string))

print(mytable)

>  empid empname
> 1     1  Arthur
> 2     2  Kevin
> 3     3  Joey

sqldf('select empname from mytable where empid = 1')

>   empname
> 1  Arthur

# R2RML conversion goes here
# I did it by hand

table.rdf.string <- 'prefix mytable: <http://mytable/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
mytable:1 a mytable:employee .
mytable:1 mytable:empid "1" .
mytable:1 rdfs:label "Arthur" .
mytable:2 a mytable:employee .
mytable:2 mytable:empid "2" .
mytable:2 rdfs:label "Kevin" .
mytable:3 a mytable:employee .
mytable:3 mytable:empid "3" .
mytable:3 rdfs:label "Joey" . '

mytable.rdf <- fromString.rdf(table.rdf.string, format = "TURTLE")

query.as.sparql <- 'prefix mytable: <http://mytable/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?empname
where
{?emp a mytable:employee .
?emp mytable:empid "1" .
?emp rdfs:label ?empname . }'

sparql.rdf(model = mytable.rdf, sparql = query.as.sparql)

>     empname 
> [1,] "Arthur"

To review: this R code above demonstrates a SAPRQL query that is equivalent to the OP's SQL query. SQL queries are typically run against SQL databases, but I made this self-contained by running it against a dataframe.

SPARQL queries are run against RDF data, frequently in the form of a triplestore database. In my example, a string containing RDF triples is converted into an in-memory RDF model, and the SPARQL query is sent to that model.

So the question becomes, if you have data in a SQL database, how do you query the database's contents with SPARQL? As of Summer, 2017, this is an evolving topic. There are a variety of commercial and open-source tools that can help solve this problem, usually in one of two ways:

  1. by creating an interface that appears to send the SPARQL query directly to the SQL database, in real time
  2. by transforming and dumping the SQL database contents as a static RDF file, which can then be loaded into a naively SPARQL-compliant triplestore

In either case, somebody will need to create a mapping from the SQL data into RDF format. This is required because SQL tables don't intrinsically have explicit semantics. Rows typically model individuals of some category, and columns typically represent properties or relationships of the individuals. RDF triples must have explicit subject-verb-object semantics.

Here are my recommended first steps for SQL-RDF mapping with Karma, which has a nice graphical interface. Similar results could be achieved with another R2RML mapper like D2RQ (works, but un-maintained for 3 years), ontop, or the R2RML parser.

  1. Watch the Karma videos and read the user's guide, especially steps 1 though 5 plus 7.
  2. Install Karma according to the posted directions.
  3. Launch Karma. A web interface to Karma should appear in your web browser
  4. Paste the ontology below into a text editor and save as 44403425.owl Open the attached ontology in Karma (import -> from file)
  5. Likewise, save & import mytable.csv, which is intended to represent a dump from a SQL database. If you want to work with a live database, you will be responsible for making sure the right database drivers are installed. In that case, use import -> Database Table instead.
  6. Again, save the model as mytable-model.ttl. Click on the pull-down on the left of the gray bar near the top of the Karma web page. Choose Apply R2RML model -> from file and load mytable_model.ttl
  7. Click on the pull-down again and select Publish -> RDF. Accept the suggested settings.
  8. Look for OpenRDF at the right of the top black bar. Right click that link to open it in a new browser window/tab.
  9. On the OpenRDF Workbench page, select the karma_data repository. You can now use any of the Explore links on the left, including the (SPARQL) Query link
  10. The OpenRDF Workbench page will become inaccessible when you close Karma. You can export your new triples from the OpenRDF Workbench, or you can do it from the gray bar in Karma. You can also export the modelling file, if you have made any changes to it. At that point, the RDF data triples can be loaded into any other triplestore, or explored with command line tools like ARQ from Jena.

Finally, if you want to use an R2RML mapper other than Karma, you can use a mapping file similar to (but not identical to) mytable-model.ttl. Karma does some data manipulations with snippets of Python, which get embedded in JSON. That JSON becomes the huge object of a km-dev:hasWorksheetHistory triple. I don't believe that all R2RML parsers would recognize the Python/JSON.

44403425.owl

<?xml version="1.0"?>
<rdf:RDF xmlns="https://stackoverflow.com/questions/44403425.owl/"
     xml:base="https://stackoverflow.com/questions/44403425.owl/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:mytable="https://stackoverflow.com/questions/44403425.owl/"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <owl:Ontology rdf:about="https://stackoverflow.com/questions/44403425.owl"/>

    <owl:DatatypeProperty rdf:about="https://stackoverflow.com/questions/44403425.owl/empid">
        <rdfs:domain rdf:resource="https://stackoverflow.com/questions/44403425.owl/employee"/>
        <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
        <rdfs:label>employee identifier</rdfs:label>
    </owl:DatatypeProperty>

    <owl:Class rdf:about="https://stackoverflow.com/questions/44403425.owl/employee">
        <owl:equivalentClass>
            <owl:Restriction>
                <owl:onProperty rdf:resource="https://stackoverflow.com/questions/44403425.owl/empid"/>
                <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:cardinality>
            </owl:Restriction>
        </owl:equivalentClass>
        <rdfs:label>employee class</rdfs:label>
    </owl:Class>
</rdf:RDF>

mytable.csv

"empid","empname"
1,"Arthur"
2,"Kevin"
3,"Joey"

mytable-model.ttl

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pato: <http://purl.obolibrary.org/obo/pato#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix oboInOwl: <http://www.geneontology.org/formats/oboInOwl#> .
@prefix protege: <http://protege.stanford.edu/plugins/owl/protege#> .
@prefix turbo: <http://turbo.org/> .
@prefix subsets: <http://purl.obolibrary.org/obo/ro/subsets#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix km-dev: <http://isi.edu/integration/karma/dev#> .

_:node1bi2073ftx1 a km-dev:R2RMLMapping ;
    km-dev:sourceName "mytable.csv" ;
    km-dev:modelPublicationTime "1496863444477"^^xsd:long ;
    km-dev:modelVersion "1.7" ;
    km-dev:hasInputColumns "[[{\"columnName\":\"empname\"}],[{\"columnName\":\"empid\"}]]" ;
    km-dev:hasOutputColumns "[[{\"columnName\":\"empname\"}],[{\"columnName\":\"empid\"}],[{\"columnName\":\"empid_val\"}]]" ;
    km-dev:hasModelLabel "mytable" ;
    km-dev:hasBaseURI "https://stackoverflow.com/questions/44403425.owl/" ;
    km-dev:hasWorksheetHistory """[
    {
        \"commandName\": \"SubmitPythonTransformationCommand\",
        \"model\": \"new\",
        \"inputParameters\": [
            {
                \"name\": \"hNodeId\",
                \"type\": \"hNodeId\",
                \"value\": [{\"columnName\": \"empid\"}]
            },
            {
                \"name\": \"worksheetId\",
                \"type\": \"worksheetId\",
                \"value\": \"W\"
            },
            {
                \"name\": \"selectionName\",
                \"type\": \"other\",
                \"value\": \"DEFAULT_TEST\"
            },
            {
                \"name\": \"newColumnName\",
                \"type\": \"other\",
                \"value\": \"empid_val\"
            },
            {
                \"name\": \"transformationCode\",
                \"type\": \"other\",
                \"value\": \"return getValue(\\\"empid\\\")\"
            },
            {
                \"name\": \"errorDefaultValue\",
                \"type\": \"other\",
                \"value\": \"\"
            },
            {
                \"name\": \"isJSONOutput\",
                \"type\": \"other\",
                \"value\": \"false\"
            },
            {
                \"name\": \"inputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empid\\\"}]}]\"
            },
            {
                \"name\": \"outputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empid_val\\\"}]}]\"
            }
        ],
        \"tags\": [\"Transformation\"]
    },
    {
        \"commandName\": \"SetWorksheetPropertiesCommand\",
        \"model\": \"new\",
        \"inputParameters\": [
            {
                \"name\": \"worksheetId\",
                \"type\": \"worksheetId\",
                \"value\": \"W\"
            },
            {
                \"name\": \"selectionName\",
                \"type\": \"other\",
                \"value\": \"DEFAULT_TEST\"
            },
            {
                \"name\": \"properties\",
                \"type\": \"other\",
                \"value\": {
                    \"graphLabel\": \"\",
                    \"hasBaseURI\": false,
                    \"prefix\": \"mytable\",
                    \"hasPrefix\": true,
                    \"hasServiceProperties\": false
                }
            },
            {
                \"name\": \"inputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[]\"
            },
            {
                \"name\": \"outputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[]\"
            }
        ],
        \"tags\": [\"Transformation\"]
    },
    {
        \"commandName\": \"SetWorksheetPropertiesCommand\",
        \"model\": \"new\",
        \"inputParameters\": [
            {
                \"name\": \"worksheetId\",
                \"type\": \"worksheetId\",
                \"value\": \"W\"
            },
            {
                \"name\": \"selectionName\",
                \"type\": \"other\",
                \"value\": \"DEFAULT_TEST\"
            },
            {
                \"name\": \"properties\",
                \"type\": \"other\",
                \"value\": {
                    \"graphLabel\": \"\",
                    \"hasBaseURI\": true,
                    \"baseURI\": \"https://stackoverflow.com/questions/44403425.owl/\",
                    \"hasPrefix\": false,
                    \"hasServiceProperties\": false
                }
            },
            {
                \"name\": \"inputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[]\"
            },
            {
                \"name\": \"outputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[]\"
            }
        ],
        \"tags\": [\"Transformation\"]
    },
    {
        \"commandName\": \"SetWorksheetPropertiesCommand\",
        \"model\": \"new\",
        \"inputParameters\": [
            {
                \"name\": \"worksheetId\",
                \"type\": \"worksheetId\",
                \"value\": \"W\"
            },
            {
                \"name\": \"selectionName\",
                \"type\": \"other\",
                \"value\": \"DEFAULT_TEST\"
            },
            {
                \"name\": \"properties\",
                \"type\": \"other\",
                \"value\": {
                    \"graphLabel\": \"mytable\",
                    \"hasBaseURI\": false,
                    \"hasPrefix\": false,
                    \"hasServiceProperties\": false
                }
            },
            {
                \"name\": \"inputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[]\"
            },
            {
                \"name\": \"outputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[]\"
            }
        ],
        \"tags\": [\"Transformation\"]
    },
    {
        \"commandName\": \"SetMetaPropertyCommand\",
        \"model\": \"new\",
        \"inputParameters\": [
            {
                \"name\": \"hNodeId\",
                \"type\": \"hNodeId\",
                \"value\": [{\"columnName\": \"empid\"}]
            },
            {
                \"name\": \"worksheetId\",
                \"type\": \"worksheetId\",
                \"value\": \"W\"
            },
            {
                \"name\": \"selectionName\",
                \"type\": \"other\",
                \"value\": \"DEFAULT_TEST\"
            },
            {
                \"name\": \"metaPropertyName\",
                \"type\": \"other\",
                \"value\": \"isUriOfClass\"
            },
            {
                \"name\": \"metaPropertyUri\",
                \"type\": \"other\",
                \"value\": \"https://stackoverflow.com/questions/44403425.owl/employee\"
            },
            {
                \"name\": \"metaPropertyId\",
                \"type\": \"other\",
                \"value\": \"https://stackoverflow.com/questions/44403425.owl/employee1\"
            },
            {
                \"name\": \"SemanticTypesArray\",
                \"type\": \"other\",
                \"value\": [{
                    \"DomainUri\": \"https://stackoverflow.com/questions/44403425.owl/employee\",
                    \"DomainId\": \"https://stackoverflow.com/questions/44403425.owl/employee1\",
                    \"isPrimary\": true,
                    \"isProvenance\": false,
                    \"FullType\": \"http://isi.edu/integration/karma/dev#classLink\",
                    \"DomainLabel\": \"https://stackoverflow.com/questions/44403425.owl/employee/employee1 (add)\"
                }]
            },
            {
                \"name\": \"trainAndShowUpdates\",
                \"type\": \"other\",
                \"value\": true
            },
            {
                \"name\": \"rdfLiteralType\",
                \"type\": \"other\",
                \"value\": \"\"
            },
            {
                \"name\": \"language\",
                \"type\": \"other\",
                \"value\": \"\"
            },
            {
                \"name\": \"inputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empid\\\"}]}]\"
            },
            {
                \"name\": \"outputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empid\\\"}]}]\"
            }
        ],
        \"tags\": [\"SemanticType\"]
    },
    {
        \"commandName\": \"SetSemanticTypeCommand\",
        \"model\": \"new\",
        \"inputParameters\": [
            {
                \"name\": \"hNodeId\",
                \"type\": \"hNodeId\",
                \"value\": [{\"columnName\": \"empid_val\"}]
            },
            {
                \"name\": \"worksheetId\",
                \"type\": \"worksheetId\",
                \"value\": \"W\"
            },
            {
                \"name\": \"selectionName\",
                \"type\": \"other\",
                \"value\": \"DEFAULT_TEST\"
            },
            {
                \"name\": \"SemanticTypesArray\",
                \"type\": \"other\",
                \"value\": [{
                    \"DomainUri\": \"https://stackoverflow.com/questions/44403425.owl/employee\",
                    \"DomainId\": \"https://stackoverflow.com/questions/44403425.owl/employee1\",
                    \"isPrimary\": true,
                    \"isProvenance\": false,
                    \"FullType\": \"https://stackoverflow.com/questions/44403425.owl/empid\",
                    \"DomainLabel\": \"https://stackoverflow.com/questions/44403425.owl/employee/employee1\"
                }]
            },
            {
                \"name\": \"trainAndShowUpdates\",
                \"type\": \"other\",
                \"value\": true
            },
            {
                \"name\": \"rdfLiteralType\",
                \"type\": \"other\",
                \"value\": \"\"
            },
            {
                \"name\": \"language\",
                \"type\": \"other\",
                \"value\": \"\"
            },
            {
                \"name\": \"inputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empid_val\\\"}]}]\"
            },
            {
                \"name\": \"outputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empid_val\\\"}]}]\"
            }
        ],
        \"tags\": [\"SemanticType\"]
    },
    {
        \"commandName\": \"SetSemanticTypeCommand\",
        \"model\": \"new\",
        \"inputParameters\": [
            {
                \"name\": \"hNodeId\",
                \"type\": \"hNodeId\",
                \"value\": [{\"columnName\": \"empname\"}]
            },
            {
                \"name\": \"worksheetId\",
                \"type\": \"worksheetId\",
                \"value\": \"W\"
            },
            {
                \"name\": \"selectionName\",
                \"type\": \"other\",
                \"value\": \"DEFAULT_TEST\"
            },
            {
                \"name\": \"SemanticTypesArray\",
                \"type\": \"other\",
                \"value\": [{
                    \"DomainUri\": \"https://stackoverflow.com/questions/44403425.owl/employee\",
                    \"DomainId\": \"https://stackoverflow.com/questions/44403425.owl/employee1\",
                    \"isPrimary\": true,
                    \"isProvenance\": false,
                    \"FullType\": \"http://www.w3.org/2000/01/rdf-schema#label\",
                    \"DomainLabel\": \"https://stackoverflow.com/questions/44403425.owl/employee/employee1\"
                }]
            },
            {
                \"name\": \"trainAndShowUpdates\",
                \"type\": \"other\",
                \"value\": true
            },
            {
                \"name\": \"rdfLiteralType\",
                \"type\": \"other\",
                \"value\": \"\"
            },
            {
                \"name\": \"language\",
                \"type\": \"other\",
                \"value\": \"\"
            },
            {
                \"name\": \"inputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empname\\\"}]}]\"
            },
            {
                \"name\": \"outputColumns\",
                \"type\": \"hNodeIdList\",
                \"value\": \"[{\\\"value\\\":[{\\\"columnName\\\":\\\"empname\\\"}]}]\"
            }
        ],
        \"tags\": [\"SemanticType\"]
    }
]""" .

km-dev:TriplesMap_d397eb02-1069-4e93-a544-0f2c4d1f8970 a rr:TriplesMap .

_:node1bi2073ftx1 km-dev:hasTriplesMap km-dev:TriplesMap_d397eb02-1069-4e93-a544-0f2c4d1f8970 .

km-dev:TriplesMap_d397eb02-1069-4e93-a544-0f2c4d1f8970 km-dev:isPartOfMapping _:node1bi2073ftx1 .

_:node1bi2073ftx2 rr:tableName "mytable.csv" ;
    a rr:LogicalTable ;
    km-dev:isPartOfMapping _:node1bi2073ftx1 .

_:node1bi2073ftx1 km-dev:hasLogicalTable _:node1bi2073ftx2 .

km-dev:TriplesMap_d397eb02-1069-4e93-a544-0f2c4d1f8970 rr:logicalTable _:node1bi2073ftx2 ;
    rr:subjectMap _:node1bi2073ftx3 .

_:node1bi2073ftx1 km-dev:hasSubjectMap _:node1bi2073ftx3 .

_:node1bi2073ftx3 km-dev:isPartOfMapping _:node1bi2073ftx1 ;
    a rr:SubjectMap ;
    km-dev:alignmentNodeId "https://stackoverflow.com/questions/44403425.owl/employee1" ;
    rr:class <https://stackoverflow.com/questions/44403425.owl/employee> ;
    rr:template "{empid}" ;
    a km-dev:steinerTreeRootNode .

km-dev:PredicateObjectMap_5a63b356-1bbb-4b1f-b64a-514c19f83d28 rr:predicate <https://stackoverflow.com/questions/44403425.owl/empid> .

_:node1bi2073ftx4 rr:column "empid" ;
    a rr:ObjectMap ;
    km-dev:isPartOfMapping _:node1bi2073ftx1 .

_:node1bi2073ftx1 km-dev:hasObjectMap _:node1bi2073ftx4 .

km-dev:PredicateObjectMap_5a63b356-1bbb-4b1f-b64a-514c19f83d28 rr:objectMap _:node1bi2073ftx4 .

km-dev:TriplesMap_d397eb02-1069-4e93-a544-0f2c4d1f8970 rr:predicateObjectMap km-dev:PredicateObjectMap_5a63b356-1bbb-4b1f-b64a-514c19f83d28 .

km-dev:PredicateObjectMap_5a63b356-1bbb-4b1f-b64a-514c19f83d28 a rr:PredicateObjectMap ;
    km-dev:isPartOfMapping _:node1bi2073ftx1 .

_:node1bi2073ftx1 km-dev:hasPredicateObjectMap km-dev:PredicateObjectMap_5a63b356-1bbb-4b1f-b64a-514c19f83d28 .

km-dev:PredicateObjectMap_028e84af-9ed2-406c-81d0-9add2ed2ca2b rr:predicate rdfs:label .

_:node1bi2073ftx5 rr:column "empname" ;
    a rr:ObjectMap ;
    km-dev:isPartOfMapping _:node1bi2073ftx1 .

_:node1bi2073ftx1 km-dev:hasObjectMap _:node1bi2073ftx5 .

km-dev:PredicateObjectMap_028e84af-9ed2-406c-81d0-9add2ed2ca2b rr:objectMap _:node1bi2073ftx5 .

km-dev:TriplesMap_d397eb02-1069-4e93-a544-0f2c4d1f8970 rr:predicateObjectMap km-dev:PredicateObjectMap_028e84af-9ed2-406c-81d0-9add2ed2ca2b .

km-dev:PredicateObjectMap_028e84af-9ed2-406c-81d0-9add2ed2ca2b a rr:PredicateObjectMap ;
    km-dev:isPartOfMapping _:node1bi2073ftx1 .

_:node1bi2073ftx1 km-dev:hasPredicateObjectMap km-dev:PredicateObjectMap_028e84af-9ed2-406c-81d0-9add2ed2ca2b .