Question:
Imagine you are querying a data source via a SPARQL endpoint and you want to know whether the underlying representation of this data source is OWL or RDF/XML. Is there any way to do that via a SPARQL query? My own line of thought was to write a query that uses one of the OWL properties and see whether it returns any results. The disadvantage of this approach is that if you pick an OWL property that doesn't appear in the data source, you get no results even if the underlying representation is OWL. The assumption here is that you don't have access to the schema.
Answer 1:
I think an interesting approach would be to write a query that retrieves all the triples involving any of the reserved URIs of the schema languages you're concerned with. With luck, that returns the schema or ontology itself. For example, §2.4 IRIs from the OWL specification gives a list of reserved IRIs for OWL. A query like this should give you most of the information that you need:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix owl: <http://www.w3.org/2002/07/owl#>

construct { ?s ?p ?o }
where {
  values ?res {
    owl:backwardCompatibleWith owl:bottomDataProperty owl:bottomObjectProperty owl:deprecated owl:incompatibleWith
    owl:Nothing owl:priorVersion owl:rational owl:real owl:versionInfo
    owl:Thing owl:topDataProperty owl:topObjectProperty rdf:langRange rdf:PlainLiteral
    rdf:XMLLiteral rdfs:comment rdfs:isDefinedBy rdfs:label rdfs:Literal
    rdfs:seeAlso xsd:anyURI xsd:base64Binary xsd:boolean xsd:byte
    xsd:dateTime xsd:dateTimeStamp xsd:decimal xsd:double xsd:float
    xsd:hexBinary xsd:int xsd:integer xsd:language xsd:length
    xsd:long xsd:maxExclusive xsd:maxInclusive xsd:maxLength xsd:minExclusive
    xsd:minInclusive xsd:minLength xsd:Name xsd:NCName xsd:negativeInteger
    xsd:NMTOKEN xsd:nonNegativeInteger xsd:nonPositiveInteger xsd:normalizedString xsd:pattern
    xsd:positiveInteger xsd:short xsd:string xsd:token xsd:unsignedByte
    xsd:unsignedInt xsd:unsignedLong xsd:unsignedShort
  }
  { ?res ?p ?o . bind( ?res as ?s ) } union
  { ?s ?res ?o . bind( ?res as ?p ) } union
  { ?s ?p ?res . bind( ?res as ?o ) }
}
Similarly, you might use a query like the following to extract RDFS schemata. The list of reserved terms here is based on §6. RDF Schema summary (Informative) from the RDFS recommendation. I removed rdf:type, though, because nearly every dataset contains lots of rdf:type triples, which would swamp the results.
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix owl: <http://www.w3.org/2002/07/owl#>

construct { ?s ?p ?o }
where {
  values ?res {
    rdfs:Resource rdfs:Literal rdf:XMLLiteral rdfs:Class rdf:Property
    rdfs:Datatype rdf:Statement rdf:Bag rdf:Seq rdf:Alt rdfs:Container
    rdfs:ContainerMembershipProperty rdf:List rdfs:subClassOf
    rdfs:subPropertyOf rdfs:domain rdfs:range rdfs:label rdfs:comment
    rdfs:member rdf:first rdf:rest rdfs:seeAlso rdfs:isDefinedBy
    rdf:value rdf:subject rdf:predicate rdf:object
  }
  { ?res ?p ?o . bind( ?res as ?s ) } union
  { ?s ?res ?o . bind( ?res as ?p ) } union
  { ?s ?p ?res . bind( ?res as ?o ) }
}
If you run both of these queries against a dataset you can probably make an educated guess about how the data is structured. Note that many of the RDFS properties are also used by OWL, so a rough heuristic might be:
- If the OWL query returns significantly more information than the RDFS query, then the data probably uses an OWL ontology; otherwise, it probably uses an RDFS schema (or no schema at all).
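
If you want to automate that comparison, something like the following Python sketch might help. It is only an illustration: the function names `build_count_query` and `guess_schema_language` are hypothetical, the `ratio` threshold is my own stand-in for "significantly more", and the generated query counts matches with `select (count(*) ...)` instead of constructing the triples. Actually sending the query to an endpoint is left out; you would need whatever SPARQL client you normally use for that.

```python
# Prefixes shared by both queries in this answer.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def build_count_query(reserved_terms):
    """Build a SPARQL query that counts matches for the given reserved terms.

    A triple that mentions reserved terms in more than one position is
    counted once per position, so the result is a rough size, not an
    exact triple count.
    """
    prefix_lines = "\n".join(
        f"prefix {name}: <{iri}>" for name, iri in PREFIXES.items()
    )
    values = " ".join(reserved_terms)
    return (
        f"{prefix_lines}\n"
        "select (count(*) as ?n)\n"
        "where {\n"
        f"  values ?res {{ {values} }}\n"
        "  { ?res ?p ?o } union\n"
        "  { ?s ?res ?o } union\n"
        "  { ?s ?p ?res }\n"
        "}"
    )


def guess_schema_language(owl_count, rdfs_count, ratio=2.0):
    """Apply the rough heuristic to the two counts.

    `ratio` is an assumed threshold for "significantly more"; tune it
    for your data.
    """
    if owl_count > ratio * rdfs_count:
        return "probably OWL"
    return "probably RDFS (or no schema at all)"
```

You would call `build_count_query` once with the OWL term list and once with the RDFS term list, run both against the endpoint, and pass the two counts to `guess_schema_language`. Treat the answer as an educated guess, just like the manual comparison.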