I have a couple of lines of (I think) RDF data:
<http://www.test.com/meta#0001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class>
<http://www.test.com/meta#0002> <http://www.test.com/meta#CONCEPT_hasType> "BEAR"^^<http://www.w3.org/2001/XMLSchema#string>
Each line has 3 items in it. I want to pull out the item before and after the URL. So that would result in:
0001, type, Class
0002, CONCEPT_hasType, (BEAR, string)
Is there a library out there (Java or Scala) that would do this split for me? Or do I just need to shove string.splits and assumptions into my code?
Most RDF libraries will have something to facilitate this. For example, if you parse your RDF data using Eclipse RDF4J's Rio parser, you will get back each line as an org.eclipse.rdf4j.model.Statement, with a subject, predicate and object value. The subject in both your lines will be an org.eclipse.rdf4j.model.IRI, which has a getLocalName() method you can use to get the part after the last #. See the Javadocs for more details.

Assuming your data is in N-Triples syntax (which it seems to be, given the example you showed us), a few lines of code around the parser are enough to do this and print the result to STDOUT.
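A minimal sketch of what that could look like, assuming RDF4J's Rio module is on the classpath. The class name LocalNamePrinter and the helpers shortForm and shortTriples are made up for this illustration, the base URI is only a placeholder, and the input is assumed to terminate every triple with " ." as N-Triples requires:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.eclipse.rdf4j.model.IRI;
import org.eclipse.rdf4j.model.Literal;
import org.eclipse.rdf4j.model.Statement;
import org.eclipse.rdf4j.model.Value;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.RDFParser;
import org.eclipse.rdf4j.rio.Rio;
import org.eclipse.rdf4j.rio.helpers.AbstractRDFHandler;

public class LocalNamePrinter {

    // Hypothetical helper: an IRI becomes its local name, a literal becomes
    // "(label, datatype local name)", anything else keeps its string value.
    static String shortForm(Value value) {
        if (value instanceof IRI) {
            return ((IRI) value).getLocalName();
        }
        if (value instanceof Literal) {
            Literal literal = (Literal) value;
            return "(" + literal.getLabel() + ", "
                    + literal.getDatatype().getLocalName() + ")";
        }
        return value.stringValue();
    }

    // Parses N-Triples from the stream and returns one shortened line per statement.
    static List<String> shortTriples(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        RDFParser parser = Rio.createParser(RDFFormat.NTRIPLES);
        parser.setRDFHandler(new AbstractRDFHandler() {
            @Override
            public void handleStatement(Statement st) {
                lines.add(shortForm(st.getSubject()) + ", "
                        + shortForm(st.getPredicate()) + ", "
                        + shortForm(st.getObject()));
            }
        });
        // Placeholder base URI; N-Triples only contains absolute IRIs.
        parser.parse(in, "http://www.test.com/meta");
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the N-Triples file.
        try (InputStream in = new FileInputStream(args[0])) {
            shortTriples(in).forEach(System.out::println);
        }
    }
}
```

The handler is called once per parsed statement, so the file never has to be held in memory as a whole. Blank nodes, which have no local name, fall through to stringValue() in this sketch.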
Update: if you don't want to read data from a file but simply use a String, you could just use a java.io.StringReader instead of an InputStream.

Alternatively, if you don't want to parse the data at all and just want to do String processing, there is an org.eclipse.rdf4j.model.util.URIUtil class which you can just feed a string, and it will give you back the index of the local name part.
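To make the string-processing route concrete, here is a dependency-free sketch of the same idea. It is a hand-written approximation, not RDF4J's implementation: it assumes the local name starts right after the last '#', or failing that the last '/', or failing that the last ':'. The class and method names (LocalNames, localNameIndex, localName) are invented for this example; with RDF4J on the classpath you would call URIUtil.getLocalNameIndex(String) instead.

```java
public class LocalNames {

    // Index of the first character of the local name: just past the last '#',
    // else the last '/', else the last ':'. Returns -1 if no separator is found.
    static int localNameIndex(String iri) {
        int separator = iri.lastIndexOf('#');
        if (separator < 0) {
            separator = iri.lastIndexOf('/');
        }
        if (separator < 0) {
            separator = iri.lastIndexOf(':');
        }
        return separator < 0 ? -1 : separator + 1;
    }

    // Strips optional N-Triples angle brackets and returns the local name,
    // or the whole string if no separator is present.
    static String localName(String term) {
        String iri = term.trim();
        if (iri.startsWith("<") && iri.endsWith(">")) {
            iri = iri.substring(1, iri.length() - 1);
        }
        int index = localNameIndex(iri);
        return index < 0 ? iri : iri.substring(index);
    }

    public static void main(String[] args) {
        System.out.println(localName("<http://www.test.com/meta#0001>"));
        System.out.println(localName("<http://www.test.com/meta#CONCEPT_hasType>"));
        System.out.println(localName("<http://www.w3.org/2001/XMLSchema#string>"));
    }
}
```

This only shortens a single IRI. Splitting a line into subject, predicate and object, and separating a literal's label from its datatype, would still be up to you, which is the main argument for letting a parser do the work.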
(disclosure: I am on the RDF4J development team)