I have a question about my project.
I do not know whether I need to work in NetBeans or not.
My work is about a recommendation system for library books. As input I need a book classification ontology, and my ontology classifies library books. The classification has 14 categories, besides the sibling classes Author, Book, and Isbn. The individuals in the Book class are the books' subjects (about 600 subjects), the individuals in the Author class are the authors' names, and the Isbn class is populated the same way.
I have also manually collected, for part of the books, which categories they belong to. An object property named "hasSubject" relates individuals of the Book class to categories. For example, book "A" hasSubject categories "S" and "F" and so on.
As the final result, I want to apply this formula:
sim(x, y) = C1,1 / (C1,0 + C0,1 + C1,1)
where C1,1 is the number of categories that both book "X" and book "Y" belong to,
C1,0 is the number of categories that book "X" belongs to but book "Y" does not,
and C0,1 is the number of categories that book "Y" belongs to but book "X" does not.
This gives the similarity between two books ("A" and "B"). Then I apply the formula again to book "A" and book "C", and so on, until the similarity has been obtained for all pairs of books.
In your opinion, should this work be done in NetBeans or with SPARQL in Protégé?
I was thinking that I could perhaps define a hasSibling property that represents, for every book, the group of books that share categories with it. What do you think?
You can compute this kind of metric using SPARQL, though it's a bit ugly. Let's assume some data like this:
prefix dcterms: <http://purl.org/dc/terms/>
prefix : <http://example.org/books/>
:book1 a :Book ; dcterms:subject :subject1 , :subject2, :subject3 .
:book2 a :Book ; dcterms:subject :subject2 , :subject3, :subject4 .
:book3 a :Book ; dcterms:subject :subject4 , :subject5 .
There are three books. Books 1 and 2 have two subjects in common, and each has one that the other does not have. Books 2 and 3 have one subject in common, but Book 2 has two that Book 3 does not have, while Book 3 has only one that Book 2 does not have. Books 1 and 3 have no subjects in common.
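As a quick cross-check of those counts outside SPARQL, here is a minimal Python sketch that treats each book's subjects as a set. The `subjects` dictionary just mirrors the sample data above, and the function name `counts` is made up for illustration; nothing here is part of the SPARQL solution itself.

```python
# Sample data mirrored from the snippet above (plain strings instead of IRIs).
subjects = {
    "book1": {"subject1", "subject2", "subject3"},
    "book2": {"subject2", "subject3", "subject4"},
    "book3": {"subject4", "subject5"},
}


def counts(x, y):
    """Return (C11, C10, C01) for books x and y (hypothetical helper)."""
    sx, sy = subjects[x], subjects[y]
    c11 = len(sx & sy)  # subjects that both books have
    c10 = len(sx - sy)  # subjects that only x has
    c01 = len(sy - sx)  # subjects that only y has
    return c11, c10, c01


for pair in [("book1", "book2"), ("book2", "book3"), ("book1", "book3")]:
    print(pair, counts(*pair))
```

For the three pairs this gives (2, 1, 1), (1, 2, 1), and (0, 3, 2), which are the numbers the queries below need to reproduce.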
The trick here is to use some nested subqueries, and to grab the different values (C10, C01, and C11) at different levels in the nesting. The innermost query is
select ?book1 ?book2 (count(?left) as ?c10) where {
  :Book ^a ?book1, ?book2 .
  FILTER( !sameTerm(?book1,?book2) )
  OPTIONAL {
    ?book1 dcterms:subject ?left .
    FILTER NOT EXISTS { ?book2 dcterms:subject ?left }
  }
}
group by ?book1 ?book2
which grabs each pair of distinct books and computes the number of subjects that the left book has that the right doesn't. By wrapping this in another query, we can then grab the number of subjects that the right book has that the left doesn't. This makes the query:
select ?book1 ?book2 (count(?right) as ?c01x) (sample(?c10) as ?c10x) where {
  {
    select ?book1 ?book2 (count(?left) as ?c10) where {
      :Book ^a ?book1, ?book2 .
      FILTER( !sameTerm(?book1,?book2) )
      OPTIONAL {
        ?book1 dcterms:subject ?left .
        FILTER NOT EXISTS { ?book2 dcterms:subject ?left }
      }
    }
    group by ?book1 ?book2
  }
  OPTIONAL {
    ?book2 dcterms:subject ?right .
    FILTER NOT EXISTS { ?book1 dcterms:subject ?right }
  }
}
group by ?book1 ?book2
Note that we still have to select ?book1 and ?book2, and sample(?c10) as ?c10x, in order to pass the values outward. (We have to use ?c10x because the name ?c10 has already been used at this scope.) Finally, we wrap this in one more query to get the common subjects and to do the computation, which gives us:
prefix dcterms: <http://purl.org/dc/terms/>
prefix : <http://example.org/books/>

select ?book1 ?book2
       (count(?both) as ?c11)
       (sample(?c10x) as ?c10)
       (sample(?c01x) as ?c01)
       (count(?both) / (count(?both) + sample(?c10x) + sample(?c01x)) as ?sim)
where {
  {
    select ?book1 ?book2 (count(?right) as ?c01x) (sample(?c10) as ?c10x) where {
      {
        select ?book1 ?book2 (count(?left) as ?c10) where {
          :Book ^a ?book1, ?book2 .
          FILTER( !sameTerm(?book1,?book2) )
          OPTIONAL {
            ?book1 dcterms:subject ?left .
            FILTER NOT EXISTS { ?book2 dcterms:subject ?left }
          }
        }
        group by ?book1 ?book2
      }
      OPTIONAL {
        ?book2 dcterms:subject ?right .
        FILTER NOT EXISTS { ?book1 dcterms:subject ?right }
      }
    }
    group by ?book1 ?book2
  }
  OPTIONAL {
    ?both ^dcterms:subject ?book1, ?book2 .
  }
}
group by ?book1 ?book2
order by ?book1 ?book2
This rather monstrous query, applied to our data, computes these results:
$ arq --data data.n3 --query similarity.sparql
--------------------------------------------
| book1  | book2  | c11 | c10 | c01 | sim  |
============================================
| :book1 | :book2 | 2   | 1   | 1   | 0.5  |
| :book1 | :book3 | 0   | 3   | 2   | 0.0  |
| :book2 | :book1 | 2   | 1   | 1   | 0.5  |
| :book2 | :book3 | 1   | 2   | 1   | 0.25 |
| :book3 | :book1 | 0   | 2   | 3   | 0.0  |
| :book3 | :book2 | 1   | 1   | 2   | 0.25 |
--------------------------------------------
If the FILTER( !sameTerm(?book1,?book2) ) line is removed, so that the similarity of each book to itself is also computed, we see the correct value (1.0):
$ arq --data data.n3 --query similarity.sparql
--------------------------------------------
| book1  | book2  | c11 | c10 | c01 | sim  |
============================================
| :book1 | :book1 | 3   | 0   | 0   | 1.0  |
| :book1 | :book2 | 2   | 1   | 1   | 0.5  |
| :book1 | :book3 | 0   | 3   | 2   | 0.0  |
| :book2 | :book1 | 2   | 1   | 1   | 0.5  |
| :book2 | :book2 | 3   | 0   | 0   | 1.0  |
| :book2 | :book3 | 1   | 2   | 1   | 0.25 |
| :book3 | :book1 | 0   | 2   | 3   | 0.0  |
| :book3 | :book2 | 1   | 1   | 2   | 0.25 |
| :book3 | :book3 | 2   | 0   | 0   | 1.0  |
--------------------------------------------
If you don't need to preserve the individual Cmn values, you might be able to simplify this: compute C10 in the innermost query and C01 in the middle query as before, but instead of projecting both up individually, project just their sum (C10+C01), so that in the outermost query, where you compute C11, you can just do (C11 / (C11 + (C10+C01))).
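To illustrate why only the sum matters, here is a small Python sketch (not SPARQL, and only a sanity check under the same sample data). It assumes each book's subjects fit in a set; the names `subjects` and `sim` are hypothetical. The sum C10+C01 is the size of the symmetric difference of the two subject sets, so the formula needs just two numbers. Returning `None` when neither book has any subject is my own assumption for the undefined 0/0 case.

```python
from itertools import product

# Sample data mirrored from the example at the top of this answer.
subjects = {
    "book1": {"subject1", "subject2", "subject3"},
    "book2": {"subject2", "subject3", "subject4"},
    "book3": {"subject4", "subject5"},
}


def sim(x, y):
    """Similarity C11 / (C11 + (C10 + C01)) for books x and y."""
    sx, sy = subjects[x], subjects[y]
    c11 = len(sx & sy)    # shared subjects
    diff = len(sx ^ sy)   # C10 + C01 as a single number
    total = c11 + diff
    # Assumption: report None when both books have no subjects at all.
    return c11 / total if total else None


for x, y in product(sorted(subjects), repeat=2):
    print(x, y, sim(x, y))
```

The printed values should agree with the sim column in the second table, including 1.0 for each book compared with itself.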