Return multiple relationship counts for one MATCH

Posted 2019-05-18 13:37

Question:

I want to do something like this:

MATCH (p:person)-[a:UPVOTED]->(t:topic),(p:person)-[b:DOWNVOTED]->(t:topic),(p:person)-[c:FLAGGED]->(t:topic) WHERE ID(t)=4 RETURN COUNT(a),COUNT(b),COUNT(c)

...but I get all 0 counts when I should get 2, 1, 1.

Answer 1:

A better solution is to use size(), which drastically improves the performance of the query:

MATCH (t:Topic)
WHERE id(t) = 4
RETURN size((t)<-[:DOWNVOTED]-(:Person)) as downvoted,
       size((t)<-[:UPVOTED]-(:Person)) as upvoted,
       size((t)<-[:FLAGGED]-(:Person)) as flagged

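If you are sure that the nodes at the other end of these relationships are always labelled Person, you can drop the label from the patterns and the query will be a bit faster still. Note that labels are case-sensitive, so write :topic and :person instead if that is how your nodes are actually labelled.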
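On newer Neo4j releases, size() applied directly to a pattern may be rejected (as far as I know it was deprecated in 4.x and removed in Neo4j 5). A minimal sketch of the same idea using a COUNT subquery, assuming Neo4j 5 or later and the same Topic/Person labels as above:

```cypher
// Sketch only: assumes Neo4j 5+, where COUNT { ... } subqueries are available.
// Each subquery counts the matching relationships for the single topic t.
MATCH (t:Topic)
WHERE id(t) = 4
RETURN COUNT { (t)<-[:DOWNVOTED]-(:Person) } AS downvoted,
       COUNT { (t)<-[:UPVOTED]-(:Person) } AS upvoted,
       COUNT { (t)<-[:FLAGGED]-(:Person) } AS flagged
```

As with the size() version, each count is computed independently, so a type with no relationships simply yields 0.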



Answer 2:

Let's start with refactoring the query a bit (hopefully the meaning of it isn't lost):

MATCH
  (t:topic),
  (p:person)-[upvote:UPVOTED]->(t),
  (p:person)-[downvote:DOWNVOTED]->(t),
  (p:person)-[flag:FLAGGED]->(t)
WHERE ID(t)=4
RETURN COUNT(upvote), COUNT(downvote), COUNT(flag)

Since t is your primary variable (you are filtering on it), I've matched it once with the label and then used just the variable in the rest of the patterns. Seeing the query cleaned up like this, it seems to me that you're trying to count all upvotes/downvotes/flags for a topic, but you don't care who did those things. Currently, since you're using the same variable p, Cypher is going to try to match the same person for all three lines. So you could use different variables:

 (p1:person)-[upvote:UPVOTED]->(t),
 (p2:person)-[downvote:DOWNVOTED]->(t),
 (p3:person)-[flag:FLAGGED]->(t)

Or better, since you're not referencing the people anywhere else, you can just leave the variables out:

(:person)-[upvote:UPVOTED]->(t),
(:person)-[downvote:DOWNVOTED]->(t),
(:person)-[flag:FLAGGED]->(t)

And stylistically, I would also suggest starting your matches with the item that you're filtering on:

(t)<-[upvote:UPVOTED]-(:person),
(t)<-[downvote:DOWNVOTED]-(:person),
(t)<-[flag:FLAGGED]-(:person)

The next problem is that by putting all of these in a MATCH, you're saying that every one of them NEEDS to match, which means you'll never get results for topics where any of the counts is zero. So you'll want OPTIONAL MATCH:

MATCH (t:topic)
WHERE ID(t)=4
OPTIONAL MATCH (t)<-[upvote:UPVOTED]-(:person)    
OPTIONAL MATCH (t)<-[downvote:DOWNVOTED]-(:person)    
OPTIONAL MATCH (t)<-[flag:FLAGGED]-(:person)    
RETURN COUNT(upvote), COUNT(downvote), COUNT(flag)

Even then, what you're saying is: "Find a topic and find all cases where there is 1 upvote, no downvote, no flag; 1 upvote, 1 downvote, no flag; and so on through all the permutations." Each combination of upvote, downvote, and flag becomes its own row, so the counts get multiplied together. That means you'll want to COUNT one at a time:

MATCH (t:topic)
WHERE ID(t)=4
OPTIONAL MATCH (t)<-[r:UPVOTED]-(:person)    
WITH t, COUNT(r) AS upvotes

OPTIONAL MATCH (t)<-[r:DOWNVOTED]-(:person)    
WITH t, upvotes, COUNT(r) AS downvotes

OPTIONAL MATCH (t)<-[r:FLAGGED]-(:person)    
RETURN upvotes, downvotes, COUNT(r) AS flags
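To see the permutation effect described above for yourself, one option is to return the raw rows from the three chained OPTIONAL MATCH clauses instead of aggregating them. This is only a diagnostic sketch; the column aliases are made up for illustration:

```cypher
// Diagnostic sketch: one output row per (upvote, downvote, flag) combination.
// A relationship that appears on several rows would be counted several times
// by a plain COUNT() over these rows.
MATCH (t:topic)
WHERE ID(t) = 4
OPTIONAL MATCH (t)<-[upvote:UPVOTED]-(:person)
OPTIONAL MATCH (t)<-[downvote:DOWNVOTED]-(:person)
OPTIONAL MATCH (t)<-[flag:FLAGGED]-(:person)
RETURN ID(upvote) AS upvote_id, ID(downvote) AS downvote_id, ID(flag) AS flag_id
```

With 2 upvotes, 1 downvote, and 1 flag, this should produce 2 rows, with the same downvote and flag repeated on both, which is why aggregating after each OPTIONAL MATCH matters.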

A couple of miscellaneous items:

Be careful about using Neo4j internal IDs as a long-term reference, because they can be recycled after a node is deleted.

Use parameters whenever possible for performance and security (WHERE ID(t)={topic_id}; newer Neo4j versions write the parameter as $topic_id).

Also, labels are generally TitleCase. See The Zen of Cypher guide.
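As a rough illustration of the parameter advice, here is the first step of the query with the hard-coded ID replaced by a parameter. The name topic_id is just a placeholder, and the $ syntax assumes a reasonably recent Neo4j version (older ones use the {topic_id} form):

```cypher
// Sketch only: $topic_id is a hypothetical parameter supplied by the client
// or driver, so the query text stays constant between calls.
MATCH (t:topic)
WHERE ID(t) = $topic_id
OPTIONAL MATCH (t)<-[r:UPVOTED]-(:person)
RETURN COUNT(r) AS upvotes
```

Because the query string no longer changes per topic, the server can reuse its cached plan, and the value is never spliced into the query text.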



Answer 3:

Check this query; I think it will help you.

MATCH (p:person)-[a:UPVOTED]->(t:topic),
(p)-[b:DOWNVOTED]->(t),(p)-[c:FLAGGED]->(t) 
WHERE ID(t)=4 
RETURN COUNT(a) as a_count,COUNT(b) as b_count,COUNT(c) as c_count;



Answer 4:

Your current MATCH requires that the same person node (identified by p) have relationships of all 3 types with t. This is because an identifier is bound to a specific node (or relationship, or value), and (unless hidden by a WITH clause, which you do not have in your query) will reference that same node (or relationship, or value) throughout a query.

Based on your expected results, I am assuming that you are just trying to count the number of relationships of those 3 types between any person and t. If so, this is a performant way to do that:

MATCH (t:topic)
WHERE ID(t) = 4
MATCH (:person)-[r:UPVOTED|DOWNVOTED|FLAGGED]->(t)
RETURN REDUCE(s=[0,0,0], x IN COLLECT(r) |
  CASE TYPE(x)
    WHEN 'UPVOTED' THEN [s[0]+1, s[1], s[2]]
    WHEN 'DOWNVOTED' THEN [s[0], s[1]+1, s[2]]
    ELSE [s[0], s[1], s[2]+1]
  END
) AS res;

res is a list with the number of UPVOTED, DOWNVOTED, and FLAGGED relationships, respectively, between any person and t.

Another approach would be to use separate OPTIONAL MATCH statements for each relationship type, returning three COUNT(DISTINCT x) values. But the above query uses a single MATCH statement, greatly reducing the number of DB hits, which are generally expensive.
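For comparison, the OPTIONAL MATCH alternative mentioned above might look roughly like this. It is a sketch, with u, d, and f as arbitrary variable names; DISTINCT is what keeps each relationship from being counted once per row combination:

```cypher
// Sketch of the alternative: three OPTIONAL MATCH clauses, then distinct counts.
// COUNT(DISTINCT x) ignores nulls and duplicate rows for the same relationship.
MATCH (t:topic)
WHERE ID(t) = 4
OPTIONAL MATCH (:person)-[u:UPVOTED]->(t)
OPTIONAL MATCH (:person)-[d:DOWNVOTED]->(t)
OPTIONAL MATCH (:person)-[f:FLAGGED]->(t)
RETURN COUNT(DISTINCT u) AS upvotes,
       COUNT(DISTINCT d) AS downvotes,
       COUNT(DISTINCT f) AS flags
```

This still builds every row combination before aggregating, which is the extra work the single-MATCH version avoids.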



Tags: neo4j cypher