Loop in cypher, UNWIND or FOREACH (Neo4j)

Posted 2019-08-24 04:42

Question:

I have this relationship in my Neo4j database:

(r:RateableEntity)<-[t:TAG]-(h:HashTag)

Now I want to have a query that returns a list that includes:

  1. A list of hashtagName values, each with its frequency in the database as hashtagCount and a list of the items related to that hashtag. hashtagName and hashtagItems have an id label.

Note: the number of hashtags and the number of hashtagItems to return both arrive as input parameters.

This is the result I expect from my Cypher query:

"hashtagList": [ 
{
  "hashtagName": "hashtagName1",
  "hashtagCount": number of times hashtag has been used in database,
  "hashtagItems": [ list of relevant items for hashtagName1 ]
},
{
  "hashtagName": "hashtagName2",
  "hashtagCount": number of times hashtag has been used in database,
  "hashtagItems": [ list of relevant items for hashtagName2 ]
},
...
]

I've written this Cypher query:

MATCH p = (r:RateableEntity)<-[t:TAG]-(h:HashTag)
UNWIND TAIL (NODES(p)) AS hash
WITH COUNT(hash) as Count, h, hash
ORDER BY hash LIMIT 3
WHERE h.tag in hash.tag
MATCH (r:RateableEntity)<-[:TAG]-(h:HashTag)
 RETURN DISTINCT h.tag, r.id, Count
 LIMIT 3

but it returns this result:

h.tag       r.id                                  Count
"vanessa"   "cdd14968-404c-41e9-84d5-bf147030a023"  15
"vanessa"   "b7e74f38-44e4-4b7f-b2c4-8301023ffa9b"  15
"vanessa"   "2064d3e4-2995-4202-b178-bb2a6f230ab0"  15

Thanks in advance for your help.

Answer 1:

Some things to keep in mind:

  1. Cypher operators execute for each row.

  2. Try not to think of UNWIND as a looping structure. All it does is take the cartesian product of the variables on a row with the elements of a list.

So when you UNWIND a list, you get one row per list element, each carrying all the variables that were already present on the row. A subsequent operation (like a MATCH or a WITH) then executes once per row, so it looks like a looping structure, but it really isn't.
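
A minimal sketch of that row expansion, using literal values so it can run without any data (the tag value and the list are made up for illustration):

```cypher
// Start with a single row holding two variables: tag and ids.
WITH 'vanessa' AS tag, [1, 2, 3] AS ids
// UNWIND replaces that row with one row per list element;
// tag is carried along unchanged on every new row.
UNWIND ids AS id
RETURN tag, id
```

This returns three rows, each with tag 'vanessa' and id 1, 2 and 3 respectively. Nothing was iterated over explicitly; the single input row simply became three.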

In any case, UNWIND isn't needed here. For a two-node matched pattern, tail(nodes(p)) will just be a single-element list containing just the last node. It hasn't changed the number of rows (since the list size is 1), and won't help you here.

This query should work better:

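MATCH (h:HashTag)
WITH h LIMIT 3 // best to limit early to avoid doing unnecessary work
// per hashtag: count its TAG relationships and collect the ids of the tagged items
WITH h, h.tag AS hashtagName, size((h)-[:TAG]->()) AS hashtagCount, [(h)-[:TAG]->(r:RateableEntity) | r.id] AS hashtagItems
// map projection: build one map per hashtag from the three variables
WITH h {hashtagName, hashtagCount, hashtagItems} AS entry
RETURN collect(entry) AS hashtagList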

EDIT

If you want the top 3 hashtags by usage count, you can use the modified query below:

MATCH (h:HashTag)
WITH h, size((h)-[:TAG]->()) as hashtagCount
ORDER BY hashtagCount DESC
LIMIT 3
WITH h, hashtagCount, h.tag as hashtagName, [(h)-[:TAG]->(r:RateableEntity) | r.id] as hashtagItems
WITH h {hashtagName, hashtagCount, hashtagItems} as entry
RETURN collect(entry) as hashtagList
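
Since the question says both limits arrive as input parameters, here is a sketch of the same idea with the literals replaced by parameters. The parameter names $hashtagLimit and $itemLimit are hypothetical; use whatever names your application passes in. Note that size() over a pattern works on the Neo4j versions current when this was written, while Neo4j 5 and later need COUNT { (h)-[:TAG]->() } instead.

```cypher
MATCH (h:HashTag)
WITH h, size((h)-[:TAG]->()) AS hashtagCount
ORDER BY hashtagCount DESC
// $hashtagLimit (assumed name): how many hashtags to return
LIMIT $hashtagLimit
// $itemLimit (assumed name): list slicing keeps at most that many item ids
WITH h.tag AS hashtagName,
     hashtagCount,
     [(h)-[:TAG]->(r:RateableEntity) | r.id][..$itemLimit] AS hashtagItems
RETURN collect({
  hashtagName: hashtagName,
  hashtagCount: hashtagCount,
  hashtagItems: hashtagItems
}) AS hashtagList
```

Both parameters must be integers. In Neo4j Browser they can be set with :param hashtagLimit => 3 and :param itemLimit => 3 before running the query.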


Answer 2:

I found this solution to my question; maybe someone else will need it:

MATCH (:RateableEntity)<-[:TAG]-(p:HashTag)
RETURN p.tag AS Tag, COUNT(p) AS Count, [(p)-[:TAG]->(m) | m.id][..3] AS RateableEntities
ORDER BY Count DESC
LIMIT 3

Here is the link to the documentation on the website:

limiting-match-results-per-row/



Tags: neo4j cypher