Cypher: Hierarchical Sorting

Published 2019-06-06 10:11

Question:

I am new to Cypher. I was able to create a network of geographies (from the World to continents to countries to regions) together with their populations.

You can reproduce it with this command or check the link to the console: http://console.neo4j.org/r/dkh90c

CREATE (n1:Geo {name:'World'}),
(n2:Geo {name:'EMEA'})-[:BELONG_TO]->(n1),
(n4:Geo {name:'NORAM'})-[:BELONG_TO]->(n1),
(n5:Geo {name:'Middle East'})-[:BELONG_TO]->(n2),
(n6:Geo {name:'Africa'})-[:BELONG_TO]->(n2),
(n7:Geo {name:'Europe'})-[:BELONG_TO]->(n2),
(n8:Geo {name:'France'})-[:BELONG_TO]->(n7),
(n9:Geo {name:'Germany'})-[:BELONG_TO]->(n7),
(n10:Geo {name:'Italy'})-[:BELONG_TO]->(n7),
(n11:Geo {name:'United Kingdom'})-[:BELONG_TO]->(n7),
(n12:Geo {name:'England'})-[:BELONG_TO]->(n11),
(n13:Geo {name:'Scotland'})-[:BELONG_TO]->(n11),
(n14:Geo {name:'Wales'})-[:BELONG_TO]->(n11),
(n15:Geo {name:'Northern Ireland'})-[:BELONG_TO]->(n11),
(n16:Geo {name:'United Arab Emirates'})-[:BELONG_TO]->(n5),
(n17:Geo {name:'South Africa'})-[:BELONG_TO]->(n6),
(n18:Geo {name:'Canada'})-[:BELONG_TO]->(n4),
(n19:Geo {name:'United States of America'})-[:BELONG_TO]->(n4),
(n20:Geo {name:'Mexico'})-[:BELONG_TO]->(n4),

(:Population {year:'2014',amount:66.1})-[:LIVE_IN]->(n8),
(:Population {year:'2014',amount:81.2})-[:LIVE_IN]->(n9),
(:Population {year:'2013',amount:59.83})-[:LIVE_IN]->(n10),
(:Population {year:'2011',amount:53.01})-[:LIVE_IN]->(n12),
(:Population {year:'2011',amount:5.295})-[:LIVE_IN]->(n13),
(:Population {year:'2011',amount:3.063})-[:LIVE_IN]->(n14),
(:Population {year:'2011',amount:1.811})-[:LIVE_IN]->(n15),
(:Population {year:'2013',amount:9.346})-[:LIVE_IN]->(n16),
(:Population {year:'2013',amount:52.98})-[:LIVE_IN]->(n17),
(:Population {year:'2013',amount:35.16})-[:LIVE_IN]->(n18),
(:Population {year:'2014',amount:318.9})-[:LIVE_IN]->(n19),
(:Population {year:'2013',amount:122.3})-[:LIVE_IN]->(n20)

I am also able to calculate the total population for each geography with this command:

MATCH (n:Population)-[r:LIVE_IN]->(g1:Geo)-[:BELONG_TO*0..]->(g2:Geo)
RETURN g2.name AS Geography, SUM(toFloat(n.amount)) AS Population
ORDER BY Population DESC

However, I am not happy with the sorting of the results:

As you can see, the United States of America is inserted between EMEA and Europe. Is there a way to sort by the 'BELONG_TO' hierarchy first, before displaying the results? Also, although all my data has at most 3 decimal places, I don't understand why SUM() returns such a large number of decimals.

Thanks for your help.

Answer 1:

Sure, you should be able to sort by the length of the path:

MATCH path=(n:Population)-[r:LIVE_IN]->(g1:Geo)-[:BELONG_TO*0..]->(g2:Geo)
RETURN g2.name AS Geography, SUM(toFloat(n.amount)) AS Population
ORDER BY length(path) ASC, Population DESC

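I think you're getting lots of decimals because of floating-point math: values such as 5.295 have no exact binary representation, so small errors show up in the sum. You should be able to round the result to three decimals: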

ROUND(SUM(toFloat(n.amount)) * 1000.0) / 1000.0
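
For context, here is a minimal sketch of that rounding expression placed inside the totals query from the question. It assumes the same Geo/Population data as above; the only change is that the aggregate is wrapped in the rounding arithmetic.

```cypher
// Total population per geography, rounded to 3 decimal places.
// Multiplying by 1000, rounding to the nearest integer, then dividing
// by 1000 trims the floating-point noise left over from the summation.
MATCH (n:Population)-[:LIVE_IN]->(:Geo)-[:BELONG_TO*0..]->(g2:Geo)
RETURN g2.name AS Geography,
       round(sum(toFloat(n.amount)) * 1000.0) / 1000.0 AS Population
ORDER BY Population DESC
```

The result is still a float, so this only limits the number of decimals shown; it does not make the underlying arithmetic exact.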

EDIT:

You're right, you need to add the path length to order by it:

MATCH path=(n:Population)-[r:LIVE_IN]->(g1:Geo)-[:BELONG_TO*1..]->(g2:Geo)
RETURN g1.name AS Geography, SUM(toFloat(n.amount)) AS Population, length(path) AS path_length
ORDER BY length(path) ASC, Population DESC 

And yes, you're definitely right that your path needs to end with g2 being the World. You can do this by matching (g2:Geo {name: 'World'}), similar to what I suggested in the comments. Or, if you end up with more root nodes in the future (perhaps we'll colonize the Moon or Mars!), you could do this:

MATCH path=(n:Population)-[r:LIVE_IN]->(g1:Geo)-[:BELONG_TO*]->(g2:Geo)
WHERE NOT((g2:Geo)-[:BELONG_TO]->())
RETURN g1.name AS Geography, SUM(toFloat(n.amount)) AS Population, length(path) AS path_length
ORDER BY length(path) ASC, Population DESC 

That means we only keep paths where g2 doesn't belong to anything, i.e. g2 is a root node. Otherwise g2 could be any ancestor, for example Europe.
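
If the goal is a listing where each geography is followed directly by its own children, one possible approach is to sort on the full lineage from the root rather than on the path length alone. The sketch below is an illustration under assumptions, not part of the original answer: the aliases `depth` and `lineage` are made up here, and it assumes every Geo has at most one parent and that names do not contain '/'.

```cypher
// Step 1: total population per geography (leaf or ancestor), rounded.
MATCH (n:Population)-[:LIVE_IN]->(:Geo)-[:BELONG_TO*0..]->(g:Geo)
WITH g, round(sum(toFloat(n.amount)) * 1000.0) / 1000.0 AS Population
// Step 2: walk from each geography up to a root (a node with no parent).
MATCH p = (g)-[:BELONG_TO*0..]->(root:Geo)
WHERE NOT (root)-[:BELONG_TO]->()
// Step 3: build a root-first key such as '/World/EMEA/Europe' by
// prepending each name while iterating from g towards the root.
WITH g, Population, length(p) AS depth,
     reduce(s = '', x IN nodes(p) | '/' + x.name + s) AS lineage
RETURN g.name AS Geography, Population, depth, lineage
ORDER BY lineage
```

Because a parent's key is a prefix of each child's key, the parent should sort immediately before its descendants. Siblings come out in alphabetical order rather than by population, so this trades the population ranking for a tree-shaped listing.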



Tags: neo4j cypher