I have 2 nodes created in graph database, origin airport and destination airport. It is related by a property named 'delayed_by'.
MATCH (origin:origin_airport {name: row.ORIGIN}),
(destination:dest_airport {name: row.DEST})
CREATE (origin)-[:delayed_by {carr_delay: row.carr_delay}]->(destination)
CREATE (origin)-[:delayed_by {weather_delay: row.weather_delay}]->
(destination)
CREATE (origin)-[:delayed_by {nas_delay: row.nas_delay}]->(destination)
delayed_by holds the value delays caused due to carrier delay, weather delay and nas delay. Here I need to group_by origin and destination and calculate the average value of sum of all 3 delays using cypher query in Neo4j. Representing in tabular format, my input file will of the format as described below:
ORIGIN DEST carr_delay weather_delay nas_delay
ABE ATL 492 56 56
ABE DTW 412 0 47
ABQ ATL 181 0 218
I am expecting the result in the below format.
ORIGIN DEST Avg_delay
ABE ATL 201.33
ABE DTW 153
ABQ ATL 133
I am using the below query:
MATCH (oa:origin_airport)-[d:delayed_by]->(da:dest_airport)
RETURN oa.name AS Origin, da.name AS Destination,
AVG((toFloat(d.carr_delay))+(toFloat(d.weather_delay))+
(toFloat(d.nas_delay))) As avg_delay
ORDER BY avg_delay DESC
LIMIT 10
But getting null values for Avg_delay.
ORIGIN DEST Avg_delay
ABE ATL NULL
ABE DTW NULL
ABQ ATL NULL
1) I recommend to change the model: for the delay relationship add the property type of delay, and the delay value save in the property value.
2) And the desired query:
The problem here is that not for every relationships you have all properties(
carr_delay
,weather_delay
,nas_delay
) set.A non existing property evaluates to
NULL
- which is different from0
.AVG(1+NULL)
evaluate toNULL
. So you need to mapNULL
to0
. To fix that consider usingcoalesce
: