What's the simplest (and hopefully not too slow) way to calculate the median with MySQL? I've used AVG(x)
for finding the mean, but I'm having a hard time finding a simple way of calculating the median. For now, I'm returning all the rows to PHP, doing a sort, and then picking the middle row, but surely there must be some simple way of doing it in a single MySQL query.
Example data:
id | val
--------
1 4
2 7
3 2
4 2
5 9
6 8
7 3
Sorting on val
gives 2 2 3 4 7 8 9
, so the median should be 4
, versus SELECT AVG(val)
which == 5
.
The above seems to work for me.
Often, we may need to calculate Median not just for the whole table, but for aggregates with respect to our ID. In other words, calculate median for each ID in our table, where each ID has many records. (good performance and works in many SQL + fixes problem of even and odds, more about performance of different Median-methods https://sqlperformance.com/2012/08/t-sql-queries/median )
Hope it helps
My code, efficient without tables or additional variables:
In MariaDB / MySQL:
Steve Cohen points out, that after the first pass, @rownum will contain the total number of rows. This can be used to determine the median, so no second pass or join is needed.
Also
AVG(dd.val)
anddd.row_number IN(...)
is used to correctly produce a median when there are an even number of records. Reasoning:Finally, MariaDB 10.3.3+ contains a MEDIAN function
I propose a faster way.
Get the row count:
SELECT CEIL(COUNT(*)/2) FROM data;
Then take the middle value in a sorted subquery:
SELECT max(val) FROM (SELECT val FROM data ORDER BY val limit @middlevalue) x;
I tested this with a 5x10e6 dataset of random numbers and it will find the median in under 10 seconds.
This way seems include both even and odd count without subquery.