What's the simplest (and hopefully not too slow) way to calculate the median with MySQL? I've used AVG(x) for finding the mean, but I'm having a hard time finding a simple way of calculating the median. For now, I'm returning all the rows to PHP, doing a sort, and then picking the middle row, but surely there must be some simple way of doing it in a single MySQL query.
Example data:
id | val
--------
1  | 4
2  | 7
3  | 2
4  | 2
5  | 9
6  | 8
7  | 3
Sorting on val gives 2 2 3 4 7 8 9, so the median should be 4, versus SELECT AVG(val), which returns 5.
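For anyone who wants to try the answers against this data, the table could be set up roughly like this. The table name `data` is an assumption; the question does not name the table.

```sql
-- Hypothetical table name `data`; the columns match the example above.
CREATE TABLE data (
  id  INT PRIMARY KEY,
  val INT NOT NULL
);

INSERT INTO data (id, val) VALUES
  (1, 4), (2, 7), (3, 2), (4, 2), (5, 9), (6, 8), (7, 3);

-- The mean, for comparison with the expected median of 4.
SELECT AVG(val) AS mean FROM data;
```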
Building off of velcro's answer, for those of you having to do a median off of something that is grouped by another parameter:
My solution works in a single query, without creating a table, a variable, or even a subquery. It also lets you get the median for each group in GROUP BY queries (this is what I needed!):
It works because of a smart use of GROUP_CONCAT and SUBSTRING_INDEX.
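A minimal sketch of the GROUP_CONCAT / SUBSTRING_INDEX idea, assuming the question's rows live in a table named `data` with a column `val` (both names are assumptions, and this is not necessarily the exact query the answer used). The sorted values are joined into one comma-separated string, the inner SUBSTRING_INDEX keeps the first half of the list, and the outer one picks the last item of that half.

```sql
-- Hypothetical table `data` with column `val`.
-- GROUP_CONCAT builds '2,2,3,4,7,8,9' for the example data.
-- Inner SUBSTRING_INDEX keeps the first CEIL(n/2) items: '2,2,3,4'.
-- Outer SUBSTRING_INDEX takes the last item of that prefix: '4'.
SELECT
  SUBSTRING_INDEX(
    SUBSTRING_INDEX(
      GROUP_CONCAT(val ORDER BY val SEPARATOR ','),
      ',',
      CEIL(COUNT(val) / 2)
    ),
    ',',
    -1
  ) AS median
FROM data;
```

Note that the result comes back as a string, and for an even number of rows this sketch returns the lower of the two middle values rather than their average. Adding a GROUP BY clause gives one median per group.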
But to allow large GROUP_CONCAT results, you have to set group_concat_max_len to a higher value (it is 1024 bytes by default). You can set it like this (for the current SQL session):
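Something along these lines should do it; the value 1000000 is just an arbitrary example, not a recommendation from the original answer.

```sql
-- Raise the limit for the current session only (example value).
SET SESSION group_concat_max_len = 1000000;

-- Check the value now in effect.
SELECT @@SESSION.group_concat_max_len;
```

If the limit is too low, the concatenated list is truncated (MySQL only raises a warning), and a median picked from a truncated list can be wrong.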
More info on group_concat_max_len: https://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_group_concat_max_len
This takes care of an even value count: in that case it gives the average of the two values in the middle.
Here is my way. Of course, you could put it into a procedure :-)
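One way to get that behaviour (averaging the two middle values when the count is even) is with window functions. This is a different technique from the variable-based queries in this thread, and it assumes MySQL 8.0 or later plus a hypothetical table `data` with a column `val`.

```sql
-- Assumes MySQL 8.0+ (window functions) and a hypothetical table `data`.
-- rn numbers the rows in sorted order; n is the total row count.
-- For odd n both positions are the same row; for even n they are
-- the two middle rows, and AVG returns their mean.
SELECT AVG(val) AS median
FROM (
  SELECT
    val,
    ROW_NUMBER() OVER (ORDER BY val) AS rn,
    COUNT(*)     OVER ()             AS n
  FROM data
) AS ranked
WHERE rn IN (FLOOR((n + 1) / 2), FLOOR((n + 2) / 2));
```

With the seven example values, both positions evaluate to 4, so the query averages the single middle row.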
You could avoid the variable @median_counter if you substitute it:
Taken from: http://mdb-blog.blogspot.com/2015/06/mysql-find-median-nth-element-without.html
I would suggest another way, without a join, but working with strings.
I have not checked it on tables with large data, but on small/medium tables it works just fine.
The good thing here is that it also works with GROUP BY, so it can return the median for several items.
Here is test code for a test table:
And the code for finding the median for each group:
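A sketch of what such a test table and per-group query could look like. The table name `test_median`, the column names and the sample rows are all made up for illustration; they are not taken from the linked post.

```sql
-- Hypothetical test table: two groups with a few values each.
CREATE TABLE test_median (
  grp VARCHAR(10) NOT NULL,
  val INT NOT NULL
);

INSERT INTO test_median (grp, val) VALUES
  ('book', 4), ('book', 7), ('book', 2), ('book', 2), ('book', 9),
  ('note', 11), ('note', 5), ('note', 8);

-- One median per group: build the sorted list for each group,
-- keep its first CEIL(n/2) items, then take the last of those.
SELECT
  grp,
  SUBSTRING_INDEX(
    SUBSTRING_INDEX(GROUP_CONCAT(val ORDER BY val), ',', CEIL(COUNT(val) / 2)),
    ',',
    -1
  ) AS the_median,
  GROUP_CONCAT(val ORDER BY val) AS all_vals_for_debug
FROM test_median
GROUP BY grp;
```

With these sample rows, the sorted lists are 2,2,4,7,9 for book and 5,8,11 for note, so the medians should come out as 4 and 8. The debug column is only there to check the result by eye.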
Output:
A comment on this page in the MySQL documentation has the following suggestion: