What's the simplest (and hopefully not too slow) way to calculate the median with MySQL? I've used AVG(x)
for finding the mean, but I'm having a hard time finding a simple way of calculating the median. For now, I'm returning all the rows to PHP, doing a sort, and then picking the middle row, but surely there must be some simple way of doing it in a single MySQL query.
Example data:
id | val
--------
1 4
2 7
3 2
4 2
5 9
6 8
7 3
Sorting on val
gives 2 2 3 4 7 8 9
, so the median should be 4
, versus SELECT AVG(val)
which == 5
.
Based on @bob's answer, this generalizes the query to have the ability to return multiple medians, grouped by some criteria.
Think, e.g., median sale price for used cars in a car lot, grouped by year-month.
Install and use this mysql statistical functions: http://www.xarg.org/2012/07/statistical-functions-in-mysql/
After that, calculate median is easy:
SELECT median( x ) FROM t1
If MySQL has ROW_NUMBER, then the MEDIAN is (be inspired by this SQL Server query):
The IN is used in case you have an even number of entries.
If you want to find the median per group, then just PARTITION BY group in your OVER clauses.
Rob
Unfortunately, neither TheJacobTaylor's nor velcro's answers return accurate results for current versions of MySQL.
Velcro's answer from above is close, but it does not calculate correctly for result sets with an even number of rows. Medians are defined as either 1) the middle number on odd numbered sets, or 2) the average of the two middle numbers on even number sets.
So, here's velcro's solution patched to handle both odd and even number sets:
To use this, follow these 3 easy steps:
I found the accepted solution didn't work on my MySQL install, returning an empty set, but this query worked for me in all situations that I tested it on:
I just found another answer online in the comments:
Make sure your columns are well indexed and the index is used for filtering and sorting. Verify with the explain plans.
Calculate the "median" row number. Maybe use:
median_row = floor(count / 2)
.Then pick it out of the list:
This should return you one row with just the value you want.
Jacob