I'm using SQL Server 2008 R2. I need to compute a percentile value per group, something like:
SELECT id,
PCTL(0.9, x) -- for the 90th percentile
FROM my_table
GROUP BY id
ORDER BY id
For example, given this DDL (fiddle) ---
CREATE TABLE my_table (id INT, x REAL);
INSERT INTO my_table
VALUES (7, 0.164595), (5, 0.671311), (7, 0.0118385), (6, 0.704592), (3, 0.633521), (3, 0.337268), (0, 0.54739), (6, 0.312282), (0, 0.220618), (7, 0.214973), (6, 0.410768), (7, 0.151572), (7, 0.0639506), (5, 0.339075), (1, 0.284094), (2, 0.126722), (2, 0.870079), (3, 0.369366), (1, 0.6687), (5, 0.199456), (5, 0.0296715), (1, 0.330339), (9, 0.0000459612), (5, 0.391947), (3, 0.753965), (8, 0.334207), (7, 0.583357), (3, 0.326951), (4, 0.207057), (2, 0.258463), (2, 0.0532811), (1, 0.751584), (7, 0.592624), (7, 0.673506), (5, 0.44764), (6, 0.733737), (5, 0.141215), (7, 0.222452), (3, 0.597019), (1, 0.293901), (4, 0.516213), (7, 0.498336), (6, 0.410461), (2, 0.32211), (1, 0.466735), (5, 0.720456), (8, 0.000428383), (3, 0.46085), (0, 0.402963), (7, 0.677002), (0, 0.400122), (1, 0.762357), (9, 0.158455), (7, 0.359723), (4, 0.225914), (7, 0.795345), (6, 0.902261), (2, 0.69533), (8, 0.593605), (6, 0.266233), (0, 0.917188), (9, 0.96353), (2, 0.577035), (8, 0.945236), (3, 0.257776), (4, 0.560569), (0, 0.838326), (2, 0.660338), (2, 0.537372), (8, 0.33806), (0, 0.545107), (1, 0.616673), (5, 0.30411), (0, 0.434737), (2, 0.588249), (9, 0.991362), (8, 0.772253), (6, 0.705396), (5, 0.323255), (8, 0.830319), (3, 0.679546), (4, 0.399748), (4, 0.440115), (6, 0.938154), (8, 0.333143), (9, 0.923541), (7, 0.19552), (4, 0.869822), (7, 0.620006), (4, 0.833529), (4, 0.297515), (4, 0.19906), (5, 0.540905), (9, 0.33313), (5, 0.200515), (5, 0.900481), (6, 0.02665), (3, 0.495421), (0, 0.96582), (9, 0.847218);
--- I want approximately (within variation of common percentile methods) the following:
id x
----------
0 0.9658
1 0.7624
2 0.6953
3 0.6795
4 0.8335
5 0.7205
6 0.9023
7 0.677
8 0.9452
9 0.9914
The actual input set has about two million rows, and each actual id
group has a few dozen to a few hundred (or possibly more) rows.
I've explored SO and other sites for solutions, but it seems like the couple dozen or so pages I checked have solutions that are only applicable to computing a percentile over an entire row set rather than each group/partition of a row set. (I'm relatively inexperienced with SQL, so I might have overlooked something.)
I've also looked at the docs for the ranking functions, but I haven't been able to glue together a query that would work.
I'd like to use PERCENTILE_DISC or PERCENTILE_CONT, but I'm stuck with 2008 R2 for now.