In SQL, how can you “group by” in ranges?

2018-12-31 23:37发布

问题:

Suppose I have a table with a numeric column (lets call it \"score\").

I\'d like to generate a table of counts, that shows how many times scores appeared in each range.

For example:

score range  | number of occurrences
-------------------------------------
   0-9       |        11
  10-19      |        14
  20-29      |         3
   ...       |       ...

In this example there were 11 rows with scores in the range of 0 to 9, 14 rows with scores in the range of 10 to 19, and 3 rows with scores in the range 20-29.

Is there an easy way to set this up? What do you recommend?

回答1:

Neither of the highest voted answers are correct on SQLServer 2000. Perhaps they were using a different version.

Here are the correct versions of both of them on SQLServer 2000.

select t.range as [score range], count(*) as [number of occurences]
from (
  select case  
    when score between 0 and 9 then \' 0- 9\'
    when score between 10 and 19 then \'10-19\'
    else \'20-99\' end as range
  from scores) t
group by t.range

or

select t.range as [score range], count(*) as [number of occurences]
from (
      select user_id,
         case when score >= 0 and score< 10 then \'0-9\'
         when score >= 10 and score< 20 then \'10-19\'
         else \'20-99\' end as range
     from scores) t
group by t.range


回答2:

An alternative approach would involve storing the ranges in a table, instead of embedding them in the query. You would end up with a table, call it Ranges, that looks like this:

LowerLimit   UpperLimit   Range 
0              9          \'0-9\'
10            19          \'10-19\'
20            29          \'20-29\'
30            39          \'30-39\'

And a query that looks like this:

Select
   Range as [Score Range],
   Count(*) as [Number of Occurences]
from
   Ranges r inner join Scores s on s.Score between r.LowerLimit and r.UpperLimit
group by Range

This does mean setting up a table, but it would be easy to maintain when the desired ranges change. No code changes necessary!



回答3:

I see answers here that won\'t work in SQL Server\'s syntax. I would use:

select t.range as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then \' 0-9 \'
    when score between 10 and 19 then \'10-19\'
    when score between 20 and 29 then \'20-29\'
    ...
    else \'90-99\' end as range
  from scores) t
group by t.range

EDIT: see comments



回答4:

In postgres (where || is the string concatenation operator):

select (score/10)*10 || \'-\' || (score/10)*10+9 as scorerange, count(*)
from scores
group by score/10
order by 1

gives:

 scorerange | count 
------------+-------
 0-9        |    11
 10-19      |    14
 20-29      |     3
 30-39      |     2


回答5:

James Curran\'s answer was the most concise in my opinion, but the output wasn\'t correct. For SQL Server the simplest statement is as follows:

SELECT 
    [score range] = CAST((Score/10)*10 AS VARCHAR) + \' - \' + CAST((Score/10)*10+9 AS VARCHAR), 
    [number of occurrences] = COUNT(*)
FROM #Scores
GROUP BY Score/10
ORDER BY Score/10

This assumes a #Scores temporary table I used to test it, I just populated 100 rows with random number between 0 and 99.



回答6:

create table scores (
   user_id int,
   score int
)

select t.range as [score range], count(*) as [number of occurences]
from (
      select user_id,
         case when score >= 0 and score < 10 then \'0-9\'
         case when score >= 10 and score < 20 then \'10-19\'
         ...
         else \'90-99\' as range
     from scores) t
group by t.range


回答7:

select cast(score/10 as varchar) + \'-\' + cast(score/10+9 as varchar), 
       count(*)
from scores
group by score/10


回答8:

This will allow you to not have to specify ranges, and should be SQL server agnostic. Math FTW!

SELECT CONCAT(range,\'-\',range+9), COUNT(range)
FROM (
  SELECT 
    score - (score % 10) as range
  FROM scores
)


回答9:

I would do this a little differently so that it scales without having to define every case:

select t.range as [score range], count(*) as [number of occurences]
from (
  select FLOOR(score/10) as range
  from scores) t
group by t.range

Not tested, but you get the idea...



回答10:

declare @RangeWidth int

set @RangeWidth = 10

select
   Floor(Score/@RangeWidth) as LowerBound,
   Floor(Score/@RangeWidth)+@RangeWidth as UpperBound,
   Count(*)
From
   ScoreTable
group by
   Floor(Score/@RangeWidth)


回答11:

select t.blah as [score range], count(*) as [number of occurences]
from (
  select case 
    when score between  0 and  9 then \' 0-9 \'
    when score between 10 and 19 then \'10-19\'
    when score between 20 and 29 then \'20-29\'
    ...
    else \'90-99\' end as blah
  from scores) t
group by t.blah

Make sure you use a word other than \'range\' if you are in MySQL, or you will get an error for running the above example.



回答12:

Because the column being sorted on (Range) is a string, string/word sorting is used instead of numeric sorting.

As long as the strings have zeros to pad out the number lengths the sorting should still be semantically correct:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN \'00-09\'
                    WHEN score BETWEEN 10 AND 19 THEN \'10-19\'
                    ELSE \'20-99\'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range

If the range is mixed, simply pad an extra zero:

SELECT t.range AS ScoreRange,
       COUNT(*) AS NumberOfOccurrences
  FROM (SELECT CASE
                    WHEN score BETWEEN 0 AND 9 THEN \'000-009\'
                    WHEN score BETWEEN 10 AND 19 THEN \'010-019\'
                    WHEN score BETWEEN 20 AND 99 THEN \'020-099\'
                    ELSE \'100-999\'
               END AS Range
          FROM Scores) t
 GROUP BY t.Range


回答13:

Try

SELECT (str(range) + \"-\" + str(range + 9) ) AS [Score range], COUNT(score) AS [number of occurances]
FROM (SELECT  score,  int(score / 10 ) * 10  AS range  FROM scoredata )  
GROUP BY range;


回答14:

Perhaps you\'re asking about keeping such things going...

Of course you\'ll invoke a full table scan for the queries and if the table containing the scores that need to be tallied (aggregations) is large you might want a better performing solution, you can create a secondary table and use rules, such as on insert - you might look into it.

Not all RDBMS engines have rules, though!