SQL Return All Combinations of Two Columns and The

2019-08-30 04:08发布

问题:

I'm trying to do something in SQL that maybe I can't do. I have a single table with this type of data (synthetic data shown):

columnn1 | column2 | ...
------------------------
    A    |    1    |
    B    |    1    |
    C    |    2    |
    A    |    2    |
    D    |    3    |
    A    |    1    |
    B    |    1    |

What I'd like to get back is something that both:

  1. Computes all combinations (e.g., cross join), and
  2. Also includes the number of times the combinations occur.

I know how to do the combinations with a simple CROSS JOIN. I can also get the number of combinations with a simple GROUP BY clause. However, is there a way to efficiently combine these into a single query? What I want to try to avoid is generating and saving an intermediate table. The output I'd like looks like this:

columnn1 | column2 | count
---------------------------
    A    |    1    |   2
    A    |    2    |   1
    A    |    3    |   0 (or null)
    B    |    1    |   2
    B    |    2    |   0 (or null)
    B    |    3    |   0 (or null)
    C    |    1    |   0 (or null)
    C    |    2    |   1
    C    |    3    |   0 (or null)
    D    |    1    |   0 (or null)
    D    |    2    |   0 (or null)
    D    |    3    |   1

I don't really care if the counts are zero (preferred) or nullls. I also don't care about the sorting order of the columns. I've done some searching and can't seem to find a way to do this without generating and saving an intermediate table. I'm probably just overlooking something silly though. Thanks in advance!

回答1:

Assuming none of the values in the table are NULL, you can take the following strategy. Use cross join to get all the combinations of the two columns -- even combinations that are not in the data.

Then use left join and group by to get the counts you are looking for:

select c1.col1, c2.col2, count(t.col1)
from (select distinct col1 from table) c1 cross join
     (select distinct col2 from table) c2 left join
     table t
     on t.col1 = c1.col1 and t.col2 = c2.col2
group by c1.col1, c2.col2;

Handling NULL values is pretty easy, but it requires a bit more logic.