I am trying to summarize counts over all possible combinations of variables. Here is some example data:
This sort of query is quite straightforward using some of the built-in aggregate tools.
First, set up some sample data based on your sample image:
Since you want the count of IDs for each possible combination of non-zero attributes A, B, and C, the first step is to eliminate the zeros and convert the non-zero values to a single value we can summarize on; in this case I'll use the attribute's name. After that it's a simple matter of performing the aggregate, using the CUBE clause in the GROUP BY statement to generate the combinations. Lastly, the HAVING clause prunes out the unwanted summations. Mostly that's just ignoring the null values in the attributes, and optionally removing the grand summary (count of all rows).
Here are the results:
And finally my original rextester link: http://rextester.com/YRJ10544
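For readers who want to check the logic outside the database, the three steps described above (zeros to nulls, CUBE-style grouping, pruning) can be emulated in a few lines of Python. This is only a sketch under assumptions: the sample rows are made up (the original sample image isn't reproduced here), the function name cube_counts is hypothetical, and the pruning rule is my reading of the description, not the actual query from the rextester link.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: one row per ID with numeric attributes A, B, C.
rows = [
    {"id": 1, "A": 1, "B": 0, "C": 2},
    {"id": 2, "A": 0, "B": 3, "C": 0},
    {"id": 3, "A": 4, "B": 5, "C": 6},
    {"id": 4, "A": 0, "B": 0, "C": 0},
]
attrs = ["A", "B", "C"]


def cube_counts(rows, attrs):
    """Count rows per combination of non-zero attributes, CUBE-style."""
    # Step 1: replace zeros with None and non-zero values with the attribute name.
    flagged = [{a: (a if row[a] != 0 else None) for a in attrs} for row in rows]

    counts = Counter()
    # Step 2: emulate GROUP BY CUBE by visiting every subset of the attributes.
    for size in range(len(attrs) + 1):
        for kept in combinations(attrs, size):
            for row in flagged:
                # Step 3 (the HAVING part): ignore groups where a grouped
                # attribute is null, i.e. it was zero in the data.
                if any(row[a] is None for a in kept):
                    continue
                counts[kept] += 1

    # Optionally drop the grand summary (the empty grouping counts all rows).
    counts.pop((), None)
    return dict(counts)


if __name__ == "__main__":
    for combo, n in sorted(cube_counts(rows, attrs).items()):
        print("+".join(combo), n)
```

With three attributes this yields at most 2^3 - 1 = 7 groups; a combination that no row satisfies simply doesn't appear, which is also how a GROUP BY behaves.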
@lad2025 Here's a dynamic version (sorry my SQL Server skills aren't as strong as my Oracle skills, but it works). Just set the correct values for @Table and @col and it should work as long as all other columns are numeric attributes:
Rextester
Poshan:
As Robert stated, SUMMARY can be used to count combinations. A second SUMMARY can count the computed types. One difficulty is ignoring the combinations that involve a zero value. If the zeros can be converted to missing values, the processing is much cleaner. Presuming the zeros have been converted to missing, this code would count distinct combinations:
You can see how the use of a CLASS variable in a combination determines the _TYPE_ value, and the class variables can be of mixed type (numeric, character).
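To make the link between CLASS variables and the TYPE value (_TYPE_ in the SUMMARY output) concrete: with the default numeric _TYPE_, each class variable corresponds to one bit, and the last variable in the CLASS statement is the lowest bit. The Python sketch below decodes a _TYPE_ value on that basis. The function name type_to_class_vars is hypothetical, and it assumes the default numeric _TYPE_ (not the CHARTYPE option).

```python
def type_to_class_vars(type_value, class_vars):
    """Return the class variables that a numeric _TYPE_ value stands for.

    class_vars is in CLASS-statement order; the LAST variable maps to bit 1,
    the one before it to bit 2, then 4, and so on.
    """
    n = len(class_vars)
    return [
        var
        for position, var in enumerate(class_vars)
        if type_value & (1 << (n - 1 - position))
    ]


if __name__ == "__main__":
    class_vars = ["A", "B", "C"]
    # _TYPE_ runs from 0 (overall total) to 2**3 - 1 = 7 (all three variables).
    for t in range(2 ** len(class_vars)):
        print(t, type_to_class_vars(t, class_vars))
```

So with CLASS A B C, _TYPE_ = 5 is the A*C combination and _TYPE_ = 0 is the overall total, which is the row you would usually drop here.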
A different 'home-grown' approach that does not use SUMMARY can use a DATA step with LEXCOMB to compute each combination, and SQL with INTO ... SEPARATED BY to generate a SQL statement that will count each combination distinctly.
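The same generate-then-run idea can be sketched in Python, with itertools.combinations standing in for LEXCOMB and a string join standing in for INTO ... SEPARATED BY. This is not the SAS code; the table name have, the helper names, and the sample rows are all assumptions made for illustration, and SQLite is used only so the generated statement can actually be run.

```python
import sqlite3
from itertools import combinations


def build_count_sql(table, id_col, attrs):
    """Build one SELECT per non-empty combination, glued with UNION ALL.

    Identifiers are interpolated directly, so only pass trusted names.
    """
    selects = []
    for size in range(1, len(attrs) + 1):
        # combinations() yields the subsets in lexicographic order.
        for combo in combinations(attrs, size):
            label = "*".join(combo)
            # "<> 0" is false for zeros and unknown for NULLs, so both are
            # excluded whether or not zeros were converted to missing.
            condition = " AND ".join(f"{a} <> 0" for a in combo)
            selects.append(
                f"SELECT '{label}' AS combo, COUNT(DISTINCT {id_col}) AS n "
                f"FROM {table} WHERE {condition}"
            )
    return "\nUNION ALL\n".join(selects)


def run_demo():
    """Run the generated statement against hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE have (id INTEGER, A INTEGER, B INTEGER, C INTEGER)")
    conn.executemany(
        "INSERT INTO have VALUES (?, ?, ?, ?)",
        [(1, 1, 0, 2), (2, 0, 3, 0), (3, 4, 5, 6), (4, 0, 0, 0)],
    )
    result = dict(conn.execute(build_count_sql("have", "id", ["A", "B", "C"])).fetchall())
    conn.close()
    return result


if __name__ == "__main__":
    print(build_count_sql("have", "id", ["A", "B", "C"]))
    print(run_demo())
```

The generated statement grows as 2^n - 1 SELECTs, so this is only practical for a handful of variables.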
Note: The code for this approach uses the macro varListEval to resolve a SAS variable list into individual variable names.
Naive approach
SQL Server version (I've assumed that we always have 3 columns, so there will be 2^3-1 = 7 rows):
Rextester Demo
EDIT:
Same as above but more concise:
Rextester Demo
EDIT 2
Using UNPIVOT:
Rextester Demo
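As a rough illustration of why unpivoting helps, here is a Python sketch of the idea: turn the wide rows into long (id, attribute, value) rows, keep the non-zero ones, and then count every non-empty subset of the attributes each ID has. The helper names and sample rows are hypothetical, and this is not a transcription of the query in the demo link.

```python
from collections import Counter, defaultdict
from itertools import combinations


def unpivot(rows, id_col, attrs):
    """Wide to long: one (id, attribute, value) tuple per attribute column."""
    return [(row[id_col], a, row[a]) for row in rows for a in attrs]


def count_from_long(long_rows):
    """Count IDs per combination of attributes that are non-zero for that ID."""
    # Collect, per ID, the set of attributes with a non-zero value.
    present = defaultdict(set)
    for row_id, attr, value in long_rows:
        if value != 0:
            present[row_id].add(attr)

    counts = Counter()
    for attrs_present in present.values():
        ordered = sorted(attrs_present)
        # Each ID contributes once to every non-empty subset of its attributes.
        for size in range(1, len(ordered) + 1):
            for combo in combinations(ordered, size):
                counts[combo] += 1
    return dict(counts)


# Hypothetical sample data (zeros mean "attribute not present").
sample = [
    {"id": 1, "A": 1, "B": 0, "C": 2},
    {"id": 2, "A": 0, "B": 3, "C": 0},
    {"id": 3, "A": 4, "B": 5, "C": 6},
    {"id": 4, "A": 0, "B": 0, "C": 0},
]

if __name__ == "__main__":
    long_rows = unpivot(sample, "id", ["A", "B", "C"])
    for combo, n in sorted(count_from_long(long_rows).items()):
        print("+".join(combo), n)
```

The nice property of the long form is that nothing in count_from_long depends on how many attribute columns there are.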
EDIT FINAL APPROACH
SQL is a bit clumsy for this kind of operation, but I want to show that it is possible.
DBFiddle Demo
DBFiddle Demo with 4 variables