I have a large, messy report to write that joins across 5 tables. There is one column in one table that is being used for several different values--essentially a "tag" column, where the tags are used in creative ways depending on what kind of miscellaneous metadata the users want to use.
As a result, my query for the report returns 3 nearly-identical rows that differ only in the "tag" column; for instance, I might get:
NAME TAG EMAIL
BOB A BOB@EXAMPLE.COM
BOB B BOB@EXAMPLE.COM
BOB C BOB@EXAMPLE.COM
What I'd like to do is split the contents of the TAG column to be returned as 3 separate columns from the query, like this:
NAME A B C EMAIL
BOB A B C BOB@EXAMPLE.COM
So I tried using the SQL SERVER CASE/WHEN functionality to do this; I'm saying, for instance, when the value of the Tag column is "A", return it in column "A"; if it's "B," put it in "B"; etc. I thought this would return the above, but instead it gives me this:
NAME A B C EMAIL
BOB A NULL NULL BOB@EXAMPLE.COM
BOB NULL B NULL BOB@EXAMPLE.COM
BOB NULL NULL C BOB@EXAMPLE.COM
Which is clearly less than ideal.
Any thoughts, geniuses of Stack Overflow?
try this:
EDIT: to explain...
Anytime you use a
Group By
, you are telling the query processor that you want it to aggregate the results into "buckets", where each bucket is defined by unique values in the columns [or expressions] defined in the Group By clause. That means that the final result set will have one and only one row for each unique set of values in those columns [or expressions] defined in the Group By. All other columns or expressions used in the query, (other than those defined in the Group By), must be an expression based on an aggregation function (like Count(), Sum(), Avg(), Min(), Max(), etc...), that produce a value based on a calculation which will be applied to all the rows in the pre-aggregated result set. If, for example, I were toGroup By
the first character of the last name:then I will get at most 26 rows in the output (one for each letter in the alphabet) and all other columns in my output must be based on some aggregation function to be applied to all the rows in the original set where the last name starts with an 'A", then all the rows where the last name starts with qa 'B' etc..
In your problem, the Group By is used simply to restrict the output set to one row per distinct user and email. Once this is done, there will be only one row per The other columns in the Select statement need to have a
Min()
, [Max()
would work just as well), only in order to satisfy the syntacical requirement mentioned in bold italics above.. In your case, there will only be one non-null row in the set so taking the Min(), or the Max() is necessary only because of the syntax requirement...You need to
pivot
the data.And the result of running the above query is this: