I am working on a legacy database and need to develop a SQL query to give to a customer. As a legacy database, it was not designed with this type of query in mind. I've simplified the two tables I need to select from to make an easier to understand example. I have a "long table", and need to make it "wide". I have tried working with PIVOT
but have encountered two issues:
- There is nothing to aggregate- it's just a simple matrix transformation.
- I don't know the number of columns I need to add much less the actual values of the column headers.
I need a SQL query that will output the results as follows for the given schema below:
| [Id] | [Author] | [PublishedYear] | [Title] |
-------------------------------------------------
| 1 | 'Robert' | '2017' | null |
| 2 | 'Tim' | null | null |
| 3 | null | '2018' | null |
| 4 | null | null | 'Winning' |
SQL to Build Example:
CREATE TABLE [Book] (
[Id] int
);
INSERT INTO [Book] ([Id])
VALUES (1);
INSERT INTO [Book] ([Id])
VALUES (2);
INSERT INTO [Book] ([Id])
VALUES (3);
INSERT INTO [Book] ([Id])
VALUES (4);
CREATE TABLE [BookProperty] (
[Name] VARCHAR(100),
[Value] VARCHAR(100),
[BookId] int
);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Robert', 1);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Tim', 2);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2018', 3);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2017', 1);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Title', 'Winning', 4);
You can try to use Condition Aggregate function, MAX
with CASE WHEN
and GROUP BY
CREATE TABLE [Book] (
[Id] int
);
INSERT INTO [Book] ([Id])
VALUES (1);
INSERT INTO [Book] ([Id])
VALUES (2);
INSERT INTO [Book] ([Id])
VALUES (3);
INSERT INTO [Book] ([Id])
VALUES (4);
CREATE TABLE [BookProperty] (
[Name] VARCHAR(100),
[Value] VARCHAR(100),
[BookId] int
);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Robert', 1);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Tim', 2);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2018', 3);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2017', 1);
INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Title', 'Winning', 4);
Query 1:
SELECT id,
MAX(CASE WHEN Name = 'Author' THEN Value END) as 'Author',
MAX(CASE WHEN Name = 'PublishedYear' THEN Value END) as 'PublishedYear',
MAX(CASE WHEN Name = 'Title' THEN Value END) as 'Title'
FROM [Book] b INNER JOIN [BookProperty] bp
on b.Id = bp.BookId
GROUP BY id
Results:
| id | Author | PublishedYear | Title |
|----|--------|---------------|---------|
| 1 | Robert | 2017 | (null) |
| 2 | Tim | (null) | (null) |
| 3 | (null) | 2018 | (null) |
| 4 | (null) | (null) | Winning |
EDIT
You can try to use dynamic pivot to make you expect.
use STUFF
function dynamic create Condition Aggregate function execute statement then use execute dynamic excute your sql.
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX);
SET @cols = STUFF((SELECT DISTINCT ',MAX(CASE WHEN Name = ''' + Name +''' THEN [Value] END) as ''' + Name + ''' '
FROM [Book] b INNER JOIN [BookProperty] bp
on b.Id = bp.BookId
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT ID, '+ @cols + ' FROM [Book] b INNER JOIN [BookProperty] bp on b.Id = bp.BookId GROUP BY id'
execute(@query)
The challenge is that you have an EAV and are trying to pivot the values. This is one of the ugly things of the design you have to work with. You are going to need dynamic sql. I would still use conditional aggregation (aka cross tab) but you need to make a dynamic version. Here is an example that isn't too very different from what you are wrestling with that I wrote a couple years ago. You should be able to tweak this to your dataset pretty easily. If you need help modifying this let me know and I can help out.
if OBJECT_ID('tempdb..#Something') is not null
drop table #Something
if OBJECT_ID('tempdb..#ColumnNames') is not null
drop table #ColumnNames
create table #Something
(
REQUEST_ID int
, ITEM_ID int
, ErrorType varchar(50)
)
insert #Something values
(6019, 5054257, 'Under construction')
, (6024, 5054712, 'KSCV417W')
, (6024, 5054713, 'Under construction')
, (6024, 5054715, 'Under construction')
, (6029, 5164288, 'KSAC680E')
, (6029, 5164289, 'KSAC680E')
, (6029, 5164290, 'KSAC680E')
, (6029, 5164292, 'KSAC680E')
create table #ColumnNames --we need some way to get the Column Names in a table so we can join to this to generate the dynamic columns.
(
ColNum int identity
, ColName sysname
)
insert #ColumnNames
select distinct ErrorType
from #Something
order by ErrorType desc
declare @StaticPortion nvarchar(2000) =
'with OrderedResults as
(
select REQUEST_ID
, ErrorType
, GroupCount = count(*)
, ColNum
from #Something s
join #ColumnNames cn on cn.ColName = s.ErrorType
group by REQUEST_ID
, ErrorType
, ColNum
)
select REQUEST_ID';
declare @DynamicPortion nvarchar(max) = '';
with E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select @DynamicPortion = @DynamicPortion +
', MAX(Case when ColNum = ' + CAST(N as varchar(6)) + ' then convert(varchar(4), GroupCount) end) as [' + cn.ColName + ']'
from cteTally t
join #ColumnNames cn on cn.ColNum = t.N
where t.N <=
(
select top 1 Count(*)
from #Something
group by REQUEST_ID
order by COUNT(*) desc
)
declare @FinalStaticPortion nvarchar(2000) = ' from OrderedResults Group by REQUEST_ID order by REQUEST_ID';
declare @SqlToExecute nvarchar(max) = @StaticPortion + @DynamicPortion + @FinalStaticPortion;
--select @SqlToExecute --You can use this to help debug the dynamic sql
exec sp_executesql @SqlToExecute