How do I use SQL to PIVOT a table without an aggre

2019-09-19 20:39发布

问题:

I am working on a legacy database and need to develop a SQL query to give to a customer. As a legacy database, it was not designed with this type of query in mind. I've simplified the two tables I need to select from to make an easier to understand example. I have a "long table", and need to make it "wide". I have tried working with PIVOT but have encountered two issues:

  1. There is nothing to aggregate- it's just a simple matrix transformation.
  2. I don't know the number of columns I need to add much less the actual values of the column headers.

I need a SQL query that will output the results as follows for the given schema below:

| [Id] | [Author] | [PublishedYear] | [Title]   |
-------------------------------------------------
| 1    | 'Robert' | '2017'          | null      |
| 2    | 'Tim'    | null            | null      |
| 3    | null     | '2018'          | null      |
| 4    | null     | null            | 'Winning' |

SQL to Build Example:

CREATE TABLE [Book] (
    [Id] int
);

INSERT INTO [Book] ([Id])
VALUES (1);

INSERT INTO [Book] ([Id])
VALUES (2);

INSERT INTO [Book] ([Id])
VALUES (3);

INSERT INTO [Book] ([Id])
VALUES (4);

CREATE TABLE [BookProperty] (
    [Name] VARCHAR(100),
    [Value] VARCHAR(100),
    [BookId] int
);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Robert', 1);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Tim', 2);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2018', 3);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2017', 1);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Title', 'Winning', 4);

回答1:

You can try to use Condition Aggregate function, MAX with CASE WHEN and GROUP BY

CREATE TABLE [Book] (
    [Id] int
);

INSERT INTO [Book] ([Id])
VALUES (1);

INSERT INTO [Book] ([Id])
VALUES (2);

INSERT INTO [Book] ([Id])
VALUES (3);

INSERT INTO [Book] ([Id])
VALUES (4);

CREATE TABLE [BookProperty] (
    [Name] VARCHAR(100),
    [Value] VARCHAR(100),
    [BookId] int
);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Robert', 1);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Author', 'Tim', 2);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2018', 3);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('PublishedYear', '2017', 1);

INSERT INTO [BookProperty] ([Name], [Value], [bookId])
VALUES ('Title', 'Winning', 4);

Query 1:

SELECT id,
       MAX(CASE WHEN Name = 'Author' THEN Value END) as 'Author',
       MAX(CASE WHEN Name = 'PublishedYear' THEN Value END) as 'PublishedYear',
       MAX(CASE WHEN Name = 'Title' THEN Value END) as 'Title'
FROM [Book] b INNER JOIN [BookProperty] bp
on b.Id = bp.BookId
GROUP BY id

Results:

| id | Author | PublishedYear |   Title |
|----|--------|---------------|---------|
|  1 | Robert |          2017 |  (null) |
|  2 |    Tim |        (null) |  (null) |
|  3 | (null) |          2018 |  (null) |
|  4 | (null) |        (null) | Winning |

EDIT

You can try to use dynamic pivot to make you expect.

use STUFF function dynamic create Condition Aggregate function execute statement then use execute dynamic excute your sql.

DECLARE @cols AS NVARCHAR(MAX),
        @query  AS NVARCHAR(MAX);



SET @cols = STUFF((SELECT DISTINCT ',MAX(CASE WHEN Name = '''  + Name  +''' THEN [Value] END) as ''' + Name + ''' '
             FROM [Book] b INNER JOIN [BookProperty] bp
             on b.Id = bp.BookId 
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')



set @query = 'SELECT ID, '+  @cols  + ' FROM [Book] b INNER JOIN [BookProperty] bp on b.Id = bp.BookId GROUP BY id'

execute(@query)


回答2:

The challenge is that you have an EAV and are trying to pivot the values. This is one of the ugly things of the design you have to work with. You are going to need dynamic sql. I would still use conditional aggregation (aka cross tab) but you need to make a dynamic version. Here is an example that isn't too very different from what you are wrestling with that I wrote a couple years ago. You should be able to tweak this to your dataset pretty easily. If you need help modifying this let me know and I can help out.

if OBJECT_ID('tempdb..#Something') is not null
    drop table #Something

if OBJECT_ID('tempdb..#ColumnNames') is not null
    drop table #ColumnNames

create table #Something
(
    REQUEST_ID int
    , ITEM_ID int
    , ErrorType varchar(50)
)

insert #Something values
(6019, 5054257, 'Under construction')
, (6024, 5054712, 'KSCV417W')
, (6024, 5054713, 'Under construction')
, (6024, 5054715, 'Under construction')
, (6029, 5164288, 'KSAC680E')
, (6029, 5164289, 'KSAC680E')
, (6029, 5164290, 'KSAC680E')
, (6029, 5164292, 'KSAC680E')

create table #ColumnNames --we need some way to get the Column Names in a table so we can join to this to generate the dynamic columns.
(
    ColNum int identity
    , ColName sysname
)

insert #ColumnNames
select distinct ErrorType
from #Something
order by ErrorType desc

declare @StaticPortion nvarchar(2000) = 
'with OrderedResults as
(
    select REQUEST_ID
        , ErrorType
        , GroupCount = count(*)
        , ColNum
    from #Something s
    join #ColumnNames cn on cn.ColName = s.ErrorType
    group by REQUEST_ID
        , ErrorType
        , ColNum
)
select REQUEST_ID';

declare @DynamicPortion nvarchar(max) = '';


with E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS 
(
    SELECT  ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)

select @DynamicPortion = @DynamicPortion + 
    ', MAX(Case when ColNum = ' + CAST(N as varchar(6)) + ' then convert(varchar(4), GroupCount) end) as [' + cn.ColName + ']'
from cteTally t
join #ColumnNames cn on cn.ColNum = t.N
where t.N <= 
(
    select top 1 Count(*)
    from #Something
    group by REQUEST_ID
    order by COUNT(*) desc
)

declare @FinalStaticPortion nvarchar(2000) = ' from OrderedResults Group by REQUEST_ID order by REQUEST_ID';

declare @SqlToExecute nvarchar(max) = @StaticPortion + @DynamicPortion + @FinalStaticPortion;

--select @SqlToExecute --You can use this to help debug the dynamic sql

exec sp_executesql @SqlToExecute