Create a Summary View in MySQL by pivoting row int

Posted 2019-05-08 06:18

Question:

I have a table in MySQL with the following fields:

id, company_name, year, state

There can be multiple rows for the same company and year. Here is an example of the data:

    id | company_name  | year | state
    ---------------------------------
    1  | companyA      | 2008 | 1
    2  | companyB      | 2009 | 2
    3  | companyC      | 2010 | 3
    4  | companyB      | 2009 | 1
    5  | companyC      | NULL | 3

I am trying to create a view from this table that shows one company per row (i.e. GROUP BY company_name), with the highest state for each year.

Here is an example of the view I am trying to create:

    id | company_name  | NULL | 2008 | 2009 | 2010
    -----------------------------------------------
    1  | companyA      | NULL | 1    | NULL | NULL
    2  | companyB      | NULL | NULL | 2    | NULL
    3  | companyC      | 3    | NULL | NULL | 3

There is a lot more data than this, but you can see what I am trying to accomplish.

I don't know how to select the max state for each year and group by company_name. Here is the SQL I have thus far (I think we need to use CASE and/or sub-selects here):

SELECT
id,
company_name,
SUM(CASE WHEN year = 2008 THEN max(state) ELSE 0 END) AS 2008,
SUM(CASE WHEN year = 2009 THEN max(state) ELSE 0 END) AS 2009,
SUM(CASE WHEN year = 2010 THEN max(state) ELSE 0 END) AS 2010,
SUM(CASE WHEN year = 2011 THEN max(state) ELSE 0 END) AS 2011,
SUM(CASE WHEN year = 2012 THEN max(state) ELSE 0 END) AS 2012,
SUM(CASE WHEN year = 2013 THEN max(state) ELSE 0 END) AS 2013
FROM tbl
GROUP BY company_name
ORDER BY id DESC

Appreciate your help and thanks in advance.

Answer 1:

You need to pivot the table, but MySQL has no built-in pivot functionality, so we have to replicate it ourselves.

EDITED

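-- Build one "max(if(...)) as '<year>'" column expression per distinct year (NULL included).
-- Note: GROUP_CONCAT output is capped by group_concat_max_len (1024 by default).
SELECT
  GROUP_CONCAT(
    DISTINCT
       IF(year IS NULL,
          CONCAT('max(if (year is null, state, 0)) as ''NULL'' '),
          CONCAT('max(if (year=''', year, ''', state, 0)) as ''', year, ''' '))
    ) INTO @sql FROM tbl;
-- Wrap the generated column list in the final pivot query, then run it.
SET @sql = CONCAT('select company_name, ', @sql, 'from tbl group by company_name;');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;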

Result

| COMPANY_NAME | 2008 | 2009 | 2010 | NULL |
--------------------------------------------
|     companyA |    1 |    0 |    0 |    0 |
|     companyB |    0 |    2 |    0 |    0 |
|     companyC |    0 |    0 |    3 |    3 |

SQL FIDDLE

There are two approaches to this problem:

1. Write one CASE expression per year by hand. That is not practical here, because the years are data values that are not known in advance.
2. Generate the query dynamically, so that the columns follow whatever years are present in the data.

My solution follows the second approach: it generates the query and stores it in the @sql variable. In the fiddle I printed the contents of @sql before executing it:

select company_name, max(if (year='2008', state, 0)) as '2008' ,max(if (year='2009', state, 0)) as '2009' ,max(if (year='2010', state, 0)) as '2010' ,max(if (year is null, state, 0)) as 'NULL' from tbl group by company_name; 

For more information, see the MySQL documentation on GROUP_CONCAT() and on user-defined variables.
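As a rough illustration of the dynamic approach, here is a minimal Python sketch that builds the same kind of pivot query from the distinct years. It is not MySQL code: it uses the standard-library sqlite3 module as a stand-in database, loads only the five sample rows from the question, and build_pivot_sql is a hypothetical helper name. It also uses MAX(CASE ...) where the MySQL version uses max(if(...)), since SQLite has no IF() by that name in older versions.

```python
import sqlite3

def build_pivot_sql(years):
    """Return a pivot query with one MAX(CASE ...) column per year (None means a NULL year)."""
    cols = []
    for y in years:
        if y is None:
            cols.append('MAX(CASE WHEN year IS NULL THEN state ELSE 0 END) AS "NULL"')
        else:
            # int() ensures only a number is ever spliced into the SQL text
            cols.append(f'MAX(CASE WHEN year = {int(y)} THEN state ELSE 0 END) AS "{int(y)}"')
    return ("SELECT company_name, " + ", ".join(cols) +
            " FROM tbl GROUP BY company_name ORDER BY company_name")

# In-memory stand-in for the MySQL table, filled with the sample rows from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, company_name TEXT, year INTEGER, state INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", [
    (1, "companyA", 2008, 1),
    (2, "companyB", 2009, 2),
    (3, "companyC", 2010, 3),
    (4, "companyB", 2009, 1),
    (5, "companyC", None, 3),
])

# Step 1: discover the columns. SQLite sorts NULL first in ascending order.
years = [row[0] for row in conn.execute("SELECT DISTINCT year FROM tbl ORDER BY year")]

# Step 2: generate the query text and run it.
rows = conn.execute(build_pivot_sql(years)).fetchall()
for row in rows:
    print(row)
```

The column order here is NULL, 2008, 2009, 2010, because the years are sorted before the query is generated. As in the MySQL version, a company with no row for a year gets 0 in that column.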

Hope this helps.



Answer 2:

Please see the page linked in the answer to this question.

Note that when you do this, you must specify ahead of time how many columns you want in your output.

In response to the comment below, here is a simple, basic implementation that reproduces the result table above (except for the ID column, which makes no sense here because each row in the result can summarize more than one row in the input table):

SELECT
   `company_name`,
   NULLIF(SUM(CASE WHEN `t3`.`year` IS NULL THEN `t3`.`state` ELSE 0 END), 0) AS `null`,
   NULLIF(SUM(CASE WHEN `t3`.`year` = 2008 THEN `t3`.`state` ELSE 0 END), 0) AS `2008`,
   NULLIF(SUM(CASE WHEN `t3`.`year` = 2009 THEN `t3`.`state` ELSE 0 END), 0) AS `2009`,
   NULLIF(SUM(CASE WHEN `t3`.`year` = 2010 THEN `t3`.`state` ELSE 0 END), 0) AS `2010`
FROM
(
   SELECT
   `t1`.`id`,
   `t1`.`company_name`,
   `t1`.`year`,
   `t1`.`state`
   FROM `tbl` `t1`
   WHERE `t1`.`state` = (
      SELECT MAX(`state`)
      FROM `tbl` `t2`
      WHERE `t2`.`company_name` = `t1`.`company_name`
      AND (`t2`.`year` IS NULL AND `t1`.`year` IS NULL OR `t2`.`year` = `t1`.`year`)
   )
) `t3`
GROUP BY `t3`.`company_name`;

This uses nested queries. The inner ones (with the t1 and t2 aliases) keep only the rows holding the maximum state for each company and year, and the outer query over t3 does the pivot. Be aware that this breaks unless the maximum row is unique: if two rows tie for the maximum, the outer SUM adds both of them.

I would test this thoroughly to ensure performance is acceptable on real data.
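One possible way around the uniqueness caveat, offered as a sketch and not as part of the answer above, is to take MAX directly over a CASE expression with no ELSE branch. Non-matching rows then contribute NULL, which MAX ignores, so tied maxima are not added together and a year with no rows stays NULL. The example below runs against SQLite through Python's sqlite3 module as a stand-in for MySQL. Row 6 is a hypothetical extra row added only to create a tie, and the column aliases use double quotes where MySQL would use backticks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, company_name TEXT, year INTEGER, state INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", [
    (1, "companyA", 2008, 1),
    (2, "companyB", 2009, 2),
    (3, "companyC", 2010, 3),
    (4, "companyB", 2009, 1),
    (5, "companyC", None, 3),
    (6, "companyB", 2009, 2),  # hypothetical row: ties with id 2 for the 2009 maximum
])

# CASE without ELSE yields NULL for non-matching rows, and MAX skips NULLs,
# so no subquery is needed and ties cannot inflate the result.
pivot_sql = """
SELECT company_name,
       MAX(CASE WHEN year IS NULL THEN state END) AS "null",
       MAX(CASE WHEN year = 2008 THEN state END) AS "2008",
       MAX(CASE WHEN year = 2009 THEN state END) AS "2009",
       MAX(CASE WHEN year = 2010 THEN state END) AS "2010"
FROM tbl
GROUP BY company_name
ORDER BY company_name
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

With the tied row present, companyB still shows 2 for 2009, where a SUM over the tied rows would give 4. The year columns are still hard-coded, so the list has to be known ahead of time, as noted earlier in this answer.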