I have a table in MySQL with the following fields:
id, company_name, year, state
There are multiple rows for the same customer and year, here is an example of the data:
id | company_name | year | state
----------------------------------------
1 | companyA | 2008 | 1
2 | companyB | 2009 | 2
3 | companyC | 2010 | 3
4 | companyB | 2009 | 1
5 | companyC | NULL | 3
I am trying to create a view from this table to show one company per row (i.e. GROUP BY
pubco_name) where the state is the highest for a given year.
Here is an example of the view I am trying to create:
id | cuompany_name | NULL | 2008 | 2009 | 2010
--------------------------------------------------
1 | companyA | NULL | 1 | NULL | NULL
2 | companyB | NULL | 2 | NULL | NULL
3 | companyC | 3 | NULL | NULL | 3
There is a lot more data than this, but you can see what I am trying to accomplish.
I don't know how to select the max state for each year and group by pubco_name.
Here is the SQL I have thus far (I think we need to use CASE
and/or sub-selects
here):
SELECT
id,
company_name,
SUM(CASE WHEN year = 2008 THEN max(state) ELSE 0 END) AS 2008,
SUM(CASE WHEN year = 2009 THEN max(state) ELSE 0 END) AS 2009,
SUM(CASE WHEN year = 2010 THEN max(state) ELSE 0 END) AS 2010,
SUM(CASE WHEN year = 2011 THEN max(state) ELSE 0 END) AS 2011,
SUM(CASE WHEN year = 2012 THEN max(state) ELSE 0 END) AS 2012,
SUM(CASE WHEN year = 2013 THEN max(state) ELSE 0 END) AS 2013
FROM tbl
GROUP BY company_name
ORDER BY id DESC
Appreciate your help and thanks in advance.
You need to pivot the table but mysql does not have any such functionality of pivot
so we need to replicate its functionality
EDITED
Select
group_concat(
DISTINCT
if(year is null,
CONCAT('max(if (year is null, state, 0)) as ''NULL'' '),
CONCAT('max(if (year=''', year, ''', state, 0)) as ''',year, ''' '))
) into @sql from tbl join (SELECT @sql:='')a;
set @sql = concat('select company_name, ', @sql, 'from tbl group by company_name;');
PREPARE stmt FROM @sql;
EXECUTE stmt;
Result
| COMPANY_NAME | 2008 | 2009 | 2010 | NULL |
--------------------------------------------
| companyA | 1 | 0 | 0 | 0 |
| companyB | 0 | 2 | 0 | 0 |
| companyC | 0 | 0 | 3 | 3 |
SQL FIDDLE
There are 2 approaches to solve your problem
1. create case for each year, which is not possible in your case as we are dealing with year
2. generate the query dynamically so that we get proper columns as per your need.
I have given solution according to the second solution where I am generating the query and storing it in @sql
variable. In the fiddle I have printed the contents of @sql
before executing it.
select company_name, max(if (year='2008', state, 0)) as '2008' ,max(if (year='2009', state, 0)) as '2009' ,max(if (year='2010', state, 0)) as '2010' ,max(if (year is null, state, 0)) as 'NULL' from tbl group by company_name;
For more information regarding group_concat()
go through the link
GROUP_CONCAT and
USER DEFINED VARIABLE
Hope this helps..
Please see the page linked in the answer to this question.
Note that when you do this, you must specify ahead of time how many columns you want in your output.
In response to the comment below, here is a simple/ basic implementation that reproduces the result table above (except for the ID column; having it makes no sense, as each row in the result can summarize more than one row in the input table)
SELECT
`company_name`,
NULLIF(SUM(CASE WHEN `t3`.`year` IS NULL THEN `t3`.`state` ELSE 0 END), 0) AS `null`,
NULLIF(SUM(CASE WHEN `t3`.`year` = 2008 THEN `t3`.`state` ELSE 0 END), 0) AS `2008`,
NULLIF(SUM(CASE WHEN `t3`.`year` = 2009 THEN `t3`.`state` ELSE 0 END), 0) AS `2009`,
NULLIF(SUM(CASE WHEN `t3`.`year` = 2010 THEN `t3`.`state` ELSE 0 END), 0) AS `2010`
FROM
(
SELECT
`t1`.`id`,
`t1`.`company_name`,
`t1`.`year`,
`t1`.`state`
FROM `tbl` `t1`
WHERE `t1`.`state` = (
SELECT MAX(`state`)
FROM `tbl` `t2`
WHERE `t2`.`company_name` = `t1`.`company_name`
AND (`t2`.`year` IS NULL AND `t1`.`year` IS NULL OR `t2`.`year` = `t1`.`year`)
)
) `t3`
GROUP BY `t3`.`company_name`;
This uses nested queries: the inner ones (with the t1
and t2
aliases) find the row with the maximum state for each year and company (and which will break unless you can be sure that this is unique!), and the outer one t3
does the pivot.
I would test this thoroughly to ensure performance is acceptable on real data.