MySQL Pivot or Excel Solution

2019-08-01 09:50发布

问题:

I have a table:

CREATE TABLE `Issues` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

I have another table:

CREATE TABLE `Attachments` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `issue_id` int(11) DEFAULT NULL,
  `attachment` text,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

How can I get the data to look like this:

issue_id    title    attachment1    attachment2    attachment3
--------------------------------------------------------------
1           T1       a1.png         a2.png
2           T2
3           T3       b4.gif         xyz.doc        ttt.file

The problem I can't figure out is how to get the dynamic set of attachments into a dynamic column grouped by issue. I have determined that the maximum amount of attachments for one issue is 12, but the total per ticket can be anywhere from 0-12. I'm stumped...

I've tried this MySQL pivot row into dynamic number of columns, but can't make sense of it in my case because I'm building dynamic columns based on total matches per record...

Any help would be greatly appreciate. Please let me know if this doesn't make sense.

Nino

回答1:

Here is one way to accomplish this if you know the max number is 12. It uses MAX and CASE, getting the Row Number for each attachment.

SELECT 
    I.Id issue_id, 
    I.title, 
    MAX(CASE WHEN d.row_number = 1 THEN D.attachment END) attachment1,
    MAX(CASE WHEN d.row_number = 2 THEN D.attachment END) attachment2,
    MAX(CASE WHEN d.row_number = 3 THEN D.attachment END) attachment3,
    MAX(CASE WHEN d.row_number = 4 THEN D.attachment END) attachment4,
    MAX(CASE WHEN d.row_number = 5 THEN D.attachment END) attachment5,
    MAX(CASE WHEN d.row_number = 6 THEN D.attachment END) attachment6,
    MAX(CASE WHEN d.row_number = 7 THEN D.attachment END) attachment7,
    MAX(CASE WHEN d.row_number = 8 THEN D.attachment END) attachment8,
    MAX(CASE WHEN d.row_number = 9 THEN D.attachment END) attachment9,
    MAX(CASE WHEN d.row_number = 10 THEN D.attachment END) attachment10,
    MAX(CASE WHEN d.row_number = 11 THEN D.attachment END) attachment11,
    MAX(CASE WHEN d.row_number = 12 THEN D.attachment END) attachment12
FROM Issues I
  LEFT JOIN (
  SELECT 
      a.issue_id, 
      @running:=if(@previous=a.issue_id,@running,0) + 1 as row_number,
      @previous:=a.issue_id,
      a.attachment
  FROM Attachments a
      JOIN    (SELECT @previous := 0) r
    ORDER BY a.issue_id, a.attachment
  ) D ON I.ID = D.issue_id
GROUP BY I.Id, I.Title

And here is the SQL Fiddle.

I had to make an edit to make the rownumber reset per group. Should be working now. Also, per @spencer7593's great comment, I've updated the query slightly.

--EDIT--

In response to OP's comment about needing dynamic results, this should work:

SET @sql = NULL;

SELECT
  GROUP_CONCAT(DISTINCT
    CONCAT(
      'MAX(IF(d.row_number = ', d.row_number, ',D.attachment,NULL)) AS attachment', d.row_number)
  ) INTO @sql
FROM Issues I
  LEFT JOIN (
  SELECT 
      a.issue_id, 
      @running:=if(@previous=a.issue_id,@running,0) + 1 as row_number,
      @previous:=a.issue_id,
      a.attachment
  FROM Attachments a
      JOIN    (SELECT @previous := 0) r
    ORDER BY a.issue_id, a.attachment
  ) D ON I.ID = D.issue_id
;

SET @sql = CONCAT('SELECT I.Id issue_id, 
                          I.title, ', @sql, ' 
                  FROM Issues I
                  LEFT JOIN (
                  SELECT 
                      a.issue_id, 
                      @running:=if(@previous=a.issue_id,@running,0) + 1 as row_number,
                      @previous:=a.issue_id,
                      a.attachment
                  FROM Attachments a
                      JOIN    (SELECT @previous := 0) r
                    ORDER BY a.issue_id, a.attachment
                  ) D ON I.ID = D.issue_id
                GROUP BY I.Id, I.Title');

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

And here is the SQL Fiddle.



回答2:

The set of columns returned by a SELECT statement are statically defined. A SELECT statement can't return a "variable" number of columns.

It is possible to produce the resultset you show, if you can define the columns to be returned, which in your case, means defining a maximum number of attachment values (columns) that will be returned on a row.

One approach to getting that resultset is to use a correlated subquery in the SELECT list to return the first, second, third, etc. occurrence of attachment values.

SELECT i.id
     , i.title
     , ( SELECT a1.attachment
           FROM `Attachments` a1
          WHERE a1.issue_id = i.id
          ORDER BY a1.id
          LIMIT 0,1
        ) AS attachment1
     , ( SELECT a2.attachment
           FROM `Attachments` a2
          WHERE a2.issue_id = i.id
          ORDER BY a2.id
          LIMIT 1,1
        ) AS attachment2
     , ( SELECT a3.attachment
           FROM `Attachments` a3
          WHERE a3.issue_id = i.id
          ORDER BY a3.id
          LIMIT 2,1
        ) AS attachment3
  FROM `Issues` i
 ORDER BY i.id

To return your maximum number of attachment, you would need to extend that...

     , ( SELECT a4.attachment
           FROM `Attachments` a4
          WHERE a4.issue_id = i.id
          ORDER BY a4.id
          LIMIT 3,1
        ) AS attachment4

The purpose of the ORDER BY is to make the resultset from the query deterministic (absent the ORDER BY, MySQL can return rows in whatever order it wants.)

The purpose of the LIMIT clause is to specify that only 1 row be returned. LIMIT 0,1 specifies that 1 row is to be returned, starting with the first row (0). LIMIT 1,1 returns only the second row.


This is not the only approach, and may not be the most efficient. It can work reasonably for a small number of rows returned from the outer query (in your case, from the Issues table. The "nested loops" plan that is generated for this statement can be resource intensive (i.e. slow) for a large set.

For best performance, you'll likely want an index...

ON `Attachments` (issue_id, id) 

or at least on

ON `Attachments` (issue_id)

If you truly need a "dynamic" number returned, you're going to be better served returning the attachment values as separate rows, and processing the resultset returned from the SQL statement on the client side.