Excel PowerQuery: How to unpivot or transpose a hu

2020-03-31 08:44发布

问题:

I have this table that looks similar to this:

And I want to transform it so that it looks like this:

The idea is to unpivot( or transpose) the table so that it can be feed into other BI tools, and be readable for analysis.

I have around 20 tables like this with 100+ cols, so of course to do it manually is nearly impossible.

How do I get this done with PowerQuery? I have tried using unpivot feature, but I am stuck as it displayed NYC1, NYC2, etc. VBA, macros don't work also. Any other suggestions are appreciated, but I am at my wits end now. Help!

回答1:

Before loading into PowerQuery, concatenate the headers to the empty row after [Programe Name] using a delimiter (space). You can use the TEXTJOIN function to do this if you use office365. The result looks something like this (I did not copy all your data):

Import this range into PowerQuery and perform the following steps (Do not check the my table has headers checkbox)

  1. Remove top 2 rows (Home tab > Remove Rows)
  2. Use first row as headers (Home tab > Use First Row as Headers)
  3. Select the first column
  4. Unpivot other columns (dropdown menu unpivot columns on Transform Tab)
  5. Split the [Attribute] column by delimiter (space) (Home tab > Split Column)
  6. Change column names
  7. Move City column to the left (rightclick column > move > left)

The script looks like this:

let
    Source = Excel.CurrentWorkbook(){[Name="table"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type any}, {"Column3", type any}, {"Column4", type any}, {"Column5", type any}, {"Column6", type any}, {"Column7", type any}, {"Column8", type any}, {"Column9", type any}}),
    #"Removed Top Rows" = Table.Skip(#"Changed Type",2),
    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Program Name", type text}, {"NY Budget", Int64.Type}, {"NY Revenue", Int64.Type}, {"NY Cost", Int64.Type}, {"NY Margin", Int64.Type}, {"LA Budget", Int64.Type}, {"LA Revenue", Int64.Type}, {"LA Cost", Int64.Type}, {"LA Margin", Int64.Type}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Program Name"}, "Attribute", "Value"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Attribute.1", "Attribute.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Attribute.1", type text}, {"Attribute.2", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{"Attribute.1", "City"}, {"Attribute.2", "Description"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"City", "Program Name", "Description", "Value"})
in
    #"Reordered Columns"

And this is the result (in the Power Query Editor)



回答2:

Here's a pretty generic depivot approach which handles multiple row/column headers.

Select a cell in the source table before running (note - this uses CurrentRegion so will fail if your table has completely blank rows or columns).

Sub UnpivotIt()

    Dim numRowHeaders As Long, numColHeaders As Long
    Dim numRows As Long, numCols As Long, rng As Range
    Dim rngOut As Range, r As Long, c As Long, i As Long, n As Long
    Dim arrIn, arrOut, outRow As Long

    arrIn = Selection.CurrentRegion.Value

    numRowHeaders = Application.InputBox("How many header rows?", Type:=1)
    numColHeaders = Application.InputBox("How many header columns?", Type:=1)
    Set rngOut = Application.InputBox("Select output (top-left cell)", Type:=8)

    Set rngOut = rngOut.Cells(1) 'in case >1 cells selected

    numRows = UBound(arrIn, 1)
    numCols = UBound(arrIn, 2)

    ReDim arrOut(1 To ((numRows - numRowHeaders) * (numCols - numColHeaders)), _
                 1 To (numRowHeaders + numColHeaders + 1))

    outRow = 0
    For r = (numRowHeaders + 1) To numRows
    For c = (numColHeaders + 1) To numCols
        'only copy if there's a value
        If Len(arrIn(r, c)) > 0 Then
            outRow = outRow + 1
            i = 1
            For n = 1 To numColHeaders 'copy column headers
                arrOut(outRow, i) = arrIn(r, n)
                i = i + 1
            Next n
            For n = 1 To numRowHeaders '...row headers
                arrOut(outRow, i) = arrIn(n, c)
                i = i + 1
            Next n
            arrOut(outRow, i) = arrIn(r, c) '...and the value
        End If
    Next c
    Next r

    rngOut.Resize(outRow, UBound(arrOut, 2)).Value = arrOut

End Sub