Moving one column to the end of another

2019-06-09 13:32发布

问题:

I have a bunch of excel files that are formatted in a pretty weird way that I would like to make an automated import script for, so I can easily get this data into a sheet with the correct formatting.

It has hourly values for each month in the first 12 columns, then the date along with hours in the next 12 columns.

What I would like to do is be able to get this data in to a table where the first column has the date and hour (in excel format) and the second one containing the data. My thought was to record a macro during the process of adjusting the data with power query and then repeating the macro with multiple files. However, I can't seem to find a good way to make "Column 2"'s data moved to the end of "Column 1" and repeating that for both values and the dates using Power Query. Any pointers?

Also notice, column 1s length differs from column 2 since January has more values than February. However, column 1s lenght is the same as column 13, column 2s length same as 14 and so on.

I have uploaded a sample file here

回答1:

Create a blank query from scratch (on my machine I do this in Excel via: Data > Get Data > From Other Sources > Blank Query).

Click Home > Advanced Editor, copy-paste the code below and change this line folderPath = "C:\Users\user\", to the path of the parent folder that contains the Excel files. Then click Close & Load.

As the data is being imported from multiple workbooks (and possibly multiple sheets), the first two columns of the loaded table should be the workbook and worksheet that that row of data came from. (If you want to get rid of the first two columns, edit the query.)

let
    folderPath = "C:\Users\user\",
    getDataFromSheet = (sheetData as table) =>
        let 
            promoteHeaders = Table.PromoteHeaders(sheetData, [PromoteAllScalars=true]),
            standardiseHeaders =
                let
                    headers = Table.ColumnNames(promoteHeaders),
                    zipWithLowercase = List.Zip({headers, List.Transform(headers, Text.Lower)}),
                    renameAsLowercase = Table.RenameColumns(promoteHeaders, zipWithLowercase)
                in
                    renameAsLowercase,
            emptyTable = Table.FromColumns({{},{}}, {"Date", "Value"}),
            monthsToLoopOver = {"januari", "februari", "mars", "april", "maj", "juni", "juli", "augusti", "september", "oktober", "november", "december"},    
            appendEachMonth = List.Accumulate(monthsToLoopOver, emptyTable, (tableState, currentMonth) =>
                let
                    selectColumns = Table.SelectColumns(standardiseHeaders, {currentMonth & "tim", currentMonth}, MissingField.UseNull),
                    renameColumns = Table.RenameColumns(selectColumns, {{currentMonth & "tim", "Date"}, {currentMonth, "Value"}}),
                    appendToTable = Table.Combine({tableState, renameColumns})
                in
                    appendToTable
                ),
            tableOrNull = if List.Contains(Table.ColumnNames(standardiseHeaders), "januari") then appendEachMonth else null
        in
            tableOrNull,
    getDataFromWorkbook = (filePath as text) =>
        let
            workbookContents = Excel.Workbook(File.Contents(filePath)),
            sheetsOnly = Table.SelectRows(workbookContents, each [Kind] = "Sheet"),
            invokeFunction = Table.AddColumn(sheetsOnly, "f", each getDataFromSheet([Data]), type table),
            appendAndExpand =
                let
                    selectColumnsAndRows = Table.SelectColumns(Table.SelectRows(invokeFunction, each not ([f] is null)), {"Name", "f"}),
                    renameColumns = Table.RenameColumns(selectColumnsAndRows, {{"Name", "Sheet"}}),
                    expandColumn = Table.ExpandTableColumn(renameColumns, "f", {"Date", "Value"})
                in
                    expandColumn
        in
            appendAndExpand,
    filesInFolder = Folder.Files(folderPath),
    validFilesOnly = Table.SelectRows(filesInFolder, each [Extension] = ".xlsx"),
    invokeFunction = Table.AddColumn(validFilesOnly, "f", each getDataFromWorkbook([Folder Path] & [Name])),
    appendAndExpand = 
        let
            selectRowsAndColumns = Table.SelectColumns(Table.SelectRows(invokeFunction, each not ([f] is null)), {"Name", "f"}),
            renameColumns = Table.RenameColumns(selectRowsAndColumns, {{"Name", "Workbook"}}),
            expandColumn = Table.ExpandTableColumn(renameColumns, "f", {"Sheet", "Date", "Value"})
        in
            expandColumn,
    excludeBlankDates = Table.SelectRows(appendAndExpand, each not (Text.StartsWith([Date], " "))),
    transformTypes =
        let
            dateAndHour = Table.TransformColumns(excludeBlankDates, {{"Date", each Text.Split(_, " ")}}),
            changeTypes = Table.TransformColumns(dateAndHour, {{"Workbook", Text.From, type text}, {"Sheet", Text.From, type text}, {"Date", each DateTime.From(_{0}) + #duration(0, Number.From(_{1}), 0, 0), type datetime}, {"Value", Number.From, type number}})
        in
            changeTypes
in
    transformTypes
  • For reliability and robustness, it would be good if you create a folder and put all Excel files (that need restructuring) into that folder -- and ensure nothing else goes into that folder (not even the file that will be doing the importing/restructuring).

  • If you can't do this for whatever reason, then click the validFilesOnly step whilst in the Query Editor and amend the filter criteria, such that the table only includes files you want restructured.



回答2:

You can do the formatting pretty quickly by writing your own macro in VBA. Then you can create another macro to run the formatting macro on multiple files within a folder.

Run same excel macro on multiple excel files

Here is an example of a macro that will reformat your data closer to what you are looking for.

Sub FormatBlad()
    ' create a new sheet and rename it
    Sheets.Add After:=ActiveSheet
    Sheets(Sheets.Count).Name = "Formatted"

    ' set helper variables
    Dim blad As Worksheet
    Dim format As Worksheet
    Set blad = Sheets("Blad1")
    Set ft = Sheets("Formatted")

    Dim blad_row_num As Integer
    Dim ft_row_num As Integer
    Dim month_offset As Integer
    blad_row_num = 2
    ft_row_num = 2
    month_offset = 13 ' column N - 1

    ' set column headers in formatted sheet
    ft.Range("A1").Value = "Date"
    ft.Range("B1").Value = "Value"

    ' loop through months
    For i = 1 To 12
        blad_row_num = 2
        While blad.Cells(blad_row_num, i).Value <> ""
            ft.Cells(ft_row_num, 1).Value = blad.Cells(blad_row_num, month_offset + i).Value
            ft.Cells(ft_row_num, 2).Value = blad.Cells(blad_row_num, i).Value
            blad_row_num = blad_row_num + 1
            ft_row_num = ft_row_num + 1
        Wend
    Next i
End Sub