Compare each column's contents with all other

2019-08-16 20:32发布

Given this table:

enter image description here

I'd like to derive this table:

enter image description here

...sort of like a mileage chart in a map book.

I'm trying to create a cross-table comparison of the words in each of the columns, against all of the other columns' words, to show how many matches there are between them.

For instance, comparing Column 1 against Column2 might yield 4 matches. The yellow, bold outlined cells are the matches.

enter image description here

And here's how I count them:

enter image description here

I'm thinking there might be an 'easy' way to accomplish this using Power Query. Is there?

(Oh...and by the way...the solution I'm looking for should not expect a static number of input columns: i.e., it should accommodate for more columns or less columns to be used in the input comparison set.)

Thanks.

标签: powerquery m
1条回答
再贱就再见
2楼-- · 2019-08-16 20:49

No, there is no easy way, but it can be done. However, I get different results. My interpretation of your logic is: for each column combination, the number of occurrences of each common word in 1 column must be multiplied with the number of occurrences in the other column. These are my results:

enter image description here

And this is my query code:

let
    Source = Table1,
    ColumnNames = Table.ColumnNames(Source),
    Tabled = Table.FromColumns({ColumnNames}, type table[Columns = text]),
    AddedColumns2 = Table.AddColumn(Tabled, "Columns2", each ColumnNames, type {text}),
    ExpandedColumns2 = Table.ExpandListColumn(AddedColumns2, "Columns2"),
    CommonWords = 
        Table.AddColumn(ExpandedColumns2, 
                        "DistinctIntersect", 
                        each if [Columns] = [Columns2]
                           then {} 
                           else List.Distinct(List.Intersect({Table.Column(Source,[Columns]),
                                                              Table.Column(Source,[Columns2])}))),
    AddedCount = 
        Table.AddColumn(CommonWords,
                        "Count", 
                        (This) => List.Sum({0}&List.Transform(This[DistinctIntersect],
                                                   each List.Count(List.PositionOf(Table.Column(Source,This[Columns]),_,2)) *
                                                        List.Count(List.PositionOf(Table.Column(Source,This[Columns2]),_,2)))),
                       Int64.Type),
    RemovedColumns = Table.RemoveColumns(AddedCount,{"DistinctIntersect"}),
    PivotedColumn = Table.Pivot(RemovedColumns, List.Distinct(RemovedColumns[Columns2]), "Columns2", "Count")
in
    PivotedColumn
查看更多
登录 后发表回答