Compare array of hashes and print expected & actual

Posted 2019-07-15 04:20

I have two arrays of hashes:

actual = [{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
 {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
 {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
 {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}]
expected = [{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
 {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
 {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
 {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}]

I need to compare these two arrays and find the entries for which the column_data_type differs.

To compare them, we can directly use:

diff = actual - expected

This gives:

{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}
{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"}

What I want instead is a result that shows both the actual and the expected data type for each `column_name` that differs between the two arrays of hashes, something like:

{"column_name"=>"NONINTERESTEXPENSE", "expected_column_data_type"=>"NUMBER", "actual_column_data_type" => "VARCHAR"}
{"column_name"=>"TRANSACTIONDATE", "expected_column_data_type"=>"NUMBER","actual_column_data_type" => "TIMESTAMP" }

Tags: ruby hash
5 answers
唯我独甜
#2 · 2019-07-15 04:34

This works irrespective of the order of the hashes in your arrays.

diff = []

expected.each do |elem|
  column_name = elem['column_name']
  column_type = elem['column_data_type']
  # match is nil when actual has no column with this name
  match = actual.detect { |elem2| elem2['column_name'] == column_name }
  actual_type = match && match['column_data_type']
  if column_type != actual_type
    diff << { 'column_name' => column_name,
              'expected_column_data_type' => column_type,
              'actual_column_data_type' => actual_type }
  end
end

p diff
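
If the arrays are large, the repeated detect scan can be avoided by indexing actual by column name first. The following is only a sketch of that idea: the method name type_mismatches and the two-column sample data are made up for illustration and are not part of the answer above. A column that is missing from actual is reported with a nil actual type.

```ruby
# Hypothetical helper (the name is illustrative, not from the original answer).
def type_mismatches(expected, actual)
  # Index actual once: { column_name => column_data_type }.
  actual_types = actual.map { |c| [c['column_name'], c['column_data_type']] }.to_h

  expected.each_with_object([]) do |col, diff|
    name          = col['column_name']
    expected_type = col['column_data_type']
    actual_type   = actual_types[name] # nil when the column is absent from actual
    next if expected_type == actual_type

    diff << { 'column_name' => name,
              'expected_column_data_type' => expected_type,
              'actual_column_data_type' => actual_type }
  end
end

# Made-up sample data: same columns, listed in a different order.
sample_actual = [
  { 'column_name' => 'B', 'column_data_type' => 'VARCHAR' },
  { 'column_name' => 'A', 'column_data_type' => 'NUMBER' }
]
sample_expected = [
  { 'column_name' => 'A', 'column_data_type' => 'NUMBER' },
  { 'column_name' => 'B', 'column_data_type' => 'NUMBER' }
]

mismatches = type_mismatches(sample_expected, sample_actual)
p mismatches
```

The lookup hash trades a little memory for a single pass over each array, which only matters once the column lists get long.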
Ridiculous、
#3 · 2019-07-15 04:40
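# Turn each array into a { column_name => column_data_type } hash.
# This relies on every hash listing column_name before column_data_type.
[actual, expected].map { |a| a.map(&:values) }
                  .map(&Hash.method(:[]))
                  .reduce do |actual, expected|
                    # For names present in both hashes, keep nil when the types agree.
                    actual.merge(expected) do |k, o, n|
                      o == n ? nil : {name: k, actual: o, expected: n}
                    end
                  end.values.compact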

#⇒ [
#    [0] {
#            :name => "NONINTERESTEXPENSE",
#          :actual => "VARCHAR",
#        :expected => "NUMBER"
#    },
#    [1] {
#            :name => "TRANSACTIONDATE",
#          :actual => "TIMESTAMP",
#        :expected => "NUMBER"
#    }
# ]

The method above is easily expandable to merge N arrays (use reduce.with_index and merge with key "value_from_#{idx}".)
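
One possible shape for the N-array case is sketched below. It is an assumption about what such an extension could look like, not the answer's own code: it swaps the reduce chain for one lookup hash per array, and the name compare_many and the X/Y sample columns are invented for the example. Only the "value_from_#{idx}" key naming is taken from the remark above.

```ruby
# Hypothetical helper: report columns whose type differs across N arrays.
def compare_many(*arrays)
  # One { column_name => column_data_type } lookup per input array.
  lookups = arrays.map do |rows|
    rows.map { |row| [row['column_name'], row['column_data_type']] }.to_h
  end

  names = lookups.flat_map(&:keys).uniq

  names.each_with_object([]) do |name, out|
    types = lookups.map { |lookup| lookup[name] } # nil where a column is missing
    next if types.uniq.size == 1                  # every array agrees

    row = { name: name }
    types.each_with_index { |type, idx| row[:"value_from_#{idx}"] = type }
    out << row
  end
end

# Made-up sample data with three sources.
first  = [{ 'column_name' => 'X', 'column_data_type' => 'NUMBER' },
          { 'column_name' => 'Y', 'column_data_type' => 'VARCHAR' }]
second = [{ 'column_name' => 'X', 'column_data_type' => 'NUMBER' },
          { 'column_name' => 'Y', 'column_data_type' => 'NUMBER' }]
third  = [{ 'column_name' => 'X', 'column_data_type' => 'NUMBER' },
          { 'column_name' => 'Y', 'column_data_type' => 'TIMESTAMP' }]

many = compare_many(first, second, third)
p many
```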

SAY GOODBYE
#4 · 2019-07-15 04:43

Code

def convert(actual, expected)
  hashify(actual-expected, "actual_data_type").
  merge(hashify(expected-actual, "expected_data_type")) { |_,a,e| a.merge(e) }.values
end

def hashify(arr, key)
  arr.each_with_object({}) { |g,h| h[g["column_name"]] =
    { "column_name"=>g["column_name"], key=>g["column_data_type"] } }
end

Example

actual = [
  {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
  {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
  {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
  {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
]

expected = [
  {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
  {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
  {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
  {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
]

convert(actual, expected)
  #=> [{"column_name"=>"TRANSACTIONDATE",
  #     "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
  #    {"column_name"=>"NONINTERESTEXPENSE",
  #     "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}] 

Explanation

For the example above the steps are as follows.

First, take the two array differences (actual-expected and expected-actual) and hashify each.

f = actual-expected
  #=> [{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
  #    {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}]

g = hashify(f, "actual_data_type")
  #=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
  #      "actual_data_type"=>"TIMESTAMP"},
  #    "NONINTERESTEXPENSE"=>{ "column_name"=>"NONINTERESTEXPENSE",
  #      "actual_data_type"=>"VARCHAR"}}

h = expected-actual
  #=> [{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
  #    {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"}]

i = hashify(h, "expected_data_type")
  #=> {"NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
  #      "expected_data_type"=>"NUMBER"},
  #    "TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
  #      "expected_data_type"=>"NUMBER"}}

Next merge g and i using the form of Hash#merge that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for the definitions of the three block variables (the first of which, the common key, I've represented by an underscore to signify that it is not used in the block calculation).

j = g.merge(i) { |_,a,e| a.merge(e) }
  #=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
  #      "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
  #    "NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
  #      "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}}

Lastly, drop the keys.

k = j.values
  #=> [{"column_name"=>"TRANSACTIONDATE", "actual_data_type"=>"TIMESTAMP",
  #     "expected_data_type"=>"NUMBER"},
  #    {"column_name"=>"NONINTERESTEXPENSE", "actual_data_type"=>"VARCHAR",
  #     "expected_data_type"=>"NUMBER"}]
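
One case worth checking is a column that exists in only one of the arrays. The merge block only runs for keys present in both hashes, so such a row comes out with just one of the two type keys. The sketch below repeats convert and hashify so that it runs on its own; the two-column sample data and the variable names are made up for the illustration.

```ruby
def hashify(arr, key)
  arr.each_with_object({}) do |g, h|
    h[g["column_name"]] = { "column_name" => g["column_name"], key => g["column_data_type"] }
  end
end

def convert(actual, expected)
  hashify(actual - expected, "actual_data_type")
    .merge(hashify(expected - actual, "expected_data_type")) { |_, a, e| a.merge(e) }
    .values
end

# Made-up data: column B exists only in actual_cols.
actual_cols = [
  { "column_name" => "A", "column_data_type" => "NUMBER" },
  { "column_name" => "B", "column_data_type" => "VARCHAR" }
]
expected_cols = [
  { "column_name" => "A", "column_data_type" => "NUMBER" }
]

only_in_actual = convert(actual_cols, expected_cols)
p only_in_actual
```

Callers that need both keys on every row would have to fill in the missing one themselves, for example with nil.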
再贱就再见
#5 · 2019-07-15 04:45

What about this?

# Named find_column so that it does not shadow Kernel#select.
def find_column(hashes_array, column_name)
  hashes_array.find { |h| h["column_name"] == column_name }
end

diff = (expected - actual).map do |h|
  {
    "column_name" => h["column_name"],
    "expected_column_data_type" => find_column(expected, h["column_name"])["column_data_type"],
    "actual_column_data_type" => find_column(actual, h["column_name"])["column_data_type"],
  }
end

PS: this code can surely be made more elegant.

萌系小妹纸
#6 · 2019-07-15 04:48
(expected - actual).
  concat(actual - expected).
  group_by { |column| column['column_name'] }.
  map do |name, (expected, actual)|
    {
      'column_name'               => name,
      'expected_column_data_type' => expected['column_data_type'],
      'actual_column_data_type'   => actual['column_data_type'],
    }
  end
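
The destructuring |name, (expected, actual)| depends on the expected rows being concatenated first and on every differing column appearing in both arrays. A sketch that tags each row with its side makes that explicit. The name diff_by_side and the sample data are hypothetical, and a column present on only one side gets nil for the other type.

```ruby
# Hypothetical variant: tag each differing row with the side it came from.
def diff_by_side(expected, actual)
  tagged = (expected - actual).map { |col| [:expected, col] } +
           (actual - expected).map { |col| [:actual, col] }

  tagged.group_by { |_side, col| col['column_name'] }.map do |name, pairs|
    # { expected: type, actual: type }, with a side left out when it has no row
    types = pairs.map { |side, col| [side, col['column_data_type']] }.to_h
    { 'column_name'               => name,
      'expected_column_data_type' => types[:expected],
      'actual_column_data_type'   => types[:actual] }
  end
end

# Made-up data: A differs in type, C exists only in actual_side.
expected_side = [
  { 'column_name' => 'A', 'column_data_type' => 'NUMBER' },
  { 'column_name' => 'B', 'column_data_type' => 'NUMBER' }
]
actual_side = [
  { 'column_name' => 'A', 'column_data_type' => 'VARCHAR' },
  { 'column_name' => 'B', 'column_data_type' => 'NUMBER' },
  { 'column_name' => 'C', 'column_data_type' => 'TIMESTAMP' }
]

side_diff = diff_by_side(expected_side, actual_side)
p side_diff
```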