
Question:

I have two arrays of hashes:

actual = [{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
 {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
 {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
 {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}]
expected = [{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
 {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
 {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
 {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}]

I need to compare these two arrays and find the entries whose column_data_type differs.

To compare them, we can directly use:

diff = actual - expected

This will print the output as:

{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}
{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"}
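As a small illustration of why the subtraction yields only the mismatched rows: Array#- drops every element of the left array that has an equal element in the right array, and two hashes are equal when they hold the same key/value pairs. The sketch below uses shortened, made-up column names (A and B) rather than the real data, and also shows that the result depends on which array is on the left.

```ruby
# Shortened, hypothetical data standing in for the arrays in the question.
actual   = [{ "column_name" => "A", "column_data_type" => "NUMBER" },
            { "column_name" => "B", "column_data_type" => "VARCHAR" }]
expected = [{ "column_name" => "A", "column_data_type" => "NUMBER" },
            { "column_name" => "B", "column_data_type" => "NUMBER" }]

# Rows of actual that have no identical row in expected.
only_in_actual = actual - expected

# Rows of expected that have no identical row in actual.
only_in_expected = expected - actual

p only_in_actual
p only_in_expected
```

Each direction carries only one side of the mismatch, which is why a single subtraction cannot produce both data types.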

What I want instead is a result that shows both the actual and the expected data type for each mismatched `column_name`, taken from the two arrays of hashes, something like:

{"column_name"=>"NONINTERESTEXPENSE", "expected_column_data_type"=>"NUMBER", "actual_column_data_type" => "VARCHAR"}
{"column_name"=>"TRANSACTIONDATE", "expected_column_data_type"=>"NUMBER","actual_column_data_type" => "TIMESTAMP" }

Answer 1:

(expected - actual).
  concat(actual - expected).
  group_by { |column| column['column_name'] }.
  map do |name, (expected, actual)|
    {
      'column_name'               => name,
      'expected_column_data_type' => expected['column_data_type'],
      'actual_column_data_type'   => actual['column_data_type'],
    }
  end

Answer 2:

This works irrespective of the order of the hashes in your arrays.

diff = []

expected.each do |elem|
  column_name = elem['column_name']
  column_type = elem['column_data_type']
  match = actual.detect { |elem2| elem2['column_name'] == column_name  }
  if column_type != match['column_data_type']
    diff << { 'column_name' => column_name,
              'expected_column_data_type' => column_type,
              'actual_column_data_type' => match['column_data_type'] }
  end
end

p diff
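The loop above assumes every expected column also exists in actual; if one is missing, match is nil and match['column_data_type'] raises NoMethodError. A possible variant, offered only as a sketch: index actual by column name first, and report a missing column with a nil actual type. Reporting nil is an assumption about the desired behaviour, not something the question specifies. The block form of to_h needs Ruby 2.6 or later.

```ruby
# Sample data: UPDATEDATE is deliberately absent from actual.
actual = [
  { "column_name" => "TRANSACTIONDATE",    "column_data_type" => "TIMESTAMP" },
  { "column_name" => "NONINTERESTEXPENSE", "column_data_type" => "VARCHAR" }
]
expected = [
  { "column_name" => "NONINTERESTEXPENSE", "column_data_type" => "NUMBER" },
  { "column_name" => "TRANSACTIONDATE",    "column_data_type" => "NUMBER" },
  { "column_name" => "UPDATEDATE",         "column_data_type" => "TIMESTAMP" }
]

# Index actual by column name so each lookup is a single hash access.
actual_types = actual.to_h { |col| [col["column_name"], col["column_data_type"]] }

diff = expected.each_with_object([]) do |col, rows|
  name        = col["column_name"]
  actual_type = actual_types[name] # nil when the column is missing from actual
  next if actual_type == col["column_data_type"]

  rows << { "column_name"               => name,
            "expected_column_data_type" => col["column_data_type"],
            "actual_column_data_type"   => actual_type }
end

p diff
```

The lookup hash also avoids rescanning actual with detect for every expected column.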

Answer 3:

[actual, expected].map { |a| a.map(&:dup).map(&:values) }
                  .map(&Hash.method(:[]))
                  .reduce do |actual, expected|
                    actual.merge(expected) do |k, o, n|
                      o == n ? nil : {name: k, actual: o, expected: n}
                    end
                  end.values.compact

#⇒ [
#    [0] {
#            :name => "NONINTERESTEXPENSE",
#          :actual => "VARCHAR",
#        :expected => "NUMBER"
#    },
#    [1] {
#            :name => "TRANSACTIONDATE",
#          :actual => "TIMESTAMP",
#        :expected => "NUMBER"
#    }
# ]

The method above is easily extended to merge N arrays (use reduce.with_index and merge with the key "value_from_#{idx}").
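One possible reading of that N-array idea, as a rough sketch rather than the author's code: it uses each_with_index instead of reduce.with_index, and the helper name diff_many and the "name" key are made up for illustration.

```ruby
# Hypothetical helper: compares any number of arrays of column hashes.
def diff_many(*arrays)
  merged = {}
  arrays.each_with_index do |columns, idx|
    columns.each do |column|
      name = column["column_name"]
      # Collect each array's type under a "value_from_<idx>" key.
      row = (merged[name] ||= { "name" => name })
      row["value_from_#{idx}"] = column["column_data_type"]
    end
  end
  # Keep only the columns whose collected types are not all identical.
  merged.values.select do |row|
    row.reject { |key, _| key == "name" }.values.uniq.size > 1
  end
end

actual = [
  { "column_name" => "NONINTERESTINCOME",  "column_data_type" => "NUMBER" },
  { "column_name" => "NONINTERESTEXPENSE", "column_data_type" => "VARCHAR" },
  { "column_name" => "TRANSACTIONDATE",    "column_data_type" => "TIMESTAMP" },
  { "column_name" => "UPDATEDATE",         "column_data_type" => "TIMESTAMP" }
]
expected = [
  { "column_name" => "NONINTERESTINCOME",  "column_data_type" => "NUMBER" },
  { "column_name" => "NONINTERESTEXPENSE", "column_data_type" => "NUMBER" },
  { "column_name" => "TRANSACTIONDATE",    "column_data_type" => "NUMBER" },
  { "column_name" => "UPDATEDATE",         "column_data_type" => "TIMESTAMP" }
]

result = diff_many(actual, expected)
p result
```

In this sketch a column that appears in only one array collects a single value and is therefore not reported; whether that is desirable depends on the use case.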

Answer 4:

What about this?

# Named find_column rather than select so that it does not shadow Kernel#select.
def find_column(hashes_array, column_name)
  # Return the first hash whose "column_name" matches, or nil if there is none.
  hashes_array.find { |h| h["column_name"] == column_name }
end

diff = (expected - actual).map do |h|
  {
    "column_name" => h["column_name"],
    "expected_column_data_type" => find_column(expected, h["column_name"])["column_data_type"],
    "actual_column_data_type" => find_column(actual, h["column_name"])["column_data_type"],
  }
end

PS: This code can surely be made more elegant.

Answer 5:

Code

def convert(actual, expected)
  hashify(actual-expected, "actual_data_type").
  merge(hashify(expected-actual, "expected_data_type")) { |_,a,e| a.merge(e) }.values
end

def hashify(arr, key)
  arr.each_with_object({}) { |g,h| h[g["column_name"]] =
    { "column_name"=>g["column_name"], key=>g["column_data_type"] } }
end

Example

actual = [
  {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
  {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
  {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
  {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
]

expected = [
  {"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
  {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
  {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
  {"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
]

convert(actual, expected)
  #=> [{"column_name"=>"TRANSACTIONDATE",
  #     "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
  #    {"column_name"=>"NONINTERESTEXPENSE",
  #     "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}]

Explanation

For the example above the steps are as follows.

First, compute the difference in each direction (actual-expected and expected-actual) and hashify each result.

f = actual-expected
  #=> [{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
  #    {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}]

g = hashify(f, "actual_data_type")
  #=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
  #      "actual_data_type"=>"TIMESTAMP"},
  #    "NONINTERESTEXPENSE"=>{ "column_name"=>"NONINTERESTEXPENSE",
  #      "actual_data_type"=>"VARCHAR"}}

h = expected-actual
  #=> [{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
  #    {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"}]

i = hashify(h, "expected_data_type")
  #=> {"NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
  #      "expected_data_type"=>"NUMBER"},
  #    "TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
  #      "expected_data_type"=>"NUMBER"}}

Next, merge g and i using the form of Hash#merge that takes a block to determine the values of keys that are present in both hashes being merged. See the Hash#merge documentation for the definitions of the three block variables (the first of which, the common key, I've represented by an underscore to signify that it is not used in the block calculation).

j = g.merge(i) { |_,a,e| a.merge(e) }
  #=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
  #      "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
  #    "NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
  #      "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}}
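To make the block form of Hash#merge concrete, here is a minimal sketch with made-up keys A and B (not the data above). The block runs only for keys present in both hashes; a key found in just one of them is copied over unchanged, so such an entry would end up without the other side's data type.

```ruby
# Hypothetical inputs: "B" exists only on the actual side.
g = { "A" => { "column_name" => "A", "actual_data_type" => "VARCHAR" },
      "B" => { "column_name" => "B", "actual_data_type" => "NUMBER" } }
i = { "A" => { "column_name" => "A", "expected_data_type" => "NUMBER" } }

block_keys = []
j = g.merge(i) do |key, a, e|
  block_keys << key # record which keys reached the block
  a.merge(e)        # combine the two inner hashes for a shared key
end

p block_keys
p j
```

Here only "A" reaches the block, and the entry for "B" keeps just its actual_data_type.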

Lastly, drop the keys.

k = j.values
  #=> [{"column_name"=>"TRANSACTIONDATE", "actual_data_type"=>"TIMESTAMP",
  #     "expected_data_type"=>"NUMBER"},
  #    {"column_name"=>"NONINTERESTEXPENSE", "actual_data_type"=>"VARCHAR",
  #     "expected_data_type"=>"NUMBER"}]