Most efficient way to format a hash of data?

Posted 2019-05-18 12:06

Question:

I am using an array of hashes to do a batch insert into a MongoDB database. Each hash was populated by parsing a text file, so the fields arrive in unpredictable formats. A record might look something like:

{:date => "March 5", :time => "05:22:21", :first_name => "John", :middle_initial => "JJ", ...}

I would also have a series of formatting functions, perhaps:

def format_date(value)
  # ..convert if needed..
end

def format_time(value)
  # ...
end

How would I go about calling the formatting functions on the various fields? I could see doing some kind of lambda call where I iterate through each hash and call a format_<field_name> function, but not all fields will have a formatting function. For instance, first_name above wouldn't need one. Any ideas?

Answer 1:

Here's one idea, pretty similar to what you described. You could use an identity function for the fields you don't want to format:

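# Identity function for fields that need no formatting.
def pass(x)
  x
end

# Map each field to its formatter; every other field falls back to pass.
method_hash = {:date => method(:your_format_date)}
method_hash.default = method(:pass)

x = {:date => "March 5", :time => "05:22:21", :first_name => "John", :middle_initial => "JJ"}
formatted = x.reduce({}) { |hsh, (key, value)| hsh[key] = method_hash[key].call(value); hsh }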
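
A self-contained sketch of this approach, for experimenting: format_date below is a hypothetical stand-in for your own formatter, and FORMATTERS, identity, and format_record are names made up for the example. The sample date carries a year so the result does not depend on when the code runs.

```ruby
require 'date'

# Hypothetical formatter: normalizes a date string to ISO 8601.
def format_date(value)
  Date.parse(value).strftime('%Y-%m-%d')
end

# Identity function for fields that need no formatting.
def identity(value)
  value
end

# Field name => Method object; unknown fields fall back to identity.
FORMATTERS = { date: method(:format_date) }
FORMATTERS.default = method(:identity)

# Builds a new hash, leaving the input record untouched.
def format_record(record)
  record.each_with_object({}) do |(key, value), out|
    out[key] = FORMATTERS[key].call(value)
  end
end

record = { date: 'March 5, 2011', time: '05:22:21', first_name: 'John' }
formatted = format_record(record)
```

For a batch, records.map { |r| format_record(r) } applies the same lookup to every hash.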


Answer 2:

Just keep a list of the keys that you do want to handle. You could even tie it to the transformation functions with a Hash:

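# "whatever" is a placeholder for the real conversion.
transformations = {
  :date => lambda {|date| whatever},
  :time => lambda {|time| whatever}
}
# Fields without an entry pass through unchanged.
transformations.default = lambda {|v| v}

data.map do |hash|
  # Hash[] needs [key, value] pairs, so keep the key next to the transformed value.
  Hash[ hash.map {|key, val| [key, transformations[key][val]] } ]
end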
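
As a variation, here is a minimal sketch that looks up each transformation with Hash#fetch and an identity fallback instead of setting a default on the hash. The two lambdas (trimming seconds from the time, keeping one letter of the middle initial) are assumptions chosen only to make the example concrete, as are the names TRANSFORMATIONS, IDENTITY, and transform_all.

```ruby
# Hypothetical transformations, keyed by field name.
TRANSFORMATIONS = {
  time: ->(t) { t.split(':').first(2).join(':') }, # keep hours and minutes
  middle_initial: ->(s) { s[0] }                   # keep the first letter
}.freeze

# Fallback for fields that have no transformation.
IDENTITY = ->(v) { v }

# Returns new hashes; the input records are not modified.
def transform_all(records)
  records.map do |record|
    record.map { |key, val| [key, TRANSFORMATIONS.fetch(key, IDENTITY).call(val)] }.to_h
  end
end

data = [
  { date: 'March 5', time: '05:22:21', first_name: 'John', middle_initial: 'JJ' }
]
result = transform_all(data)
```

Using fetch keeps the lookup table frozen and makes the fallback explicit at the call site, at the cost of repeating it wherever the table is used.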


Answer 3:

Make use of Ruby's singleton class (also called the eigenclass): extend each hash with a formatter module, and the one-liner at the end solves your problem:

require 'date'

module Formatter
  def format_date
    Date.parse(self[:date]).strftime('%Y-%m-%d')
  end

  def format_time
    self[:time].split(':')[0,2].join('-')
  end

  def format_first_name
    self[:first_name].upcase
  end

  def format
    {:date => format_date, :time => format_time, :first_name => format_first_name, :last_name => self[:last_name]}
  end
end

records = [
  {:date => 'March 05', :time => '12:13:00', :first_name => 'Wes', :last_name => 'Bailey'},
  {:date => 'March 06', :time => '09:15:11', :first_name => 'Joe', :last_name => 'Buck'},
  {:date => 'March 07', :time => '18:35:48', :first_name => 'Troy', :last_name => 'Aikmen'},
]

records.map {|h| h.extend(Formatter).format}
=> [{:date=>"2011-03-05", :time=>"12-13", :first_name=>"WES", :last_name=>"Bailey"},
 {:date=>"2011-03-06", :time=>"09-15", :first_name=>"JOE", :last_name=>"Buck"},
 {:date=>"2011-03-07", :time=>"18-35", :first_name=>"TROY", :last_name=>"Aikmen"}] 
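
Two things to keep in mind with this approach: Date.parse fills in the current year when the string has none, so the years in the output depend on when the code runs, and format returns only the keys it lists. The sketch below is one possible variant, not the answer's own method: it passes the year explicitly, leaves the record hashes unextended, and copies unlisted keys through with Hash#merge. RecordFormatter, format_record, and the year: keyword are hypothetical names.

```ruby
require 'date'

module RecordFormatter
  module_function

  # Parse "March 05" against an explicit year so the result does not
  # depend on the date the script is run.
  def format_date(value, year)
    Date.strptime("#{value} #{year}", '%B %d %Y').strftime('%Y-%m-%d')
  end

  # "12:13:00" -> "12-13", same rule as in the answer above.
  def format_time(value)
    value.split(':')[0, 2].join('-')
  end

  # Returns a new hash; keys without a formatter are copied through unchanged.
  def format_record(record, year:)
    record.merge(
      date: format_date(record[:date], year),
      time: format_time(record[:time]),
      first_name: record[:first_name].upcase
    )
  end
end

records = [
  { date: 'March 05', time: '12:13:00', first_name: 'Wes', last_name: 'Bailey' }
]
formatted = records.map { |r| RecordFormatter.format_record(r, year: 2011) }
```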


Answer 4:

class Formatters
    def self.time(value)
        "FORMATTED TIME"
    end

    def self.date(value)
        "FORMATTED DATE"
    end

    def self.method_missing(name, arg)
        arg
    end
end

your_data = [{:date => "March 5", :time => "05:22:21", :first_name => "John", :middle_initial => "JJ"},
             {:date => "March 6", :time => "05:22:22", :first_name => "Peter", :middle_initial => "JJ"},
             {:date => "March 7", :time => "05:22:23", :first_name => "Paul", :middle_initial => "JJ"}]

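formatted_data = your_data.map do |item|
    # Build the hash from [key, value] pairs directly; splatting a flattened
    # array would break on values that are themselves arrays.
    Hash[ item.map { |k, v| [k, Formatters.send(k, v)] } ]
end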
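
One caveat with dispatching through send: a key that matches an existing class method never reaches method_missing. For example, a :name key would call Module#name with an argument and raise ArgumentError. A possible workaround, sketched under the assumption that formatters are named format_<key> (SafeFormatters and apply are hypothetical names), is to check respond_to? before dispatching:

```ruby
class SafeFormatters
  def self.format_time(value)
    'FORMATTED TIME'
  end

  def self.format_date(value)
    'FORMATTED DATE'
  end

  # Dispatch only to public methods named format_<key>;
  # every other field is returned unchanged.
  def self.apply(key, value)
    name = "format_#{key}"
    respond_to?(name) ? public_send(name, value) : value
  end
end

your_data = [
  { date: 'March 5', time: '05:22:21', name: 'John' }
]

formatted_data = your_data.map do |item|
  item.map { |k, v| [k, SafeFormatters.apply(k, v)] }.to_h
end
```

The prefix keeps field names from colliding with methods every class already has, and no method_missing override is needed.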