Removing whitespace in a CSV file

Posted 2019-04-04 15:03

I have a string with extra whitespace:

First,Last,Email  ,Mobile Phone ,Company,Title  ,Street,City,State,Zip,Country, Birthday,Gender ,Contact Type

I want to parse this line and remove the extra whitespace.

My code looks like:

namespace :db do
  task :populate_contacts_csv => :environment do
    require 'csv'

    csv_text = File.read('file_upload_example.csv')
    csv = CSV.parse(csv_text, :headers => true)
    csv.each do |row|
      puts "First Name: #{row['First']} \nLast Name: #{row['Last']} \nEmail: #{row['Email']}"
    end
  end
end

3 answers
何必那么认真
Reply #2 · 2019-04-04 15:31

CSV supports "converters" for the headers and fields, which let you get inside the data before it's passed to your each loop.

Writing a sample CSV file:

csv = "First,Last,Email  ,Mobile Phone ,Company,Title  ,Street,City,State,Zip,Country, Birthday,Gender ,Contact Type
first,last,email  ,mobile phone ,company,title  ,street,city,state,zip,country, birthday,gender ,contact type
"
File.write('file_upload_example.csv', csv)

Here's how I'd do it:

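require 'csv'
csv = CSV.open('file_upload_example.csv', :headers => true)
# Register a strip converter for both the fields and the headers
[:convert, :header_convert].each { |c| csv.send(c) { |f| f.strip } }

csv.each do |row|
  # Quote each value so any leftover whitespace would be visible
  puts "First Name: '#{row['First']}' \nLast Name: '#{row['Last']}' \nEmail: '#{row['Email']}'"
end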

Which outputs:

First Name: 'first'
Last Name: 'last'
Email: 'email'

The converters simply strip leading and trailing whitespace from each header and each field as they're read from the file.
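
To make the effect concrete, here is a minimal sketch that uses an in-memory string instead of the file. The names data, raw, clean and strip are only for illustration, and the converter is written nil-safe as a precaution. Without a header converter the header keeps its trailing spaces, so looking up 'Email' finds nothing:

```ruby
require 'csv'

data = "First,Last,Email  \nfirst,last,email  \n"

# Without converters the header is literally "Email  ", so 'Email' is not found.
raw = CSV.parse(data, headers: true)
raw_row = raw.first
p raw_row['Email']    # nil
p raw_row['Email  ']  # "email  "

# With strip converters, both the header and the field are cleaned.
strip = ->(f) { f ? f.strip : f }
clean = CSV.parse(data, headers: true, header_converters: strip, converters: strip)
clean_row = clean.first
p clean_row['Email']  # "email"
```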

Also, as a programming design choice, don't read your file into memory using:

csv_text = File.read('file_upload_example.csv')

Then parse it:

csv = CSV.parse(csv_text, :headers => true)

Then loop over it:

csv.each do |row|

Ruby's IO system supports enumerating over a file line by line. Once my code calls CSV.open, the file is readable, and each reads one line at a time, so the entire file never needs to be in memory at once. Loading the whole file isn't scalable (though on new machines it's becoming a lot more reasonable), and, if you test, you'll find that reading a file using each is extremely fast, probably about as fast as reading it, parsing it, then iterating over the parsed file.
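
As a rough sketch of the same row-at-a-time idea, CSV.foreach opens the file and yields each row in turn. The temporary file, its contents and the emails array are assumptions made up so the example runs on its own:

```ruby
require 'csv'
require 'tempfile'

# Create a small throwaway file so the sketch is self-contained.
file = Tempfile.new(['contacts', '.csv'])
file.write("First,Last,Email  \nfirst,last,email  \nsecond,last2,email2 \n")
file.close

strip = ->(f) { f ? f.strip : f }
emails = []

# CSV.foreach yields one row at a time instead of loading the whole file.
CSV.foreach(file.path, headers: true, header_converters: strip, converters: strip) do |row|
  emails << row['Email']
end

p emails  # ["email", "email2"]
file.unlink
```

Only the collected emails stay in memory here; each row can be discarded as soon as the block returns.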

成全新的幸福
Reply #3 · 2019-04-04 15:32
@prices = CSV.parse(IO.read('prices.csv'), :headers=>true, 
   :header_converters=> lambda {|f| f.strip},
   :converters=> lambda {|f| f ? f.strip : nil})

The nil test is applied in the field converter but not in the header converter, on the assumption that headers are never nil while data fields might be, and nil doesn't have a strip method. I'm really surprised that, AFAIK, :strip is not a predefined converter!
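
A small sketch of why the guard matters, assuming a made-up two-column input in which the last field is left empty. CSV returns nil for an empty unquoted field, and the guarded converter passes it through untouched:

```ruby
require 'csv'

# The second data field is empty, which CSV parses as nil rather than "".
data = "Name , Price\nwidget ,\n"

prices = CSV.parse(data, headers: true,
                   header_converters: ->(f) { f.strip },
                   converters: ->(f) { f ? f.strip : nil })

row = prices.first
p row['Name']   # "widget"
p row['Price']  # nil
```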

在下西门庆
Reply #4 · 2019-04-04 15:45

You can strip your hash first:

csv.each do |unstripped_row|
  row = {}
  # Empty fields are nil, and nil has no strip method, so guard the value
  unstripped_row.each { |k, v| row[k.strip] = v ? v.strip : v }
  puts "First Name: #{row['First']} \nLast Name: #{row['Last']} \nEmail: #{row['Email']}"
end

Edited to strip hash keys too
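
The same idea could be wrapped in a small helper. This is only a sketch: stripped_hash is a hypothetical name, not part of the CSV library, and the sample input is made up:

```ruby
require 'csv'

# Hypothetical helper: returns a plain Hash with stripped keys and values.
# Values that cannot be stripped (such as nil) are passed through unchanged.
def stripped_hash(row)
  row.to_h.each_with_object({}) do |(key, value), result|
    result[key.to_s.strip] = value.respond_to?(:strip) ? value.strip : value
  end
end

csv = CSV.parse("First , Last\n jane ,\n", headers: true)
contact = stripped_hash(csv.first)
p contact
```

The result is an ordinary Hash, so the lookup by 'First' or 'Last' works without the padding that was in the file.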
