Ubuntu 12.04 LTS
Ruby 1.9.3dev (2011-09-23 revision 33323) [i686-linux]
Rails 3.2.9
The following is the content of the CSV file I received:
"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total"
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80"
However, when I try to parse the CSV file I get an error:
1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' }
=> {:col_sep=>",", :quote_char=>"\""}
1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):22
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'
Then I tried simplifying the data, i.e.:
"name","age","email"
"jignesh","30","jignesh@example.com"
However, I still get the same error:
1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
from (irb):23
from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'
Again I tried simplifying the data like this:
name,age,email
jignesh,30,jignesh@example.com
and it works. See the output below:
1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row }
name
age
email
jignesh
30
jignesh@example.com
=> nil
But I will be receiving CSV files with quoted data, so removing the quotes is not the solution I am looking for. I am unable to figure out what is causing the error CSV::MalformedCSVError: Illegal quoting in line 1. in my earlier examples.
I have verified that the CSV has no leading/trailing spaces by enabling "Show whitespace characters" and "Show Line Endings" in my text editor. I have also verified the encoding as follows:
1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding
=> #<Encoding:UTF-8>
Note: I tried using CSV.read too, but got the same error with that method.
Can anybody please help me solve this problem and understand what is going wrong?
=====================
I just found the following post: http://www.ruby-forum.com/topic/448070 and tried the following:
file_data = file.read
file_data.gsub!('"', "'")
arr_of_arrs = CSV.parse(file_data)
arr_of_arrs.each do |arr|
  Rails.logger.debug "=======#{arr}"
end
and got the following output:
=======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"]
=======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"]
This breaks the parsing: the single quotes are not treated as quote characters, so the comma inside the date field is taken as a column separator (the default col_sep is a comma). I then tried the quote_char option like this:
arr_of_arrs = CSV.parse(file_data, :quote_char => "'")
but it ended up with the following error:
CSV::MalformedCSVError (Illegal quoting in line 1.):
Thanks, Jignesh
Anand, thank you for the encoding suggestion. This solved the illegal quoting problem for me.
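The first field in the question's debug output starts with \xEF\xBB\xBF, which is the UTF-8 byte order mark (BOM). The original encoding suggestion did not survive in this thread, so the following is only a minimal sketch of one common way to deal with a BOM, assuming a Ruby version whose CSV library accepts the encoding: option. The helper names utf8_bom? and read_csv_skipping_bom and the sample file are made up for illustration.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: true when the file starts with the UTF-8 BOM bytes EF BB BF.
def utf8_bom?(path)
  File.binread(path, 3).bytes.to_a == [0xEF, 0xBB, 0xBF]
end

# Hypothetical helper: "bom|utf-8" asks Ruby to skip a leading BOM, if present,
# and then decode the rest of the file as UTF-8.
def read_csv_skipping_bom(path)
  CSV.read(path, encoding: "bom|utf-8")
end

# Made-up sample file that begins with a BOM, like the file in the question.
sample = Tempfile.new(["bom_sample", ".csv"])
sample.binmode
sample.write("\xEF\xBB\xBF\"name\",\"age\",\"email\"\n\"jignesh\",\"30\",\"jignesh@example.com\"\n")
sample.close

puts utf8_bom?(sample.path)
read_csv_skipping_bom(sample.path).each { |row| p row }
```

Checking the first three bytes is useful because File#read.encoding only reports the encoding Ruby assumed for the string; it does not reveal whether a BOM is sitting in front of the first quote character.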
Note: If you want the iterator to skip over the header row, add headers: :first_row to the options.
Try this hint:
I attempted to read the file into a string and then parse the string into a CSV table, but received an exception:
None of the answers provided here worked for me. In fact, the one with highest votes was taking so long to parse that eventually I terminated the execution. It most likely was raising many exceptions, and that time is costly on a large file.
Even more problematic, the error is not so helpful, since it is a large CSV file. Where exactly is line 1794? I opened up the file in LibreOffice which opened without any problems. Line 1794 was the last row of data of the csv file. So apparently the problem had to do with the end of the CSV file. I decided to inspect the contents as a string with File.read. I noticed the string ended with a carriage return character:
I decided to use chomp to remove the carriage return at the end of the file. Note that if $/ has not been changed from the default Ruby record separator, chomp also removes carriage return characters (that is, it will remove a trailing \n, \r, or \r\n).
And it worked. The problem was the \r character at the end of the file.
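The snippets from this answer were lost, so here is a minimal sketch of the approach as described, not the original code. The data is made up and much smaller than the real file, and parse_without_trailing_newline is a hypothetical helper name. Whether the unpatched string raises may depend on the CSV library version, so the strict parse is wrapped in a rescue.

```ruby
require "csv"

# Hypothetical helper: strip one trailing record separator (\n, \r, or \r\n)
# with chomp, then parse what is left.
def parse_without_trailing_newline(text)
  CSV.parse(text.chomp)
end

# Made-up data standing in for the large file; note the stray \r at the very end.
raw = "name,age\njignesh,30\r"

# Inspecting the tail of the string makes the trailing character visible.
p raw[-5..-1]

begin
  CSV.parse(raw)
rescue CSV::MalformedCSVError => e
  puts "strict parse failed: #{e.message}"
end

p parse_without_trailing_newline(raw)
```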
I just had an issue like this and discovered that CSV does not like spaces between the col_sep and the quote character. Once I removed those, everything went fine. So I had:
but once I gsubed out the spaces using
resulting in
everything went fine.
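The before and after data from this answer did not survive, so the following is a rough sketch of the idea with made-up data. The helper name strip_padding_before_quotes and the regular expression are assumptions, not the original gsub call. The pattern is deliberately naive: it would also rewrite a comma, space, quote sequence that appears inside a quoted field.

```ruby
require "csv"

# Hypothetical helper: remove spaces or tabs that sit between a comma
# and the opening quote of the next field.
def strip_padding_before_quotes(text)
  text.gsub(/,[ \t]+"/, ',"')
end

# Made-up row with a space after each column separator.
raw = %("jignesh", "30", "jignesh@example.com"\n)

begin
  CSV.parse(raw)
rescue CSV::MalformedCSVError => e
  puts "strict parse failed: #{e.message}"
end

cleaned = strip_padding_before_quotes(raw)
p CSV.parse(cleaned)
```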
I had a problem with the trademark character that was throwing this error.
The trademark character (U+2122) turns into \"! when its UTF-16LE bytes (0x22 0x21) are read one byte at a time, rather than as UTF-8, so it was the unclosed quotation mark that was throwing the error. So I did this:
.gsub!("\"!", "")
And then I tried creating my CSV object and it worked fine.
it's not perfect but it works most of the time.
N.B. CSV.parse takes the same parameters as CSV.read, so either a file or data from memory can be used.
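As a small illustration of that point, the sketch below passes the same headers: :first_row option (mentioned in an earlier note) to both CSV.parse and CSV.read. The data and the helper names rows_from_string and rows_from_file are made up for this example.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper: parse CSV text held in memory, using the first row as headers.
def rows_from_string(text)
  CSV.parse(text, headers: :first_row).map(&:to_h)
end

# Hypothetical helper: read a CSV file from disk with the same option.
def rows_from_file(path)
  CSV.read(path, headers: :first_row).map(&:to_h)
end

csv_text = "name,age\njignesh,30\n"

file = Tempfile.new(["sample", ".csv"])
file.write(csv_text)
file.close

p rows_from_string(csv_text)
p rows_from_file(file.path)
```

With headers enabled, each data row comes back keyed by the header names, and the header row itself is not yielded as data.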