I'm trying to read in a .csv file from the IRS and it doesn't appear to be formatted in any weird way.
I'm using the read.table()
function, which I have used several times in the past but it isn't working this time; instead, I get this error:
data_0910<-read.table("/Users/blahblahblah/countyinflow0910.csv",header=T,stringsAsFactors=FALSE,colClasses="character")
Error in read.table("/Users/blahblahblah/countyinflow0910.csv", :
more columns than column names
Why is it doing this?
For reference, the .csv
files can be found at:
http://www.irs.gov/uac/SOI-Tax-Stats-County-to-County-Migration-Data-Files
(The ones I need are under the county to county migration .csv section - either inflow or outflow.)
For the Germans:
you have to change your decimal commas into a Full stop in your csv-file (in Excel:File -> Options -> Advanced -> "Decimal seperator") , then the error is solved.
It uses commas as separators. So you can either set
sep=","
or just useread.csv
:The error is caused by spaces in some of the values, and unmatched quotes. There are no spaces in the header, so
read.table
thinks that there is one column. Then it thinks it sees multiple columns in some of the rows. For example, the first two lines (header and first row):And unmatched quotes, for example on line 1336 (row 1335) which will confuse
read.table
with the defaultquote
argument (but notread.csv
):you have have strange characters in your heading # % -- or ,