I have the following sample file:
"id";"PCA0";"PCA1";"PCA2"
1;6.142741644872954;1.2075898020608253;1.8946959360032403
2;-0.5329026419681557;-8.586870627925729;4.510113575138726
When I try to read it with:
d <- read.table("file.csv", sep=";", header=TRUE)
id is an integer column, PCA0 is numeric, and all subsequent columns are factors:
class(d$id)
[1] "integer"
class(d$PCA0)
[1] "numeric"
class(d$PCA1)
[1] "factor"
class(d$PCA2)
[1] "factor"
Why aren't the other columns numeric as well?
I know how to convert the columns, but I want my script to work without manually casting the types. Why doesn't R recognize the numeric columns?
This was a change made in R 3.1.0. There has been much discussion on the R-devel list about this. Basically, if a number has too many digits to be stored as a double without losing accuracy, it is no longer converted to numeric and the column ends up as a factor. This behavior is supposed to be reverted in 3.1.1, but no release date has been set as far as I know.
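For readers on R 3.1.1 or later, the conversion rule is exposed through the numerals argument of read.table and type.convert. The sketch below assumes such a version and uses an inline copy of the sample data (the csv_text variable is only for illustration); "allow.loss" is the default there and accepts the small loss of accuracy, so all PCA columns come back numeric.

```r
# Inline copy of the sample file from the question (illustrative only)
csv_text <- '"id";"PCA0";"PCA1";"PCA2"
1;6.142741644872954;1.2075898020608253;1.8946959360032403
2;-0.5329026419681557;-8.586870627925729;4.510113575138726'

# numerals = "allow.loss" converts to double even if the last digits
# cannot be represented exactly (requires R >= 3.1.1)
d <- read.table(text = csv_text, sep = ";", header = TRUE,
                numerals = "allow.loss")

classes <- sapply(d, class)
print(classes)
```

The other accepted values are "warn.loss" and "no.loss", which warn about or refuse a lossy conversion respectively.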
As @MrFlick says: too many digits.
You can force what you want by specifying the colClasses argument.
If you really need as much precision as possible:
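A minimal sketch of both points, assuming the four-column layout from the question (the sample is written to a temporary file so the example is self-contained; d_num and d_chr are names chosen here). Declaring the columns as numeric skips the type guessing entirely, while declaring them as character keeps every digit exactly as written in the file.

```r
# Write the sample from the question to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"id";"PCA0";"PCA1";"PCA2"',
  "1;6.142741644872954;1.2075898020608253;1.8946959360032403",
  "2;-0.5329026419681557;-8.586870627925729;4.510113575138726"
), csv_path)

# Force the column types: no guessing, so no factors
d_num <- read.table(csv_path, sep = ";", header = TRUE,
                    colClasses = c("integer", "numeric", "numeric", "numeric"))
print(sapply(d_num, class))

# Keep the original text of each number if no digit may be lost
d_chr <- read.table(csv_path, sep = ";", header = TRUE,
                    colClasses = c("integer", rep("character", 3)))
print(d_chr$PCA1)
```

The character variant is only useful if the strings are later handed to something that can work at higher precision than a double; for ordinary numeric work the first call is enough.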
Then modify to the maximum precision stored:
gives:
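One possible reading of this step, offered as an assumption rather than the original snippet, is to raise the number of significant digits used when printing, since R shows only 7 by default. The value x below is taken from the sample file; 17 significant digits are enough to round-trip any double.

```r
# A value from the PCA1 column of the sample file
x <- 1.2075898020608253

# Default printing is limited to 7 significant digits
short <- format(x)

# Ask for 17 significant digits to see what is actually stored
long <- format(x, digits = 17)

print(short)
print(long)
```

The same effect can be made global with options(digits = 17), at the cost of noisier output everywhere else.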
Note: fread (from the data.table package) should be smarter and faster.
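A short sketch of the fread route, assuming the data.table package is installed and using a temporary copy of the sample file (csv_path and dt are names chosen here). fread detects the column types itself and reads the decimal columns as numeric.

```r
library(data.table)

# Write the sample from the question to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"id";"PCA0";"PCA1";"PCA2"',
  "1;6.142741644872954;1.2075898020608253;1.8946959360032403",
  "2;-0.5329026419681557;-8.586870627925729;4.510113575138726"
), csv_path)

# fread guesses the separator and types; sep is given explicitly for clarity
dt <- fread(csv_path, sep = ";")

classes <- sapply(dt, class)
print(classes)
```

Note that the result is a data.table rather than a plain data.frame; as.data.frame(dt) converts it if the rest of the script expects one.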