read.table reads numbers as factors

Posted 2020-08-23 08:48

I have the following sample file:

"id";"PCA0";"PCA1";"PCA2"
1;6.142741644872954;1.2075898020608253;1.8946959360032403   
2;-0.5329026419681557;-8.586870627925729;4.510113575138726

When I try to read it with:

d <- read.table("file.csv", sep = ";", header = TRUE)

id is an integer column, PCA0 is numeric, and all subsequent columns are factors:

class(d$id)
[1] "integer"
class(d$PCA0)
[1] "numeric"
class(d$PCA1)
[1] "factor"
class(d$PCA2)
[1] "factor"

Why aren't the other columns numeric as well?

I know how to convert the columns, but I want my script to work without manually casting the types. Why doesn't R recognize the numeric columns?

Tags: r

2 answers

老娘就宠你

#2 · 2020-08-23 09:26

This was a change made in R 3.1.0. There has been much discussion on the R-devel list about this. Basically, if a number has too many digits to be stored as a double without losing accuracy, it is not treated as numeric and is converted to a factor instead. This behavior is supposed to be reverted in 3.1.1, but no release date has been set as far as I know.
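In later R versions (3.1.1 onward, as far as I know), read.table and type.convert take a `numerals` argument that controls this conversion. The following is a minimal sketch, with the question's data inlined through `text =` so it runs without a file. The names `txt`, `d_allow` and `d_strict` are only for illustration, and which columns end up non-numeric under "no.loss" may depend on the R version.

```r
# Sample data from the question, inlined so the sketch is self-contained.
txt <- '"id";"PCA0";"PCA1";"PCA2"
1;6.142741644872954;1.2075898020608253;1.8946959360032403
2;-0.5329026419681557;-8.586870627925729;4.510113575138726'

# Default, numerals = "allow.loss": long decimals are read as doubles,
# even if some trailing digits cannot be represented.
d_allow <- read.table(text = txt, sep = ";", header = TRUE)
sapply(d_allow, class)

# numerals = "no.loss": values whose conversion to double would lose
# accuracy are kept as text (factor or character) instead of numeric.
d_strict <- read.table(text = txt, sep = ";", header = TRUE,
                       numerals = "no.loss")
sapply(d_strict, class)
```

With the default setting, every PCA column should come back as numeric, which is what the question expected.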


Rolldiameter

#3 · 2020-08-23 09:29

As @MrFlick says: too many digits.

You can force the types you want by specifying the colClasses argument:

read.table("test.csv",
           sep = ";",
           header = TRUE,
           colClasses = c("integer", "numeric", "numeric", "numeric"))
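To check that colClasses has the intended effect without depending on a file on disk, the same call can be run on inlined data. This is a small sketch; `txt` and `d` are illustrative names, and the data are the two rows from the question.

```r
# Sample data from the question, inlined so the sketch is self-contained.
txt <- '"id";"PCA0";"PCA1";"PCA2"
1;6.142741644872954;1.2075898020608253;1.8946959360032403
2;-0.5329026419681557;-8.586870627925729;4.510113575138726'

# One class per column, in file order: an integer id, then three doubles.
d <- read.table(text = txt, sep = ";", header = TRUE,
                colClasses = c("integer", rep("numeric", 3)))

# Inspect the resulting column classes.
sapply(d, class)
```

Because the classes are given explicitly, read.table does not have to guess the types, so the result does not depend on how a particular R version handles long decimals.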

If you really need as much precision as possible:

library(data.table)
d <- fread("test.csv")

Then format the columns to show the maximum precision stored (note that sprintf turns them into character strings):

d[,PCA0 := sprintf("%.15E",PCA0)]
d[,PCA1 := sprintf("%.15E",PCA1)]
d[,PCA2 := sprintf("%.15E",PCA2)]

This gives:

> d
   id                   PCA0                   PCA1                  PCA2
1:  1  6.142741644872954E+00  1.207589802060825E+00    1.8946959360032403   
2:  2 -5.329026419681557E-01 -8.586870627925729E+00     4.510113575138726
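One thing to keep in mind with the sprintf approach is that it returns strings, so the formatted columns are no longer numeric. If the goal is only to see more digits, a base R sketch like the following keeps the values as doubles and changes only how they are displayed. The vector `x` simply reuses the two PCA0 values from the question.

```r
# Two values taken from the PCA0 column of the question.
x <- c(6.142741644872954, -0.5329026419681557)

# sprintf() formats the numbers but returns a character vector.
s <- sprintf("%.15E", x)
class(s)

# x itself is unchanged and still numeric; only the display differs here.
class(x)
print(x, digits = 17)
```

Setting `options(digits = 17)` would have a similar display effect for the whole session.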

Note: fread should be smarter and faster.

