read.csv falsely converts strings to integers

Posted 2020-04-17 02:35

Question:

I would like to read a csv file in which some columns contain strings of digits (string variables). The values in the csv file are quoted (""), so they are easily identifiable as strings, but for some reason they end up as integers in my data.frame.

Here is the head of the csv file:

"task","itemnr","respnr","checked","solution","score","userid","filenr","timestamp","swmClicks","swmRT"
"swm",1,"E1","010010010","000111000",0,"77279","77279","2017-02-14T12:58:56.457+0430",3,13.0379998683929
"swm",10,"E1","011001000","011001000",1,"77279","77279","2017-02-14T13:01:50.717+0430",6,20.4059998989105

The problem is with the 4th and 5th columns.

This is the code I use. Is anything wrong with it?

datSwm <- read.csv("datSwm.csv", header=TRUE, stringsAsFactors=FALSE, quote='\"')

Answer 1:

Try this:

datSwm <- read.csv("datSwm.csv", header=TRUE, stringsAsFactors=FALSE, quote='\"',
                   colClasses=c("character","numeric","character","character",
                                "character","numeric","character","character",
                                "character","numeric","numeric"))
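
To see why the quotes do not help: when colClasses is not given, read.csv reads every column as text, strips the quotes, and then guesses a type for each column from the remaining characters. The sketch below reproduces this with two rows from the question, shortened to the first six columns and passed through the text= argument so it runs without a file. The names csv_text, guessed and fixed are only illustrative.

```r
# Two sample rows from the question, inlined so no file is needed.
csv_text <- '"task","itemnr","respnr","checked","solution","score"
"swm",1,"E1","010010010","000111000",0
"swm",10,"E1","011001000","011001000",1'

# Default behaviour: quotes are stripped, then each column's type is guessed.
guessed <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(guessed$checked)   # a numeric type; the leading zero is gone

# With colClasses the digit strings are kept exactly as written.
fixed <- read.csv(
  text = csv_text,
  colClasses = c("character", "numeric", "character",
                 "character", "character", "numeric")
)
fixed$checked            # "010010010" "011001000"
```

With the full 11-column file, the colClasses vector needs one entry per column, as in the call above.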



Answer 2:

You could use the colClasses argument of read.csv.

colClasses describes the type of each column (see ?read.csv).

Below is an example for the first five columns. You can drop stringsAsFactors, since it is overridden by colClasses:

datSwm <- read.csv("datSwm.csv", header=TRUE, quote='\"',
                   colClasses = c("factor", "numeric", "character", "character", "character"))

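You will need to add entries for the remaining columns: an unnamed colClasses vector that is shorter than the number of columns is recycled, so the columns after the fifth would otherwise get the wrong classes.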
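
If only a few columns need a fixed type, colClasses also accepts a named vector. Names are matched against the column names and every unnamed column is still guessed as usual. A minimal sketch, again using an inlined excerpt of the question's data (csv_text and named are illustrative names):

```r
# Excerpt of the question's data, inlined so the example is self-contained.
csv_text <- '"task","itemnr","respnr","checked","solution","score"
"swm",1,"E1","010010010","000111000",0
"swm",10,"E1","011001000","011001000",1'

# Only the two digit-string columns are pinned to character;
# the remaining columns keep read.csv's automatic type guessing.
named <- read.csv(
  text = csv_text,
  colClasses = c(checked = "character", solution = "character")
)

sapply(named, class)   # inspect the resulting column classes
named$solution         # "000111000" "011001000"
```

This avoids spelling out all eleven classes, at the cost of relying on the header names staying the same.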



Answer 3:

You can use as.character() on your two columns after reading the file.

Example:

> vec <- c(1, 2, 3)
> vec
[1] 1 2 3

> vec <- as.character(vec)
> vec
[1] "1" "2" "3"

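So apply it to each of the two columns. Calling as.character() directly on a data.frame slice would deparse each column into a single string, so use lapply():

datSwm[4:5] <- lapply(datSwm[4:5], as.character)

Note that this converts the integers that were already parsed, so leading zeros dropped during import (e.g. "010010010" read as 10010010) are not restored.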
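
One caveat with converting after the import is that the conversion starts from the parsed integers, not from the original text. The sketch below uses a small stand-in data frame that mimics what read.csv returned (integers, leading zeros already lost) and shows one possible repair with sprintf(). The repair rests on an assumption not stated in the question: that every code was originally exactly 9 digits wide.

```r
# Stand-in for the imported data: the digit strings are already integers.
datSwm <- data.frame(checked  = c(10010010L, 11001000L),
                     solution = c(111000L, 11001000L))

cols <- c("checked", "solution")

# Plain conversion yields text, but not the original text.
as_text <- lapply(datSwm[cols], as.character)
as_text$checked    # "10010010" "11001000"

# Assumption: all codes are 9 digits wide, so pad with zeros on the left.
datSwm[cols] <- lapply(datSwm[cols], function(x) sprintf("%09d", x))
datSwm$solution    # "000111000" "011001000"
```

If the width is not fixed, the original strings cannot be reconstructed this way, and re-reading the file with colClasses is the safer route.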


Tags: r import