I ran into this question while using read.table(), with or without header=T, trying to extract a vector of doubles from the resulting data.frame with as.double(as.character()) (see ?factor). But that's just how I realized that I don't understand R's logic. So you won't see e.g. read.table in the code below, only the necessary parts. Could you tell me what the difference is between the following options?

With header=T equivalent:

    (a <- data.frame(array(c(0.5,0.5,0.5,0.5), c(1,4))))
    as.character(a) # [1] "0.5" "0.5" "0.5" "0.5"

Without header=T equivalent:

    b <- data.frame(array(c("a",0.5,"b",0.5,"c",0.5,"d",0.5), c(2,4)))
    (a <- b[2,])
    as.character(a) # [1] "1" "1" "1" "1"
    (a <- data.frame(a, row.names=NULL)) # now there's not even a visual difference
    as.character(a) # [1] "1" "1" "1" "1"

The problem lies in the default setting of data.frame, where one of the options, stringsAsFactors, is set to TRUE (this was the default before R 4.0.0; since then it is FALSE). This is a problem in your scenario because when you use header = FALSE, the header row is read as data, and the presence of character values in that row coerces the entire column to character, which is then converted to a factor (unless you set stringsAsFactors = FALSE).

Here are some examples to play with:
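
As a minimal sketch of the behaviour described above (not the original poster's code), the snippet below rebuilds the character matrix from the question and passes stringsAsFactors = TRUE explicitly, so it should behave the same regardless of the R version's default. The variable names m, b, b2, codes, values and values2 are just illustrative.

```r
# Rebuild the 2 x 4 character matrix from the question. Mixing "a" with 0.5
# inside c() coerces every element to character.
m <- array(c("a", 0.5, "b", 0.5, "c", 0.5, "d", 0.5), c(2, 4))

# Request factor conversion explicitly (the pre-4.0.0 default).
b <- data.frame(m, stringsAsFactors = TRUE)

# Every column is now a factor, including the row that looks numeric.
sapply(b, class)

# as.integer() on a factor returns the internal level codes, not the labels.
# This is where the "1" values in the question come from.
codes <- sapply(b[2, ], as.integer)

# To recover the numbers, convert each column's label to character first,
# then to double (the approach recommended in ?factor).
values <- sapply(b[2, ], function(x) as.double(as.character(x)))

# Alternatively, avoid factors at construction time.
b2 <- data.frame(m, stringsAsFactors = FALSE)
values2 <- as.double(unlist(b2[2, ]))
```

Note that as.character() is applied per column here. Calling it on the whole data.frame, as in the question, treats the data.frame as a list and does not return the factor labels.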