Replacing missing values coded by “.” in an R data frame

Posted 2019-07-22 13:25

I have a dataframe with missing values coded by ".", and I want to recode the values as NA:

df <- data.frame("h"=c(1,1,"."))

I tried the following:

df$h[df$h == "."] <- NA

But the NA appears as <NA>, and I can't run commands like mean(df$h,rm.na=TRUE).

Does anyone know what the problem is? When I recode numbers as NA, there's no problem.

Thanks!

3 answers
迷人小祖宗
#2 · 2019-07-22 13:53

Use the is.na function. No need to convert to factor, although the fact that you had character values did cause coercion of what you wanted to be numeric.

> df <- data.frame("h"=c(1,1,"."))
> is.na(df) <- df=="."
> df
     h
1    1
2    1
3 <NA>

I'm not sure why @TylerRinker deleted his response regarding using 'na.strings', since I thought it to be the correct answer.
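For readers importing the data from a file, the 'na.strings' idea mentioned above can be sketched roughly as follows. This is only an illustration: `csv_text` is a hypothetical inline stand-in for a real file, and it assumes the "." codes arrive at import time rather than being typed into `data.frame()` by hand.

```r
# Hypothetical inline CSV standing in for a real file; "." marks a missing value.
csv_text <- "h\n1\n1\n."

# na.strings tells the reader which strings to treat as NA while parsing,
# so the column is read as numeric instead of character or factor.
df <- read.csv(text = csv_text, na.strings = ".")

str(df)
col_mean <- mean(df$h, na.rm = TRUE)
col_mean
```

Because the "." is turned into NA while parsing, no recoding step is needed afterwards.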

Comment: Looking at this a year later, I realized that (a) the OP misunderstood how missing values are displayed in factors and character vectors, and (b) the main problem was not an error in recoding to an R missing value, which the OP's code had already done correctly, but rather the misspelling that @joran identified.
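A small sketch of the misspelling issue, assuming a plain numeric vector `h` (a hypothetical stand-in, not the OP's column). `mean()` passes unknown arguments through `...`, so `rm.na` is accepted without complaint and has no effect; the argument that actually drops missing values is `na.rm`.

```r
# Hypothetical numeric vector with one missing value.
h <- c(1, 1, NA)

# Misspelled argument: silently absorbed by `...`, so the NA is not removed.
wrong <- mean(h, rm.na = TRUE)

# Correct argument name: the NA is dropped before averaging.
right <- mean(h, na.rm = TRUE)

print(wrong)
print(right)
```

Note that on the OP's original column (character or factor), `mean()` would still return NA with a warning until the column is converted to numeric.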

女痞
#3 · 2019-07-22 13:58

The problem is that your column df$h is a factor (the default in R versions before 4.0.0, where data.frame() converted strings to factors). Convert it to character first and then replace the "." values:

df$h <- as.character(df$h)
df$h[df$h == "."] <- NA

Here you see the result:

df[is.na(df$h),]

Of course, once you have gotten rid of the dots, you can convert the column to a numeric variable if you want to calculate with it:

df$h <- as.numeric(df$h)
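Put together, the steps in this answer might look like the sketch below. It sets `stringsAsFactors = TRUE` explicitly, which is an assumption added here so that the factor case is reproduced regardless of R version.

```r
# Reproduce the factor column explicitly (assumption: stringsAsFactors = TRUE,
# which was the data.frame() default before R 4.0.0).
df <- data.frame(h = c(1, 1, "."), stringsAsFactors = TRUE)

df$h <- as.character(df$h)   # factor -> character
df$h[df$h == "."] <- NA      # recode the "." placeholder as a real NA
df$h <- as.numeric(df$h)     # character -> numeric; the NA stays NA

result <- mean(df$h, na.rm = TRUE)
result
```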
三岁会撩人
#4 · 2019-07-22 14:11

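Yes, right, it is a factor. First convert it to numeric, going through character so that you get the values rather than the factor's internal level codes (the "." becomes NA, with a coercion warning):

df <- transform(df, h = as.numeric(as.character(h)))

and replace the missing values with a numeric zero (not the string "0", which would turn the column back into character):

df$h[is.na(df$h)] <- 0

and then view the data:

View(df)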
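One thing worth checking when converting a factor to numeric: `as.numeric()` applied directly to a factor returns the internal level codes, not the labels. A minimal sketch, using a hypothetical factor `f` with values chosen so the difference is obvious:

```r
# Hypothetical factor; 10 and 20 make it easy to tell values from level codes.
f <- factor(c("10", "20", "."))

# Direct conversion yields integer level codes (positions in levels(f)).
codes <- as.numeric(f)

# Going through character recovers the values; "." becomes NA
# (the coercion warning is suppressed here on purpose).
values <- suppressWarnings(as.numeric(as.character(f)))

print(codes)
print(values)
```

Also keep in mind that replacing NA with 0 changes statistics such as the mean, so `na.rm = TRUE` is often the safer choice when the values are genuinely missing.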