Replacing missing values coded by “.” in an R dataframe

I have a dataframe with missing values coded by ".", and I want to recode the values as NA:

df <- data.frame("h"=c(1,1,"."))

I try the following:

df$h[df$h == "."] <- NA

But the NA appears as <NA>, and I can't run commands like mean(df$h,rm.na=TRUE).

Does anyone know what the problem is? When I recode numbers as NA there's no problem.

Thanks!

Tags: r replace dataframe

3 Answers

迷人小祖宗

#2 · 2019-07-22 13:53

Use the is.na function. No need to convert to factor, although the fact that you had character values did cause coercion of what you wanted to be numeric.

> df <- data.frame("h"=c(1,1,"."))
> is.na(df) <- df=="."
> df
     h
1    1
2    1
3 <NA>

I'm not sure why @TylerRinker deleted his response regarding using 'na.strings', since I thought it to be the correct answer.
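For reference, a minimal sketch of the 'na.strings' approach mentioned above. It assumes the data come from a text file read with read.csv; the inline text here is a made-up stand-in for the OP's file, which isn't shown in the question:

```r
# Hypothetical stand-in for the OP's file: one column h, with "." for missing.
raw <- "h\n1\n1\n."

# na.strings tells the reader which strings to treat as NA on import,
# so the column is parsed as numeric from the start.
df <- read.csv(text = raw, na.strings = ".")

str(df$h)
mean(df$h, na.rm = TRUE)
```

If the data can be re-read, this avoids the recoding step entirely.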

Comment: Looking at this a year later I realized that a) the OP misunderstood how missing values are displayed when they are in factors or character vectors (they print as <NA> rather than NA), and b) the main problem was not an error in recoding to an R missing value, which the OP's code had already done correctly, but rather the misspelling (rm.na instead of na.rm) that @joran identified.
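To illustrate the misspelling point, here is a small sketch on a plain numeric vector (made-up values, not the OP's data). As far as I can tell, mean() accepts unknown named arguments through ... without complaint, so the misspelled version runs but does not drop the NA:

```r
x <- c(1, 1, NA)

# Misspelled argument: rm.na is not an argument of mean(), so it is
# swallowed by ... and the NA is not removed.
wrong <- mean(x, rm.na = TRUE)

# Correct spelling: na.rm drops the NA before averaging.
right <- mean(x, na.rm = TRUE)

print(wrong)
print(right)
```

Note that on a factor column mean() returns NA with a warning either way, so the column still has to be numeric.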


女痞

#3 · 2019-07-22 13:58

The problem is that your column df$h is a factor (before R 4.0.0, data.frame() converted strings to factors by default). Try to make it a character first and then replace the "."-values:

df$h <- as.character(df$h)
df$h[df$h == "."] <- NA

Here you see the result:

df[is.na(df$h),]

Of course, once you have got rid of the dots, you can convert it to a numeric variable to calculate with it if you want:

df$h <- as.numeric(df$h)
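Putting the steps above together, a minimal end-to-end sketch. stringsAsFactors = TRUE is set explicitly here (my assumption, not in the OP's code) so that the column is a factor regardless of R version:

```r
# Recreate the OP's data with h as a factor.
df <- data.frame(h = c(1, 1, "."), stringsAsFactors = TRUE)

# Factor -> character, so "." can be compared and replaced as a string.
df$h <- as.character(df$h)
df$h[df$h == "."] <- NA

# Character -> numeric; the NA stays NA.
df$h <- as.numeric(df$h)

mean(df$h, na.rm = TRUE)
```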


三岁会撩人

#4 · 2019-07-22 14:11

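Yes, right, it is a factor. First convert it to numeric with the syntax below. Go through as.character, because as.numeric applied directly to a factor returns the internal level codes rather than the values; the "." entries become NA (with a coercion warning):

df <- transform(df, h=as.numeric(as.character(h)))

Then, if you want zeros instead of missing values, replace them (use the number 0, not the string "0", or the column turns back into character):

df$h[is.na(df$h)] <- 0

and then view the data:

View(df)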
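A small sketch of the difference between a factor's internal codes and its labels, using made-up values rather than the OP's data:

```r
f <- factor(c("10", "20", "10"))

# Direct conversion returns the integer level codes, not the labels.
codes <- as.numeric(f)

# Converting to character first recovers the actual values.
values <- as.numeric(as.character(f))

print(codes)
print(values)
```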

