Replacing missing values coded by “.” in an R data

Posted 2019-07-22 13:15

Question:

I have a dataframe with missing values coded by ".", and I want to recode the values as NA:

df <- data.frame("h"=c(1,1,"."))

I try the following:

df$h[df$h == "."] <- NA

But the NA appears as <NA>, and I can't execute commands like mean(df$h,rm.na=TRUE).

Does anyone know what the problem is? When I recode numbers as NA there's no problem.

Thanks!

Answer 1:

Use the is.na function. No need to convert to factor, although the fact that you had character values did cause coercion of what you wanted to be numeric.

> df <- data.frame("h"=c(1,1,"."))
> is.na(df) <- df=="."
> df
     h
1    1
2    1
3 <NA>

I'm not sure why @TylerRinker deleted his response regarding using 'na.strings', since I thought it to be the correct answer.
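
For data that is read from a file, the 'na.strings' approach mentioned above can be sketched roughly as follows. This is only an illustration: the inline `text` argument and the `csv_text` variable stand in for a real file, which the question does not show.

```r
# Inline CSV text stands in for a real file (an assumption for illustration).
csv_text <- "h\n1\n1\n."

# na.strings = "." tells the reader to treat "." as missing on import,
# so the column is parsed as a number rather than as text.
df <- read.csv(text = csv_text, na.strings = ".")

str(df$h)
mean(df$h, na.rm = TRUE)
```

Handling the code for missing values at import time avoids the factor/character coercion altogether, so no recoding step is needed afterwards.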

Comment: Looking at this a year later, I realized that a) the OP misunderstood how missing values are displayed when they are in factors or character vectors, and b) the main problem was not an error in recoding to an R missing value, which the OP's code had already done correctly, but rather the misspelling that @joran identified (rm.na instead of na.rm).
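
Both points in this comment can be illustrated with a short sketch. It assumes the column has already been converted to numeric; the vector `x` is a stand-in for `df$h`. Because `rm.na` is not an argument of `mean()`, it is absorbed by `...` without any error, and the missing value is not removed.

```r
# x stands in for df$h after conversion to numeric (an assumption).
x <- c(1, 1, NA)

# In a factor (or a character column of a data frame), a missing value
# prints as <NA>; it is still an ordinary missing value.
is.na(factor(c("1", "1", NA)))

# Misspelled argument: rm.na is not a parameter of mean(), so it is
# absorbed by ... and the NA is not removed.
wrong <- mean(x, rm.na = TRUE)

# Correct spelling: na.rm = TRUE removes the NA before averaging.
right <- mean(x, na.rm = TRUE)

print(wrong)
print(right)
```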



Answer 2:

The problem is that your column df$h is a factor. Convert it to character first and then replace the "." values:

df$h <- as.character(df$h)
df$h[df$h == "."] <- NA

Here you see the result:

df[is.na(df$h),]

Of course, once you have gotten rid of the dots, you can convert the column to a numeric variable if you want to calculate with it:

df$h <- as.numeric(df$h)


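Answer 3:

Yes, right, it is a factor. First convert it to numeric, going through character so that you get the values rather than the factor's level codes:

df <- transform(df, h=as.numeric(as.character(h)))

Then replace the missing values with zero:

df$h[is.na(df$h)] <- 0

and then view the data with View(df).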
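
One pitfall is worth illustrating when the column is a factor: calling `as.numeric()` directly on a factor returns its internal integer level codes, not the numbers shown in its labels. The sketch below uses a made-up factor `f` (its values are not from the question) to show the difference.

```r
# Hypothetical factor for illustration; the values are not from the question.
f <- factor(c("10", "20", "."))

# Direct conversion yields the internal level codes, not 10 and 20.
codes <- as.numeric(f)

# Converting the labels to character first recovers the numbers;
# "." cannot be parsed and becomes NA (suppressWarnings hides the
# "NAs introduced by coercion" warning).
values <- suppressWarnings(as.numeric(as.character(f)))

print(codes)
print(values)
```

Note also that assigning the string "0" instead of the number 0 would coerce a numeric column back to character, so the replacement value should be numeric.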