How can I remove NA values from a vector?
I have a huge vector which has a couple of NA values, and I'm trying to find the max value in that vector (the vector is all numbers), but I can't do this because of the NA values.
How can I remove the NA values so that I can compute the max?
You can call max(vector, na.rm = TRUE). More generally, you can use the na.omit() function.

The na.omit function is what a lot of the regression routines use internally.

Just in case someone new to R wants a simplified answer to the original question, here it is:
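The walkthrough that follows can be tried with a minimal sketch like the one below. The original values of foo are not shown, so the 22-element vector here is made up for illustration (21 numbers plus a single NA); the names foo and nona_foo come from the answer, while omitted is a hypothetical name added to show the na.omit() route mentioned earlier.

```r
# Made-up data: 21 integers plus one NA, so length(foo) is 22 as in the walkthrough.
foo <- c(1:10, NA, 12:22)
length(foo)                     # 22

# Without na.rm, a single NA makes the result NA.
max(foo)                        # NA
max(foo, na.rm = TRUE)          # 22

# Logical indexing: keep only the elements that are not NA.
nona_foo <- foo[!is.na(foo)]
length(nona_foo)                # 21

# na.omit() drops the NA as well (and records its position in an attribute).
omitted <- na.omit(foo)
length(omitted)                 # 21
```

If only the maximum is needed, na.rm = TRUE avoids creating a second copy of a huge vector; the indexing and na.omit() forms are more useful when the cleaned vector is reused.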
Assume you have a vector foo that contains NA values. Running length(foo) gives 22. After nona_foo <- foo[!is.na(foo)], length(nona_foo) is 21, because the NA values have been removed.
Remember, is.na(foo) returns a logical vector, so indexing foo with the opposite of this value will give you all the elements which are not NA.

?max shows you that there is an extra parameter, na.rm, that you can set to TRUE.
Apart from that, if you really want to remove the NAs, just use something like vector[!is.na(vector)].

Trying ?max, you'll see that it actually has an na.rm argument, set by default to FALSE. (That's the common default for many other R functions, including sum(), mean(), etc.)
Setting na.rm = TRUE does just what you're asking for, as in max(vector, na.rm = TRUE).
If you do want to remove all of the NAs, use an idiom such as vector <- vector[!is.na(vector)] instead.
A final note: Other functions (e.g. table(), lm(), and sort()) have NA-related arguments that use different names (and offer different options). So if NAs cause you problems in a function call, it's worth checking for a built-in solution among the function's arguments. I've found there's usually one already there.
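
To make the final note concrete, here is a small sketch of the NA-related arguments in the three functions it names. The data and the variable names (x, df, counts, sorted_keep, sorted_drop, fit, n_used) are invented for illustration; the arguments themselves (useNA, na.last, na.action) are the ones documented in ?table, ?sort, and ?lm.

```r
# Hypothetical data for illustration only.
x <- c(3, NA, 1, 3, NA, 2)

# table() leaves NA out by default; useNA = "ifany" adds an NA count when any exist.
counts <- table(x, useNA = "ifany")

# sort() drops NA by default (na.last = NA); na.last = TRUE keeps them at the end.
sorted_drop <- sort(x)
sorted_keep <- sort(x, na.last = TRUE)

# lm() decides what to do with incomplete rows through na.action;
# na.omit drops any row with a missing value before fitting.
df <- data.frame(y = c(1.0, 2.1, NA, 3.9, 5.2), z = c(1, 2, 3, 4, 5))
fit <- lm(y ~ z, data = df, na.action = na.omit)
n_used <- nobs(fit)   # number of rows actually used in the fit

print(counts)
print(sorted_keep)
print(n_used)
```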