I wrote a function that counts leading NAs. It works on vectors. However:
a) Can you simplify my version?
b) Can you also generalize it to work directly on a matrix or data frame (it must still work on an individual vector), so I don't need apply()? Try to avoid all *apply functions and fully vectorize it, with no special-casing if at all possible.
leading_NA_count <- function(x) { max(cumsum((1:length(x)) == cumsum(is.na(x)))) }
# v0.1: works but seems clunky, slow and unlikely to be generalizable to two-dimensional objects
leading_NA_count <- function(x) { max(which(1:(length(x)) == cumsum(is.na(x))), 0) }
# v0.2: maybe simpler; needs max(..., 0) because max() returns -Inf (with a warning) when which(...) is empty, i.e. the no-leading-NAs case, e.g. c(1,2,3)
# (I couldn't figure out how to use which.max/which.min for this)
set.seed(1234)
mm <- matrix(sample(c(NA,NA,NA,NA,NA,0,1,2), 6*5, replace=TRUE), nrow=6, ncol=5)
mm
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA   NA
[2,]   NA   NA    2   NA    1
[3,]   NA    0   NA   NA   NA
[4,]   NA   NA    1   NA    2
[5,]    1    0   NA   NA    1
[6,]    0   NA   NA   NA   NA
leading_NA_count(mm)
[1] 4 # WRONG, obviously (looks like it tried to operate on the entire matrix by-column or by-row)
apply(mm,1,leading_NA_count)
[1] 5 2 1 2 0 0 # RIGHT
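
As far as I can tell, the 4 comes from length() and cumsum() ignoring the dim attribute, so the matrix is processed as one length-30 vector in column-major order, and column 1 happens to start with four NAs. Here is a minimal sketch of that check. The matrix is retyped by hand from the printout above (so it doesn't depend on the RNG of a particular R version), and `flat` is just a name I made up for illustration:

```r
# Matrix retyped column by column from the printed output above.
mm <- matrix(c(NA, NA, NA, NA,  1,  0,   # column 1
               NA, NA,  0, NA,  0, NA,   # column 2
               NA,  2, NA,  1, NA, NA,   # column 3
               NA, NA, NA, NA, NA, NA,   # column 4
               NA,  1, NA,  2,  1, NA),  # column 5
             nrow = 6, ncol = 5)

# v0.2 from above, unchanged.
leading_NA_count <- function(x) { max(which(1:(length(x)) == cumsum(is.na(x))), 0) }

# Flatten explicitly: as.vector() drops dim and keeps column-major order.
flat <- as.vector(mm)

length(mm)              # 30: length() counts all cells, not rows or columns
head(flat, 6)           # column 1 of mm: NA NA NA NA 1 0

# Same answer whether we pass the matrix or the flattened vector.
leading_NA_count(mm)    # 4
leading_NA_count(flat)  # 4
```

So the function isn't really working "by row" or "by column" at all: it only ever sees the leading NAs of the first column.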