How to simplify a leading-NA count function, and generalize it to work on matrix, dataframe

Posted 2019-10-19 17:19

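I wrote a leading-NA count function. It works on vectors. However:

a) Can you simplify my version?

b) Can you also generalize it to work directly on a matrix or dataframe (it must still work on an individual vector), so that I don't need apply()? Please try to avoid all *apply functions: the solution should be fully vectorized, must still work on vectors, and should need no special-casing if at all possible.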

leading_NA_count <- function(x) { max(cumsum((1:length(x)) == cumsum(is.na(x)))) }
# v0.1: works but seems clunky, slow and unlikely to be generalizable to two-dimensional objects

leading_NA_count <- function(x) { max(which(1:(length(x)) == cumsum(is.na(x))), 0) }
# v0.2: maybe simpler; needs max(..., 0) because max() of an empty which(...) returns -Inf (with a warning) in the no-leading-NA case, e.g. c(1,2,3)

# (Seems impossible to figure out how to use which.max/which.min on this)


leading_NA_count <- function(x) { max(cumsum((1:length(x)) == cumsum(is.na(x)))) }
set.seed(1234)
mm <- matrix(sample(c(NA,NA,NA,NA,NA,0,1,2), 6*5, replace=TRUE), nrow=6, ncol=5)
mm
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA   NA
[2,]   NA   NA    2   NA    1
[3,]   NA    0   NA   NA   NA
[4,]   NA   NA    1   NA    2
[5,]    1    0   NA   NA    1
[6,]    0   NA   NA   NA   NA

leading_NA_count(mm)
[1] 4 # WRONG, obviously (looks like it tried to operate on the entire matrix by-column or by-row)
apply(mm,1,leading_NA_count)
[1] 5 2 1 2 0 0 # RIGHT
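
The wrong result comes from R treating the matrix as one long vector in column-major order, so the function counts leading NAs down the columns rather than per row. A minimal sketch on a small hand-built matrix (the matrix `m` is made up for illustration and is not part of the example above):

```r
leading_NA_count <- function(x) { max(cumsum((1:length(x)) == cumsum(is.na(x)))) }

# Illustrative 2 x 3 matrix, not the seeded one above
m <- rbind(c(NA, NA, 3),
           c(NA,  1, NA))

as.vector(m)                   # NA NA NA 1 3 NA  (columns are stacked)
leading_NA_count(m)            # 3: leading NAs of the stacked vector
apply(m, 1, leading_NA_count)  # 2 1: the per-row counts actually wanted
```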

Answer 1:

This works whether mm is a matrix, vector, or data.frame. See ?max.col for details:

max.col(cbind(!is.na(rbind(NA, mm)), TRUE), ties = "first")[-1] - 1
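
To show how the one-liner fits together, here is a sketch that wraps it in a function and runs it on a small hand-built matrix and a vector. The wrapper name `leading_NA_count_rows` and the test matrix `m` are made up for illustration; the body is the expression above with `ties` spelled out as `ties.method`.

```r
# Hypothetical wrapper around the one-liner from this answer
leading_NA_count_rows <- function(x) {
  # rbind(NA, x) prepends an all-NA row; for a plain vector this also
  # turns the input into a 2-row matrix, so no special-casing is needed
  not_na <- !is.na(rbind(NA, x))
  # the appended TRUE column guarantees a match even for an all-NA row;
  # ties.method = "first" picks the first non-NA position in each row
  first_non_na <- max.col(cbind(not_na, TRUE), ties.method = "first")
  # drop the dummy first row, then convert position to a count
  first_non_na[-1] - 1
}

m <- rbind(c(NA, NA,  3),
           c( 1, NA, NA),
           c(NA, NA, NA))

leading_NA_count_rows(m)             # 2 0 3
leading_NA_count_rows(c(NA, NA, 5))  # 2
```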


Answer 2:

For part (a) of your question, this is the simplest function I can think of:

leadingNaCount = function(x) { sum(cumprod(is.na(x))) }
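
The intermediate steps may make it clearer why this works: `cumprod` stays at 1 while every element so far is NA and drops to 0 permanently at the first non-NA value, so summing it counts the leading run. A short sketch on a made-up vector `x`:

```r
leadingNaCount = function(x) { sum(cumprod(is.na(x))) }

x <- c(NA, NA, 7, NA, 1)   # illustrative input
is.na(x)                   # TRUE TRUE FALSE TRUE FALSE
cumprod(is.na(x))          # 1 1 0 0 0
leadingNaCount(x)          # 2

# Edge cases need no max(..., 0) guard
leadingNaCount(c(1, 2, 3))    # 0
leadingNaCount(c(NA, NA, NA)) # 3
```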


Source: How to simplify a leading-NA count function, and generalize it to work on matrix, dataframe