R: is there a way to find Inf/-Inf values?

Posted 2020-02-28 04:38

I'm trying to run a randomForest on a large-ish data set (5000x300). Unfortunately I'm getting an error message as follows:

> RF <- randomForest(prePrior1, postPrior1[,6]
+                    ,,do.trace=TRUE,importance=TRUE,ntree=100,,forest=TRUE)
Error in randomForest.default(prePrior1, postPrior1[, 6], , do.trace = TRUE,  : 
  NA/NaN/Inf in foreign function call (arg 1)

So I tried to find any NAs using:

> df2 <- prePrior1[is.na(prePrior1)]
> df2 
character(0)
> df2 <- postPrior1[is.na(postPrior1[,6])]
> df2 
numeric(0)

This leads me to believe that Infs are the problem, as there don't seem to be any NAs.

Any suggestions for how to root out Infs?

5 Answers
beautiful°
#2 · 2020-02-28 04:52

You're probably looking for is.finite, though I'm not 100% certain that the problem is Infs in your input data.

Be sure to read the help for is.finite carefully to see which combinations of missing, infinite, etc. values it picks out. Specifically, this:

> is.finite(c(1,NA,-Inf,NaN))
[1]  TRUE FALSE FALSE FALSE
> is.infinite(c(1,NA,-Inf,NaN))
[1] FALSE FALSE  TRUE FALSE

One of these things is not like the others. Not surprisingly, there's an is.nan function as well.
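
To apply the same check to a whole data frame, a minimal sketch along these lines may help. The data frame `df` and its columns are hypothetical stand-ins for prePrior1; only numeric columns are tested, since is.finite returns FALSE for every element of a character vector.

```r
# Hypothetical data frame standing in for prePrior1 (names are illustrative only)
df <- data.frame(a = c(1, Inf, 3),
                 b = c(NA, 2, NaN),
                 label = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

# Restrict the check to numeric columns
num_cols <- sapply(df, is.numeric)

# Count entries per column that are NA, NaN, Inf or -Inf
bad_counts <- sapply(df[num_cols], function(x) sum(!is.finite(x)))
bad_counts
```

Any column with a non-zero count contains at least one value that is.finite rejects.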

This account has been banned
#3 · 2020-02-28 04:54

By analogy with is.na, you can use is.infinite to find infinite values.

成全新的幸福
#4 · 2020-02-28 04:54

Take a look at with, e.g.:

> with(df, df == Inf)
        foo   bar   baz   abc ...
[1,]  FALSE FALSE  TRUE FALSE ...
[2,]  FALSE  TRUE FALSE FALSE ...
...
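
Note that a comparison with Inf only matches positive infinity. A hedged alternative that also catches -Inf is to apply is.infinite column by column and then ask which() for the row/column positions. The data frame below is hypothetical, and the sketch assumes all columns are numeric and that there is more than one row (so that sapply returns a matrix).

```r
# Hypothetical numeric data frame (column names borrowed from the output above)
df <- data.frame(foo = c(1, -Inf), bar = c(Inf, 2), baz = c(3, 4))

# df == Inf flags only +Inf
n_plus_inf <- sum(df == Inf)

# is.infinite has no data frame method, so apply it per column;
# the result is a logical matrix with one column per variable
inf_mask <- sapply(df, is.infinite)

# Row/column positions of every Inf or -Inf
pos <- which(inf_mask, arr.ind = TRUE)
pos
```

Here `pos` has one row per infinite value, with the columns `row` and `col` giving its location in `df`.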
做个烂人
#5 · 2020-02-28 04:59

joran's answer is informative and is what you want. For more details about is.na() and is.infinite(), see https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/is.na-methods.html. Once you have the logical vector that says whether each element of the original vector is NA/Inf, you can use the which() function to get the indices, like this:

> v1 <- c(1, Inf, 2, NaN, Inf, 3, NaN, Inf)
> is.infinite(v1)
[1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE
> which(is.infinite(v1))
[1] 2 5 8
> is.na(v1)
[1] FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE
> which(is.na(v1))
[1] 4 7

The documentation for which() is here: https://stat.ethz.ch/R-manual/R-devel/library/base/html/which.html

Deceive 欺骗
#6 · 2020-02-28 05:03

randomForest's 'NA/NaN/Inf in foreign function call' error is often misleading, and really irritating:

  • you will get this if any of the variables passed is character
  • actual NaNs and Infs almost never happen in clean data
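
A quick way to test the first point is to list the character columns directly. This is a sketch on a hypothetical data frame `df`, not on the asker's data; the `character(0)` printed in the question does suggest that prePrior1 holds some character (or factor) columns, but that is an inference rather than something the post confirms.

```r
# Hypothetical data frame with one numeric, one character and one factor column
df <- data.frame(x = c(1, 2),
                 y = c("a", "b"),
                 z = factor(c("u", "v")),
                 stringsAsFactors = FALSE)

# Names of the columns stored as character vectors
char_cols <- names(df)[sapply(df, is.character)]
char_cols
```

If this returns any names, converting those columns to factor or numeric before fitting is the usual next step.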

A quick and dirty trick to narrow things down: do a binary search on your variable list, and use token parameters like ntree=2 to get an instant pass/fail on each subset of variables:

RF <- randomForest(prePrior1[m:n],ntree=2,...)
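
The bisection itself could be sketched roughly as follows. Everything here is hypothetical: `find_bad_column` and `fails` are illustrative names, and `fails` is a stand-in predicate so the example runs without the randomForest package. The sketch assumes the full column set fails and that a single column is responsible.

```r
# Halve the column set repeatedly, keeping the half that still fails.
# `fails` is a hypothetical predicate: TRUE if fitting on these columns errors.
find_bad_column <- function(df, fails) {
  cols <- seq_along(df)
  while (length(cols) > 1) {
    half <- cols[seq_len(length(cols) %/% 2)]
    # Keep whichever half still triggers the failure
    cols <- if (fails(df[half])) half else setdiff(cols, half)
  }
  names(df)[cols]
}

# Stand-in for the real check; with randomForest one might instead wrap
# a call such as randomForest(d, y, ntree = 2) in try() (assumption)
fails <- function(d) any(!sapply(d, function(x) all(is.finite(x))))

df <- data.frame(a = 1:3, b = c(1, Inf, 3), c = 4:6, d = 7:9)
bad <- find_bad_column(df, fails)
bad
```

With several offending columns this finds only one of them, so it would need to be rerun after each fix.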