I am trying to run the 'genie3' algorithm (ref: http://homepages.inf.ed.ac.uk/vhuynht/software.html) in R, which relies on the 'randomForest' function.
I am running into the following error:
> weight.matrix<-get.weight.matrix(tmpLog2FC, input.idx=1:4551)
Starting RF computations with 1000 trees/target gene,
and 67 candidate input genes/tree node
Computing gene 1/11805
Error in randomForest.default(x, y, mtry = mtry, ntree = nb.trees, importance = TRUE, :
NA not permitted in predictors
So I checked whether my data contains any NAs, and it does not:
> NAs<-sapply(tmpLog2FC, function(x) sum(is.na(x)))
> length(which(NAs!=0))
[1] 0
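For reference, here is a slightly broader version of the check above, as a minimal sketch. It assumes tmpLog2FC is a data frame or matrix; the helper name check.input and the toy data are made up for illustration and are not part of genie3. Besides NA/NaN it also reports Inf values, non-numeric columns, and rows or columns with no variation, which is.na() alone does not flag.

```r
# Hypothetical helper (not part of genie3): summarise values that can
# cause trouble downstream even when is.na() finds nothing in the input.
check.input <- function(d) {
  # Non-numeric columns (only meaningful when d is a data frame)
  non.numeric <- if (is.data.frame(d)) {
    names(d)[!vapply(d, is.numeric, logical(1))]
  } else {
    character(0)
  }
  m <- as.matrix(d)
  list(
    non.numeric   = non.numeric,
    n.na          = sum(is.na(m)),        # counts both NA and NaN
    n.inf         = sum(is.infinite(m)),  # counts Inf and -Inf
    # rows / columns in which every value is identical
    constant.rows = which(apply(m, 1, function(v) length(unique(v)) == 1)),
    constant.cols = which(apply(m, 2, function(v) length(unique(v)) == 1))
  )
}

# Toy data: geneB is constant across samples, geneC contains an Inf
toy <- data.frame(s1 = c(1.2, 0, -0.5),
                  s2 = c(0.7, 0, Inf),
                  s3 = c(-1.1, 0, 0.3),
                  row.names = c("geneA", "geneB", "geneC"))
res <- check.input(toy)
str(res)
```

On the real data this would be called as check.input(tmpLog2FC).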
I then tried editing the 'get.weight.matrix()' function itself to omit NAs (just in case) by changing this line:
rf <- randomForest(x, y, mtry=mtry, ntree=nb.trees, importance=TRUE, ...)
To:
rf <- randomForest(x, y, mtry=mtry, ntree=nb.trees, importance=TRUE, na.action=na.omit)
I then sourced the code and double-checked that the change was picked up by printing the function on its own (relevant excerpt of the displayed source):
}
target.gene.name <- gene.names[target.gene.idx]
# remove target gene from input genes
these.input.gene.names <- setdiff(input.gene.names, target.gene.name)
x <- expr.matrix[,these.input.gene.names]
y <- expr.matrix[,target.gene.name]
rf <- randomForest(x, y, mtry=mtry, ntree=nb.trees, importance=TRUE, na.action=na.omit)
However, when I re-run it, I get the same error:
Error in randomForest.default(x, y, mtry = mtry, ntree = nb.trees, importance = TRUE, :
NA not permitted in predictors
Has anyone encountered anything similar to this? Any ideas on what I can do?
Thanks in advance.
*EDIT: As suggested, I re-ran with debug:
> weight.matrix<-get.weight.matrix(tmpLog2FC, input.idx=1:4551)
Starting RF computations with 1000 trees/target gene,
and 67 candidate input genes/tree node
Computing gene 1/11805
Error in randomForest.default(x, y, mtry = mtry, ntree = nb.trees, importance = TRUE, :
NA not permitted in predictors
Called from: randomForest(x, y, mtry = mtry, ntree = nb.trees, importance = TRUE,
na.action = na.omit)
Browse[1]>
>
The debug output confirms that the line I suspected is the one throwing the error, and it shows the call in its edited form with 'na.action=na.omit'. I am even more confused: how can a dataset that has no NAs, run through code that should omit NAs, still produce this error?
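One way NA-free input can still end up triggering this message, shown as a sketch: a transformation applied before the randomForest() call can create NaN values, and is.na() treats NaN as missing. Whether get.weight.matrix() transforms the data internally is not visible in the excerpt above, so the scaling step below is purely an assumption for illustration, and the data are made up.

```r
# Toy expression vector with no variation across samples (made-up data)
constant.gene <- c(2.5, 2.5, 2.5, 2.5)
anyNA(constant.gene)   # the raw input contains no NA

# Assumed preprocessing step (z-score). sd() is 0 here, so this is 0/0.
scaled <- (constant.gene - mean(constant.gene)) / sd(constant.gene)
is.nan(scaled)         # every element is NaN
anyNA(scaled)          # NaN counts as missing for is.na() / anyNA()

# Locating affected columns in a predictor matrix such as x
x <- cbind(g1 = c(0.1, -0.3, 0.2, 0.0), g2 = scaled)
bad.cols <- colnames(x)[colSums(is.na(x)) > 0]
bad.cols
```

Running anyNA(x) and the colSums(is.na(x)) line at the Browse[1]> prompt, or just before the randomForest() call, would show whether the x actually passed in still matches the NA-free input. As far as the randomForest documentation goes, na.action is listed for the formula interface, so whether it has any effect in the x, y form used here is worth checking.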