I want to use R to summarize numerical data in a table with non-unique rownames to a result table with unique row-names with values summarized using a custom function. The summarization logic is: use the mean of values if the ratio of the maximum to the minimum value is < 1.5, else use median. Because the table is very large, I am trying to use the melt() and cast() functions in the reshape2 package.
# example table with non-unique row-names tab <- data.frame(gene=rep(letters[1:3], each=3), s1=runif(9), s2=runif(9)) # melt tab.melt <- melt(tab, id=1) # function to summarize with logic: mean if max/min < 1.5, else median summarize <- function(x){ifelse(max(x)/min(x)<1.5, mean(x), median(x))} # cast with summarized values dcast(tab.melt, gene~variable, summarize)
The last line of code above results in an error notice.
Error in vapply(indices, fun, .default) : values must be type 'logical', but FUN(X[[1]]) result is type 'double' In addition: Warning messages: 1: In max(x) : no non-missing arguments to max; returning -Inf 2: In min(x) : no non-missing arguments to min; returning Inf
What am I doing wrong? Note that if the summarize function were to just return min(), or max(), there is no error, though there is the warning message about 'no non-missing arguments.' Thank you for any suggestion.
(The actual table I want to work with is a 200x10000 one.)