How can I use strings inside ddply functions?

Posted 2019-09-21 17:31

Question:

As an illustrative example, I want to create a function similar to COUNTIF in Excel. Here is what I have tried in order to use the string "mycolumn" inside the "countif" variable definition in the ddply call below:

library(plyr)  # provides ddply and summarise

df <- data.frame(mycolumn = c("a","a","b","c","c"), stringsAsFactors = FALSE)
x <- "mycolumn"
countif <- function(df, x) {
  y <- which(colnames(df) == x)
  result1 <- ddply(df, x, nrow) # this works, but I can't use the x argument
  result2 <- ddply(df, x, summarise, countif = length(df[, y])) # not working: 5 for every group
  result3 <- ddply(df, x, summarise, countif = length(parse(text = x))) # not working: 1 for every group
  list(result1 = result1, result2 = result2, result3 = result3)
}

As you can see below, only result1 works, but I need a way to use my "mycolumn" string inside the ddply call instead of relying solely on nrow. Many thanks.

> result1
  mycolumn V1
1        a  2
2        b  1
3        c  2
> result2
  mycolumn countif
1        a       5
2        b       5
3        c       5
> result3
  mycolumn countif
1        a       1
2        b       1
3        c       1

Answer 1:

I'm not entirely sure what you're after, but my best guess would be something like the code below:

library(dplyr)

df <-  data.frame(mycolumn = c("a","a","b","c","c"))

result1 <- df %>% group_by(mycolumn) %>% tally()

result3 <- df %>% filter(mycolumn %in% c("a", "b")) %>% group_by(mycolumn) %>% tally()

You can play around with the condition inside the filter function.
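
If the column name is only available as a string, as in the question, one possible sketch groups with across(all_of(x)). This assumes dplyr 1.0.0 or later, and the function name countif_dplyr is made up for illustration:

```r
library(dplyr)

# Hypothetical helper: count rows per level of the column named by the string x.
countif_dplyr <- function(df, x) {
  df %>%
    group_by(across(all_of(x))) %>%            # group by a column given as a string
    summarise(countif = n(), .groups = "drop") # one row per group with its row count
}

df <- data.frame(mycolumn = c("a", "a", "b", "c", "c"))
res <- countif_dplyr(df, "mycolumn")
print(res)
```

Because all_of() takes a character vector, the same helper should also accept several column names at once.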



Answer 2:

OK, I found a way. Not so elegant I guess, but who cares:

library(plyr)

countif <- function(df, x) {
  # Copy the target column under a fixed name that summarise can refer to
  df$myartificialname <- df[[x]]
  result <- ddply(df, x, summarise, countif = length(myartificialname))
  print(paste(length(unique(result$countif)), "levels counted:",
              toString(head(unique(result$countif)))))
  return(result$countif)
}

EDIT: Actually, get(x) would work fine as well, in place of the temporary column.
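
Another option that avoids the temporary column is to pass ddply a plain function instead of summarise, so the string is used for ordinary indexing on each piece. This is only a sketch, and the name countif_fun is hypothetical:

```r
library(plyr)

# Hypothetical helper: count rows per level of the column named by the string x.
countif_fun <- function(df, x) {
  # ddply passes each group's rows to the function as a data frame ("piece");
  # piece[[x]] looks the column up by its string name.
  ddply(df, x, function(piece) data.frame(countif = length(piece[[x]])))
}

df <- data.frame(mycolumn = c("a", "a", "b", "c", "c"), stringsAsFactors = FALSE)
res <- countif_fun(df, "mycolumn")
print(res)
```

The anonymous function is a closure, so it reads x from the enclosing function rather than depending on how summarise evaluates its arguments.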



Tags: r dplyr