How to count occurrence of unknown strings in colu

Posted 2020-08-01 05:52

Question:

I have another question. Thanks for everyone's help and patience with an R newbie!

How can I count how many times a string occurs in a column? Example:

MYdata <- data.frame(fruits = c("apples", "pears", "unknown_f", "unknown_f", "unknown_f"), 
                     veggies = c("beans", "carrots", "carrots", "unknown_v", "unknown_v"), 
                     sales = rnorm(5, 10000, 2500))

The problem is that my real data set contains several thousand rows and several hundred distinct unknown fruits and unknown veggies. I played around with "table()" and "levels" but without much success, so I guess it's more complicated than that. Ideally I would get an output table listing the name of each unique fruit/veggie and how many times it occurs in its column. Any hint in the right direction would be much appreciated.

Thanks,

Marcus

Answer 1:

If I understand your question, the function table() should work just fine. Here is how:

table(MYdata$fruits)

   apples     pears unknown_f 
        1         1         3 

table(MYdata$veggies)

    beans   carrots unknown_v 
        1         2         2 

Or use table inside lapply:

lapply(MYdata[1:2], table)
$fruits

   apples     pears unknown_f 
        1         1         3 

$veggies

    beans   carrots unknown_v 
        1         2         2 
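
With several hundred distinct values, the alphabetical order that table() prints can be hard to scan. Here is a minimal sketch, assuming the MYdata frame from the question, that sorts the counts and turns them into a data frame. The names fruit_counts and fruit_df are only illustrative, not part of the original answer.

```r
# Example data from the question (sales left out, it is not needed for counting)
MYdata <- data.frame(fruits = c("apples", "pears", "unknown_f", "unknown_f", "unknown_f"),
                     veggies = c("beans", "carrots", "carrots", "unknown_v", "unknown_v"))

# Count each distinct fruit and put the most frequent first
fruit_counts <- sort(table(MYdata$fruits), decreasing = TRUE)

# A one-dimensional table converts to a two-column data frame;
# the columns are renamed here to something more descriptive
fruit_df <- as.data.frame(fruit_counts, stringsAsFactors = FALSE)
names(fruit_df) <- c("fruit", "count")
print(fruit_df)
```

The same two steps can be applied to MYdata$veggies, or wrapped in lapply() as above.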


Answer 2:

The following gives you a data frame of counts, which you may find easier to work with than a list of tables:

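# Tabulate every column except sales (column 3)
tabs <- lapply(MYdata[-3], table)
# Flatten into one named vector; names become "column.value"
counts <- unlist(tabs)
out <- data.frame(item = names(counts), count = as.vector(counts),
                  stringsAsFactors = FALSE)
rownames(out) <- NULL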

print(out)

               item count
1     fruits.apples     1
2      fruits.pears     1
3  fruits.unknown_f     3
4     veggies.beans     1
5   veggies.carrots     2
6 veggies.unknown_v     2
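
The combined names such as fruits.apples are awkward to split apart again if an item name itself contains a dot. One possible variation, sketched here under the assumption that the two text columns are called fruits and veggies as in the question, keeps the column name and the item in separate fields. The helper count_column and the result long_counts are hypothetical names introduced for illustration.

```r
# Example data from the question (sales left out, it is not needed for counting)
MYdata <- data.frame(fruits = c("apples", "pears", "unknown_f", "unknown_f", "unknown_f"),
                     veggies = c("beans", "carrots", "carrots", "unknown_v", "unknown_v"))

# Hypothetical helper: counts for one column as a small data frame
count_column <- function(col_name) {
  tab <- table(MYdata[[col_name]])
  data.frame(column = col_name,
             item = names(tab),
             count = as.vector(tab),
             stringsAsFactors = FALSE)
}

# Stack the per-column results into one long data frame
long_counts <- do.call(rbind, lapply(c("fruits", "veggies"), count_column))
print(long_counts)
```

With the column kept separate, something like long_counts[long_counts$column == "veggies", ] picks out one column's counts without any string parsing.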


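Answer 3:

Maybe something like the following. Note that summary() only returns per-value counts when the column is a factor, which was the data.frame() default before R 4.0.0; on a plain character column it does not tabulate the values.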

summary(MYdata$fruits)
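
A small sketch of that idea with the factor conversion made explicit, assuming the fruits values from the question. It also passes maxsum, because summary() on a factor shows at most 100 levels by default and lumps the rest together, which matters with several hundred distinct values. The names fruit_factor and fruit_summary are illustrative only.

```r
# Fruit values from the question, as a plain character vector
fruits <- c("apples", "pears", "unknown_f", "unknown_f", "unknown_f")

# Convert to a factor so that summary() counts each level
fruit_factor <- factor(fruits)

# Raise maxsum to the number of levels so that no level is lumped together
fruit_summary <- summary(fruit_factor, maxsum = nlevels(fruit_factor))
print(fruit_summary)
```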


Tags: string r count