There is a table shown below
Name Mon Tue Wed Thu Fri Sat Sun
1 John Apple Orange Apple Banana Apple Apple Orange
2 Ricky Banana Apple Banana Banana Banana Banana Apple
3 Alex Apple Orange Orange Apple Apple Orange Orange
4 Robbin Apple Apple Apple Apple Apple Banana Banana
5 Sunny Banana Banana Apple Apple Apple Banana Banana
So , I want to count the most frequent Fruit for each person and add those value in new column.
For example.
Name Mon Tue Wed Thu Fri Sat Sun Max_Acc Count
1 John Apple Orange Apple Banana Apple Apple Orange Apple 4
2 Ricky Banana Apple Banana Banana Banana Banana Apple Banana 5
3 Alex Apple Orange Orange Apple Apple Orange Orange Orange 4
4 Robbin Apple Apple Apple Apple Apple Banana Banana Apple 5
5 Sunny Banana Banana Apple Apple Apple Banana Banana Banana 4
I am facing problem in finding rows. I can find Frequency in column by using table()
function.
>table(df$Mon)
Apple Banana
3 2
But here i want name of most frequent fruit in new column.
If we need the "Count" and "Names" corresponding to the max
"Count", we loop through the rows of the dataset (using apply
with MARGIN = 1
), use table
to get the frequency, extract the maximum value from it and the names
corresponding to the maximum value, rbind
it and cbind
with the original dataset.
cbind(df1, do.call(rbind, apply(df1[-1], 1, function(x) {
x1 <- table(x)
data.frame(Count = max(x1), Names=names(x1)[which.max(x1)])})))
# Name Mon Tue Wed Thu Fri Sat Sun Count Names
#1 John Apple Orange Apple Banana Apple Apple Orange 4 Apple
#2 Ricky Banana Apple Banana Banana Banana Banana Apple 5 Banana
#3 Alex Apple Orange Orange Apple Apple Orange Orange 4 Orange
#4 Robbin Apple Apple Apple Apple Apple Banana Banana 5 Apple
#5 Sunny Banana Banana Apple Apple Apple Banana Banana 4 Banana
Or we can use data.table
library(data.table)
setDT(df1)[, c("Names", "Count") := {tbl <- table(unlist(.SD))
.(names(tbl)[which.max(tbl)], max(tbl))}, by = Name]
Another approach would be to loop over all unique fruits as follows
fruits_unique <- unique(unlist(dat[-1]))
occurence <- sapply(fruits_unique, function(x) rowSums(dat[,-1] == x))
# Using this data to create the resulting columns
ind <- apply(occurence,1,which.max)
dat$Names <- fruits_unique[ind]
dat$count <- occurence[cbind(seq_along(ind), ind)]
Result:
Name Mon Tue Wed Thu Fri Sat Sun Names Count
1 John Apple Orange Apple Banana Apple Apple Orange Apple 4
2 Ricky Banana Apple Banana Banana Banana Banana Apple Banana 5
3 Alex Apple Orange Orange Apple Apple Orange Orange Orange 4
4 Robbin Apple Apple Apple Apple Apple Banana Banana Apple 5
5 Sunny Banana Banana Apple Apple Apple Banana Banana Banana 4