I have a vector of numbers. Let's call it mydata:
str(mydata)
# num [1:236] 2 1 1 2 2 1 2 1 2 2 ...
I can then count each value using table:
table(mydata)
# mydata
# 1 2 9 10
# 20 200 14 2
Now, I want to select the value with the highest count (in this case, "2").
I can find the highest count (200 in this case) with the max function: max(table(mydata)). But how do I get the name associated with the max count in the table, i.e. "2"?
A one-dimensional table behaves much like a named vector: it has values and names (stored as attributes) that are accessible through vector subsetting.
> mydata <- c(rep(1, 20), rep(2, 200), rep(9, 14), rep(10, 2))
> tab <- table(mydata)
> tab
## mydata
## 1 2 9 10
## 20 200 14 2
> names(tab)
## [1] "1" "2" "9" "10"
> c(val = names(tab)[tab == max(tab)], freq = max(tab))
## val freq
## "2" "200"
The following are equivalent:
> tab[ names(tab)[tab == max(tab)] ]
## 2
## 200
> tab["2"]
## 2
## 200
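One detail worth checking is what happens when two values share the highest count. The sketch below uses made-up data (the vector `tied` is hypothetical, not from the question) to compare the `tab == max(tab)` comparison with base R's `which.max()`: the comparison keeps every tied value, while `which.max()` reports only the first.

```r
# Hypothetical data in which 2 and 9 are tied for the highest count.
tied <- c(rep(1, 3), rep(2, 5), rep(9, 5))
tab <- table(tied)

# Logical subsetting keeps every name whose count equals the maximum.
all_modes <- names(tab)[tab == max(tab)]

# which.max() returns the position of the first maximum only.
first_mode <- names(tab)[which.max(tab)]

print(all_modes)   # both tied values, as strings
print(first_mode)  # only the first tied value
```

Which behaviour is preferable depends on whether ties matter for the analysis at hand.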
Other useful things to know about an object are described in its attributes:
> attributes(tab)
## $dim
## [1] 4
## $dimnames
## $dimnames$mydata
## [1] "1" "2" "9" "10"
## $class
## [1] "table"
I'd probably do this
tab <- table(mydata)
names(tab)[which.max(tab)]
That will return "2" as a string. You can wrap it in as.numeric() if you want to get it back to a number. This one-liner is a bit uglier and probably less efficient, but hey, it's one line:
sapply(list(table(mydata)), function(x) names(x[which.max(x)]))
or maybe
with(as.data.frame(table(mydata)), mydata[which.max(Freq)])
which will actually return a factor with a value of "2". If you want to make that numeric, you need to do as.numeric(as.character(x)). I was just trying to find ways to avoid having a table variable lying around if I really didn't need it. I wish there were an easier way to convert a table to a named vector.
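To illustrate why the detour through as.character() matters, here is a minimal sketch with made-up data (the vector `x` is hypothetical, chosen so that the most frequent value does not coincide with its position among the factor levels). It also shows one base R way to flatten a table into a plain named vector, using setNames() and as.vector(); this is just one possible approach, not necessarily the most convenient one.

```r
# Hypothetical data: 9 is the most frequent value.
x <- c(rep(1, 2), rep(9, 5), rep(10, 3))

df  <- as.data.frame(table(x))    # columns: x (a factor) and Freq
top <- df$x[which.max(df$Freq)]   # factor whose label is "9"

# Converting the factor directly yields its internal level code,
# i.e. the position of "9" among the levels "1", "9", "10".
code <- as.numeric(top)

# Converting the label first recovers the original number.
value <- as.numeric(as.character(top))

# Flatten the table into an ordinary named vector of counts.
tab <- table(x)
counts <- setNames(as.vector(tab), names(tab))

print(code)    # level code, not the data value
print(value)   # the data value
print(counts)
```

Here `code` is 2 while `value` is 9, which is the pitfall the as.character() step avoids.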