I'm trying to count the frequency of a specific value in every column.
Basically, I am looking at how different bacterial isolates (represented by each row) respond to treatment with different antibiotics (represented by each column). "1" means the isolate is resistant to the antibiotic, while "0" means the isolate is susceptible to the antibiotic.
antibiotic1 <- c(1, 1, 0, 1, 0, 1, NA, 0, 1)
antibiotic2 <- c(0, 0, NA, 0, 1, 1, 0, 0, 0)
antibiotic3 <- c(0, 1, 1, 0, 0, NA, 1, 0, 0)
ab <- data.frame(antibiotic1, antibiotic2, antibiotic3)
ab
  antibiotic1 antibiotic2 antibiotic3
1           1           0           0
2           1           0           1
3           0          NA           1
4           1           0           0
5           0           1           0
6           1           1          NA
7          NA           0           1
8           0           0           0
9           1           0           0
So looking at the first row, isolate 1 is resistant to antibiotic 1, susceptible to antibiotic 2, and susceptible to antibiotic 3.
I want to calculate the percentage of isolates resistant to each antibiotic, i.e. sum the number of "1"s in each column and divide by the number of isolates tested in that column (excluding NAs from the denominator).
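To make the target number concrete, here is the arithmetic for a single column, written out by hand in base R. The variable names `resistant` and `tested` are just illustrative:

```r
# antibiotic1 from the example data
antibiotic1 <- c(1, 1, 0, 1, 0, 1, NA, 0, 1)

# numerator: isolates coded 1, ignoring the missing value
resistant <- sum(antibiotic1 == 1, na.rm = TRUE)

# denominator: isolates with a non-missing result
tested <- sum(!is.na(antibiotic1))

# percentage resistant for this one antibiotic
pct_resistant <- 100 * resistant / tested
pct_resistant  # 62.5 (5 resistant out of 8 tested)
```

That is the value I would like to get for every column at once.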
I know how to get counts:
library(plyr)  # provides count() and ldply()
apply(ab, 2, count)
$antibiotic1
   x freq
1  0    3
2  1    5
3 NA    1

$antibiotic2
   x freq
1  0    6
2  1    2
3 NA    1

$antibiotic3
   x freq
1  0    5
2  1    3
3 NA    1
But my actual dataset contains many different antibiotics and hundreds of isolates, so I want to be able to run a function across all columns at the same time to yield a dataframe.
I've tried
counts <- ldply(ab, function(x) sum(x=="1")/(sum(x=="1") + sum(x=="0")))
but that yields NAs:
          .id V1
1 antibiotic1 NA
2 antibiotic2 NA
3 antibiotic3 NA
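A likely explanation for the NAs, shown on one column as a small sketch: comparing a missing value with `==` gives `NA` rather than `FALSE`, and `sum()` returns `NA` as soon as any element is `NA` unless `na.rm = TRUE` is set.

```r
# antibiotic1 from the example data
x <- c(1, 1, 0, 1, 0, 1, NA, 0, 1)

# the comparison keeps the missing entry as NA
is_resistant <- x == "1"
anyNA(is_resistant)  # TRUE

# without na.rm the whole sum is NA
sum_default <- sum(x == "1")

# with na.rm = TRUE the missing entry is dropped before summing
n_resistant   <- sum(x == "1", na.rm = TRUE)
n_susceptible <- sum(x == "0", na.rm = TRUE)

n_resistant / (n_resistant + n_susceptible)  # 0.625
```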
I've also tried:
library(dplyr)
ab %>%
summarise_each(n = n()) %>%
mutate(prop.resis = n/sum(n))
but get an error message that reads:
Error in n() : This function should not be called directly
Any advice would be much appreciated.
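One possible approach, sketched in base R only and assuming every column is coded strictly as 0, 1 or NA: `ab == 1` gives a logical matrix, and `colMeans()` with `na.rm = TRUE` then returns the proportion of 1s among the non-missing entries of each column. The names `prop_resistant`, `result` and `pct_resistant` are illustrative, not from any package.

```r
# same toy data as in the question
ab <- data.frame(
  antibiotic1 = c(1, 1, 0, 1, 0, 1, NA, 0, 1),
  antibiotic2 = c(0, 0, NA, 0, 1, 1, 0, 0, 0),
  antibiotic3 = c(0, 1, 1, 0, 0, NA, 1, 0, 0)
)

# proportion of TRUEs per column; na.rm = TRUE removes missing isolates
# from both the numerator and the denominator
prop_resistant <- colMeans(ab == 1, na.rm = TRUE)

# one row per antibiotic, expressed as a percentage
result <- data.frame(
  antibiotic    = names(prop_resistant),
  pct_resistant = 100 * unname(prop_resistant)
)
result
# antibiotic1: 62.5, antibiotic2: 25, antibiotic3: 37.5
```

Because this works on the whole data frame at once, it should scale to many antibiotics and hundreds of isolates without any per-column code. If some columns can hold values other than 0 and 1, the denominator would need to be restricted explicitly, as in the original `sum(x == 1) / (sum(x == 1) + sum(x == 0))` formula with `na.rm = TRUE` added to each `sum()`.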