How to add the results of applying a function to a

2020-04-21 05:43发布

I am trying to calculate the confidence intervals of some rates. I am using tidyverse and epitools to calculate CI from Byar's method.

I am almost certainly doing something wrong.

library (tidyverse)
library (epitools)


# here's my made up data

DISEASE = c("Marco Polio","Marco Polio","Marco Polio","Marco Polio","Marco Polio",
            "Mumps","Mumps","Mumps","Mumps","Mumps",
            "Chicky Pox","Chicky Pox","Chicky Pox","Chicky Pox","Chicky Pox")
YEAR = c(2011, 2012, 2013, 2014, 2015,
         2011, 2012, 2013, 2014, 2015,
         2011, 2012, 2013, 2014, 2015)
VALUE = c(82,89,79,51,51,
          79,91,69,89,78,
          71,69,95,61,87)
AREA =c("A", "B","C")

DATA = data.frame(DISEASE, YEAR, VALUE,AREA)


# this is a simplification, I have the population values in another table, which I've merged 
# to give me the dataframe I then apply pois.byar to.
DATA$POPN = ifelse(DATA$AREA == "A",2.5,
              ifelse(DATA$AREA == "B",3,
                     ifelse(DATA$AREA == "C",7,0)))


# this bit calculates the number of things per area
rates<-DATA%>%group_by(DISEASE,AREA,POPN)%>%
  count(AREA)

Then if I want to calculate CI I thought this would work

rates<-DATA%>%group_by(DISEASE,AREA,POPN)%>%
  count(AREA) %>%
  mutate(pois.byar(rates$n,rates$POPN))

but i get

Error in mutate_impl(.data, dots) : 
  Evaluation error: arguments imply differing number of rows: 0, 1.

This however works:

pois.byar(rates$n,rates$POPN)

It seems daft to then say: "turn the results of the pois.byar function into a dataframe and then merge back to the original". I may have tried that just to get some data.... I don't want to do that. It's not the right way to do things.

Any advice gratefully received. I think it's a fairly basic problem. And indicative of my not sitting and learning but trying to do things as I go.

Here's what I want Disease Year n area popn x pt rate lower upper conf.level

1条回答
贼婆χ
2楼-- · 2020-04-21 06:19

It's not clear to me what your expected output is supposed to me. Your comment does not really help. Best to explicitly include your expected output for the sample data you give.

The issue here is that pois.byvar returns a data.frame. So in order for mutate to be able to use the output of pois.byvar we need to store the data.frames in a list.

Here is a tidier version of your code

library(tidyverse)
DATA %>%
    mutate(POPN = case_when(
        AREA == "A" ~ 2.5,
        AREA == "B" ~ 3,
        AREA == "C" ~ 7,
        TRUE ~ 0)) %>%
    group_by(DISEASE,AREA,POPN) %>%
    count(AREA) %>%
    mutate(res = list(pois.byar(n, POPN)))

This creates a column res which contains the data.frame output of pois.byar.

Or perhaps you'd like to unnest the list column to expand entries into different columns?

library(tidyverse)
DATA %>%
    mutate(POPN = case_when(
        AREA == "A" ~ 2.5,
        AREA == "B" ~ 3,
        AREA == "C" ~ 7,
        TRUE ~ 0)) %>%
    group_by(DISEASE,AREA,POPN) %>%
    count(AREA) %>%
    mutate(res = list(pois.byar(n, POPN))) %>%
    unnest()
## A tibble: 9 x 10
## Groups:   DISEASE, AREA, POPN [9]
#  DISEASE     AREA   POPN     n     x    pt  rate  lower upper conf.level
#  <fct>       <fct> <dbl> <int> <int> <dbl> <dbl>  <dbl> <dbl>      <dbl>
#1 Chicky Pox  A       2.5     1     1   2.5 0.4   0.0363 1.86        0.95
#2 Chicky Pox  B       3       2     2   3   0.667 0.133  2.14        0.95
#3 Chicky Pox  C       7       2     2   7   0.286 0.0570 0.916       0.95
#4 Marco Polio A       2.5     2     2   2.5 0.8   0.160  2.56        0.95
#5 Marco Polio B       3       2     2   3   0.667 0.133  2.14        0.95
#6 Marco Polio C       7       1     1   7   0.143 0.0130 0.666       0.95
#7 Mumps       A       2.5     2     2   2.5 0.8   0.160  2.56        0.95
#8 Mumps       B       3       1     1   3   0.333 0.0302 1.55        0.95
#9 Mumps       C       7       2     2   7   0.286 0.0570 0.916       0.95
查看更多
登录 后发表回答