purrr pmap to read max column name by column name

Posted 2019-08-23 02:27

Question:

I have this dataset:

library(dplyr)
Problem <- tibble(name = c("Angela", "Claire", "Justin", "Bob", "Gil"),
                   status_1 = c("Registered", "No Action", "Completed", "Denied", "No Action"),
                   status_2 = c("Withdrawn", "No Action", "Registered", "No Action", "Exempt"),
                   status_3 = c("No Action", "Registered", "Withdrawn", "No Action", "No Action"))

I want to make a column that has everyone's current status. If the person has ever completed the course, they are completed. If they were ever exempt, they are excluded. If they are anything else other than registered (or completed or exempt), they are "Not Taken." What's hard is that I want my code to say they were registered ONLY if their last action was being registered. So, it should look like this:

library(dplyr)
solution <- tibble(name = c("Angela", "Claire", "Justin", "Bob", "Gil"),
                   status_1 = c("Registered", "No Action", "Completed", "Denied", "No Action"),
                   status_2 = c("Withdrawn", "No Action", "Registered", "No Action", "Exempt"),
                   status_3 = c("No Action", "Registered", "Withdrawn", "No Action", "No Action"),
                   current = c("Not Taken", "Registered", "Completed", "Not Taken", "Exempt"))

I have this code, and the part that won't work is the which.max() line:

library(dplyr)
library(purrr)
library(stringr)
Problem %>% 
  mutate(
    current =
      pmap_chr(select(., contains("status")), ~
        case_when(
          any(str_detect(c(...), "(?i)Completed")) ~ "Completed",
          any(str_detect(c(...), "(?i)Exempt")) | any(str_detect(c(...), "(?i)Incomplete")) ~ "Exclude",
          which.max(parse_number(colnames(.)) == "Registered") ~ "Registered",
          any(str_detect(c(...), "(?i)No Show")) | any(str_detect(c(...), "(?i)Denied")) | any(str_detect(c(...), "(?i)Cancelled")) | any(str_detect(c(...), "(?i)Waitlist Expired")) || any(str_detect(c(...), "(?i)Withdrawn")) ~ "Not Taken",
          TRUE ~ "NA"
        )
      )
  )

I've tried every way for R to read the numbers of status, but I can't figure it out. It's important that I keep the rest of the code, especially the str_detect() portion because, while my sample data is clean, the real dataset has many rows of status and many entries that look like "COMPLETED" and "completed."

Why can't I use parse_number() on the column names inside pmap to pick out the highest-numbered status?

Thank you!

Answer 1:

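Keeping everything as it is and dealing only with your which.max() issue, we can check whether the first "Registered" in the row sits in the last status column by comparing which.max(c(...) == "Registered") with length(c(...)):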

library(tidyverse)

Problem %>% 
    mutate(
       current =
         pmap_chr(select(., contains("status")), ~
             case_when(
               any(str_detect(c(...), "(?i)Completed")) ~ "Completed",
               any(str_detect(c(...), "(?i)Exempt")) | any(str_detect(c(...), "(?i)Incomplete")) ~ "Exclude",
               which.max(c(...) == "Registered") == length(c(...)) ~ "Registered",
               any(str_detect(c(...), "(?i)No Show")) | any(str_detect(c(...), "(?i)Denied")) | any(str_detect(c(...), "(?i)Cancelled")) | any(str_detect(c(...), "(?i)Waitlist Expired")) || any(str_detect(c(...), "(?i)Withdrawn")) ~ "Not Taken",
               TRUE ~ "NA"
             )
            )
       )

#   name   status_1   status_2   status_3   current
#   <chr>  <chr>      <chr>      <chr>      <chr>
# 1 Angela Registered Withdrawn  No Action  Not Taken
# 2 Claire No Action  No Action  Registered Registered
# 3 Justin Completed  Registered Withdrawn  Completed
# 4 Bob    Denied     No Action  No Action  Not Taken
# 5 Gil    No Action  Exempt     No Action  Exclude
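
To see what the which.max() comparison does on its own, here is a small base R sketch on plain character vectors. The helper names and the repeat_reg row are made up for illustration and are not part of your data. Note that which.max() returns the position of the first TRUE, so a row where "Registered" appears both early and last would not count as registered. If that can happen in the real data, one possible alternative (an assumption on my part, not something taken from your code) is to test only the last entry, ignoring case:

```r
# Hypothetical helpers for illustration; base R only.

# The check used in the answer: the first "Registered" must be in the last position.
last_is_registered_whichmax <- function(x) {
  which.max(x == "Registered") == length(x)
}

# Assumed alternative: look only at the last entry, case-insensitively.
last_is_registered_tail <- function(x) {
  grepl("^registered$", tail(x, 1), ignore.case = TRUE)
}

claire     <- c("No Action", "No Action", "Registered")
angela     <- c("Registered", "Withdrawn", "No Action")
repeat_reg <- c("Registered", "Withdrawn", "Registered")  # made-up row, not in the data

last_is_registered_whichmax(claire)      # TRUE
last_is_registered_whichmax(angela)      # FALSE
last_is_registered_whichmax(repeat_reg)  # FALSE, which.max() stops at the first TRUE
last_is_registered_tail(repeat_reg)      # TRUE
last_is_registered_tail(c("No Action", "REGISTERED"))  # TRUE
```

Inside the pmap_chr() call, the second helper's body would become grepl("^registered$", tail(c(...), 1), ignore.case = TRUE) in place of the which.max() condition.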