library(tidyr)
library(dplyr)
library(tidyverse)
Below is the code for a simple data frame. My real data was exported in a messy format: the levels of each categorical variable are spread across separate columns.
Client<-c("Client1","Client2","Client3","Client4","Client5")
Sex_M<-c("Male","NA","Male","NA","Male")
Sex_F<-c(" ","Female"," ","Female"," ")
Satisfaction_Satisfied<-c("Satisfied"," "," ","Satisfied","Satisfied")
Satisfaction_VerySatisfied<-c(" ","VerySatisfied","VerySatisfied"," "," ")
CommunicationType_Email<-c("Email"," "," ","Email","Email")
CommunicationType_Phone<-c(" ","Phone ","Phone "," "," ")
DF<-tibble(Client,Sex_M,Sex_F,Satisfaction_Satisfied,Satisfaction_VerySatisfied,CommunicationType_Email,CommunicationType_Phone)
I want to recombine the categories into single columns using tidyr's "unite".
DF <- DF %>%
  unite(Sat, Satisfaction_Satisfied, Satisfaction_VerySatisfied, sep = " ") %>%
  unite(Sex, Sex_M, Sex_F, sep = " ")
However, I have to write a separate "unite" call for each group of columns, which I feel violates the rule of three. There must be an easier way, especially since my real data contains dozens of columns that need to be combined. Is there a way to call "unite" once and match on column names, so that all columns sharing a prefix (for example, "Sex" for "Sex_M" and "Sex_F", or "CommunicationType" for "CommunicationType_Email" and "CommunicationType_Phone") are combined in the same way as above?
I also considered writing a function that takes the column names as arguments, but that is too difficult for me because it involves non-standard evaluation.
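
To make the goal concrete, here is a rough sketch of the kind of thing I am after. It is only one possible direction, not necessarily the idiomatic one. It assumes that every column to be combined is named "<Category>_<Level>", so the part before the first underscore identifies the group. The names `to_combine`, `prefixes`, and `combined` are made up for illustration.

```r
library(tibble)
library(tidyr)
library(purrr)

# Same toy data as above.
DF <- tibble(
  Client = c("Client1", "Client2", "Client3", "Client4", "Client5"),
  Sex_M = c("Male", "NA", "Male", "NA", "Male"),
  Sex_F = c(" ", "Female", " ", "Female", " "),
  Satisfaction_Satisfied = c("Satisfied", " ", " ", "Satisfied", "Satisfied"),
  Satisfaction_VerySatisfied = c(" ", "VerySatisfied", "VerySatisfied", " ", " "),
  CommunicationType_Email = c("Email", " ", " ", "Email", "Email"),
  CommunicationType_Phone = c(" ", "Phone ", "Phone ", " ", " ")
)

# Assumption: only the columns to combine contain an underscore, and the
# group name is everything before the first underscore.
to_combine <- grep("_", names(DF), value = TRUE)
prefixes <- unique(sub("_.*$", "", to_combine))

# Run one unite() per prefix, feeding each result into the next call.
# `!!p` injects the prefix string as the name of the new column.
combined <- reduce(
  prefixes,
  function(df, p) unite(df, !!p, starts_with(paste0(p, "_")), sep = " "),
  .init = DF
)

names(combined)
```

The combined values still carry the blank placeholders and the literal "NA" strings from the export, so a separate trimming step would be needed afterwards.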