How can I extract the title from a name in a column?

Posted 2019-09-05 07:05

Question:

I have a column of names of the form "Hobs, Mr. jack", i.e. "lastname, title. firstname". The title is one of four values: "Mr.", "Mrs.", "Miss.", "Master.". How can I search each item in the column and return the title, so that I can store it in another column?

Name <- c("Hobs, Mr. jack","Hobs, Master. John","Hobs, Mrs. Nicole",........)

Desired output: a column "title" with values ("Mr", "Master", "Mrs", .....)

I have tried something like this:

f <- function(d) {
      if (grep("Mr", d$title)) {
                  gsub("$Mr$", "Mr", d$title, ignore.case = T)
           }
 }

no success >.<

Answer 1:

Maybe something like this:

> library(stringr)
> Name <- c("Hobs, Mr. jack","Hobs, Master. John","Hobs, Mrs. Nicole")
> # "Miss" is included so that all four titles from the question are covered
> str_extract(string = Name, pattern = "(Mr|Master|Mrs|Miss)\\.")
[1] "Mr."     "Master." "Mrs."

A fancier regex could exclude the period up front, or you could strip the periods in a second step.
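
Both options can be sketched roughly as follows. This is only an illustration, written in base R so that it runs without extra packages; the "Miss." entry in the sample data is made up, and the variable names are placeholders. The same lookahead pattern should also work as the pattern argument of str_extract.

```r
# Sample names from the question, plus one made-up "Miss." entry for illustration.
Name <- c("Hobs, Mr. jack", "Hobs, Master. John", "Hobs, Mrs. Nicole", "Hobs, Miss. Ann")

# Option 1: the lookahead (?=\\.) requires a period after the title without
# including it in the match. perl = TRUE is needed for lookahead in base R.
m <- regexpr("(Mr|Master|Mrs|Miss)(?=\\.)", Name, perl = TRUE)
title_lookahead <- regmatches(Name, m)  # caution: names with no match are dropped

# Option 2: extract the title with its period, then strip the trailing period.
m2 <- regexpr("(Mr|Master|Mrs|Miss)\\.", Name)
title_two_step <- sub("\\.$", "", regmatches(Name, m2))

print(title_lookahead)
print(title_two_step)
```

Note that regmatches silently drops elements that do not match, so if some names may lack a title, str_extract (which returns NA for those) is the safer choice for filling a column.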



Answer 2:

Assuming the data frame is named df and the names are in the column Name, the following stores the titles in a new column Title.

df$Title <- gsub('(.*, )|(\\..*)', '', df$Name)
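
As a quick check, here is how that line behaves on a small data frame. The data frame below is an assumption built from the names in the question, with one made-up "Miss." entry added. The pattern deletes everything up to and including ", " and everything from the first remaining period onward, so it relies on every name having the "lastname, title. firstname" shape.

```r
# Hypothetical data frame built from the question's sample names.
df <- data.frame(
  Name = c("Hobs, Mr. jack", "Hobs, Master. John", "Hobs, Mrs. Nicole", "Hobs, Miss. Ann"),
  stringsAsFactors = FALSE
)

# Remove "lastname, " and ". firstname", leaving only the title.
df$Title <- gsub('(.*, )|(\\..*)', '', df$Name)
print(df$Title)

# A name with neither ", " nor "." is returned unchanged rather than as NA.
no_title <- gsub('(.*, )|(\\..*)', '', "Cher")
print(no_title)
```

If some rows might not follow the expected shape, it may be worth validating Title against the four known titles afterwards.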