Replacing data values based on grep result in R

Posted 2020-07-18 03:17

Question:

I have a data frame. One of the columns has values like:

WIND
WINDS
HIGH WIND
etc

among other values. I want to replace every value that contains some variation of "WIND" with "WIND". I know how to find the values I need to replace:

grep("WIND", df$col1)

but not how to replace those values. Thanks.

Answer 1:

You can subset the original column with grepl and assign the replacement to the matching elements:

df$col1[grepl("WIND", df$col1)] <- "WIND"
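
As a minimal sketch of how this plays out on a whole column, the snippet below builds a small data frame and applies the same assignment. The sample values and the helper name is_wind are made up for illustration; only df and col1 come from the question.

```r
# Hypothetical example data; the column name col1 follows the question.
df <- data.frame(
  col1 = c("WIND", "WINDS", "HIGH WIND", "RAIN", "HAIL"),
  stringsAsFactors = FALSE
)

# grepl() returns one TRUE/FALSE per element of the column.
is_wind <- grepl("WIND", df$col1)

# Assign only to the rows where the pattern was found.
df$col1[is_wind] <- "WIND"

print(df$col1)
```

Rows that do not contain "WIND" are left untouched, because the logical vector selects only the matching positions.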


Answer 2:

UPDATE: I misspoke. agrep doesn't add anything over grep here, so you can simply replace agrep with grep. agrep does help when some words have roots that vary slightly but you still want to match them.

Here is an approach using agrep:

> wind.vec
[1] "WINDS"      "HIGH WIND"  "WINDY"      "VERY WINDY"
> wind.vec[agrep("WIND", wind.vec)] <- "WIND"
> wind.vec
[1] "WIND" "WIND" "WIND" "WIND"

The nice thing about agrep is that it matches approximately, so "WINDY" is replaced (although, as the update notes, plain grep also matches it, since "WINDY" contains "WIND"). Note that I'm doing this with a vector, but you can easily extend it to a data frame by replacing wind.vec with my.data.frame$my.wind.col.

agrep returns the indices that match approximately, which then allows me to use the [<- replacement operator to replace the approximately matching values with "WIND".
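
To illustrate the case where agrep and grep actually differ, here is a rough sketch using a misspelled value. The vector contents, the typo "WIMD", and the choice of max.distance = 1 are assumptions made for this example, not part of the original answer.

```r
# Hypothetical values; "WIMD" is a deliberate typo for "WIND".
wind.vec <- c("WINDS", "HIGH WIND", "WIMD", "RAIN")

# Exact substring matching: the typo does not contain "WIND".
exact_hits <- grep("WIND", wind.vec)

# Approximate matching; max.distance = 1 allows a single edit
# (an assumption chosen for this example).
fuzzy_hits <- agrep("WIND", wind.vec, max.distance = 1)

# Replace everything agrep found, including the misspelled value.
wind.vec[fuzzy_hits] <- "WIND"
print(wind.vec)
```

Approximate matching can also pick up unrelated words that happen to be close to the pattern, so it is worth inspecting the matched values before overwriting them.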



Tags: r grep