Question:

I'm trying to do something but can't remember/find the answer. I have a list of city names from the Census Bureau and they put the city's type on the end which is messing up my match().

I'd like to make this:

Middletown Township
Sunny Valley Borough
Hillside Village

into this:

Middletown
Sunny Valley
Hillside

Any suggestions? Ideally I'd also like to know if there's a lastIndexOf() function in R.

Here's the dput:

> dput(df1)
structure(list(id = c(1, 2, 3), city = structure(c(2L, 3L, 1L
), .Label = c("Hillside Village", "Middletown Township", "Sunny Valley Borough"
), class = "factor")), .Names = c("id", "city"), row.names = c(NA, 
-3L), class = "data.frame")

Answer 1:

This will work:

gsub("\\s*\\w*$", "", df1$city)
[1] "Middletown"   "Sunny Valley" "Hillside"

It removes any substring consisting of zero or more whitespace characters, followed by zero or more "word" characters (letters, digits, or underscores), anchored at the end of the string.
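One thing to watch for: because both quantifiers are `*`, the pattern also matches a city name that has no type suffix at all, and the whole name gets removed. Here's a small sketch of that edge case. The `cities` vector is made up for illustration, and the stricter `\\s+\\w+$` variant is my own suggestion rather than part of the answer above.

```r
# Made-up input: the last entry has no type suffix
cities <- c("Middletown Township", "Sunny Valley Borough", "Hillside")

# Pattern from the answer: a single-word name matches entirely and is blanked
greedy <- gsub("\\s*\\w*$", "", cities)

# Stricter variant (an assumption, not from the answer): require at least
# one whitespace character and one word character before the end
strict <- sub("\\s+\\w+$", "", cities)

print(greedy)
print(strict)
```

With the stricter variant, single-word names are left alone, which is probably what you want before calling match().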

Answer 2:

Here's a regexp that does what you need:

sub(df1$city, pattern = " [[:alpha:]]*$", replacement = "")

[1] "Middletown"   "Sunny Valley" "Hillside"

That's replacing a substring that starts with a space, then contains only letters until the end of the string, with an empty string.
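On the lastIndexOf() part of the question: as far as I know, base R has no function by that name, but gregexpr() returns every match position, so the last one is easy to pick out. A minimal sketch follows. The helper `last_index_of` is a hypothetical name I'm introducing here, not an existing R function.

```r
# Hypothetical helper, not part of base R: 1-based position of the last
# occurrence of the fixed substring `pattern` in each element of `x`,
# or -1 when it does not occur.
last_index_of <- function(x, pattern) {
  matches <- gregexpr(pattern, as.character(x), fixed = TRUE)
  vapply(matches, function(m) as.integer(max(m)), integer(1))
}

cities <- c("Middletown Township", "Sunny Valley Borough", "Hillside")
pos <- last_index_of(cities, " ")

# Keep everything before the last space; leave names without a space as-is
trimmed <- ifelse(pos > 0, substr(cities, 1, pos - 1), cities)
print(trimmed)
```

Since the helper calls as.character(), it should also accept the factor column df1$city directly.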
