Get the strings before the comma with R

Posted 2020-02-26 05:56

I am a beginner with R. I have a column in a data.frame like this:

city
Kirkland,
Bethesda,
Wellington,
La Jolla,
Berkeley,
Costa, Evie KW172NJ
Miami,
Plano,
Sacramento,
Middletown,
Webster,
Houston,
Denver,
Kirkland,
Pinecrest,
Tarzana,
Boulder,
Westfield,
Fair Haven,
Royal Palm Beach, Fl
Westport,
Encino,
Oak Ridge,

I want to clean it up so that only the city name before the comma is kept. How can I get this result in R? Thanks!

5 answers
chillily
#2 · 2020-02-26 06:25

This works as well:

x <- c("London, UK", "Paris, France", "New York, USA")

library(qdap)
beg2char(x, ",")

## > beg2char(x, ",")
## [1] "London"   "Paris"    "New York"
我只想做你的唯一
#3 · 2020-02-26 06:43

If this were a column in a data frame, we could use the tidyverse.

library(dplyr)  # provides the %>% pipe
library(tidyr)  # provides separate()
x <- c("London, UK", "Paris, France", "New York, USA")
x <- as.data.frame(x)
x %>% separate(x, c("A","B"), sep = ',')
         A       B
1   London      UK
2    Paris  France
3 New York     USA
祖国的老花朵
#4 · 2020-02-26 06:44

Just for fun, you can use strsplit

> x <- c("London, UK", "Paris, France", "New York, USA")
> sapply(strsplit(x, ","), "[", 1)
[1] "London"   "Paris"    "New York"
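
To connect this to the data in the question, here is a minimal sketch of the same strsplit idea applied to a few of the asker's values. The vector name `city` and the other variable names are my own, and `vapply` is swapped in for `sapply` only to guarantee a character result, so treat this as an illustration rather than the exact method above.

```r
# A few values copied from the question (the vector name is an assumption)
city <- c("Kirkland,", "La Jolla,", "Costa, Evie KW172NJ", "Royal Palm Beach, Fl")

# Split each string on literal commas; a trailing comma leaves no empty piece
parts <- strsplit(city, ",", fixed = TRUE)

# Keep the first piece of each split; vapply enforces one string per element
first <- vapply(parts, function(p) p[1], character(1))

# Remove any stray whitespace around the city name
first <- trimws(first)

print(first)
```

Multi-word names such as "La Jolla" stay intact because the split is on commas only, not on spaces.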
手持菜刀,她持情操
#5 · 2020-02-26 06:48

You can use gsub with a bit of regexp:

cities <- gsub("^(.*?),.*", "\\1", df$city)

This one works, too:

cities <- gsub(",.*$", "", df$city)
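
As a rough check of the two patterns, here is a sketch that runs them on a few values from the question. The data frame `df` is rebuilt by hand, which is an assumption about how the asker's data is stored, and `sub` is used in place of `gsub` because only one match per string is needed.

```r
# Hypothetical reconstruction of the asker's data frame
df <- data.frame(
  city = c("Kirkland,", "La Jolla,", "Costa, Evie KW172NJ", "Royal Palm Beach, Fl"),
  stringsAsFactors = FALSE
)

# Pattern 1: lazily capture everything up to the first comma, discard the rest
cities1 <- sub("^(.*?),.*", "\\1", df$city)

# Pattern 2: delete the first comma and everything after it
cities2 <- sub(",.*$", "", df$city)

print(cities1)
print(identical(cities1, cities2))
```

The second pattern is arguably easier to read, since it needs neither a capture group nor a lazy quantifier.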
相关推荐>>
#6 · 2020-02-26 06:50

You could use regexpr to find the position of the first comma in each element and use substr to cut each string off just before that position:

x <- c("London, UK", "Paris, France", "New York, USA")

substr(x,1,regexpr(",",x)-1)
[1] "London"   "Paris"    "New York"
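
One caveat with this approach: if an element contains no comma, regexpr returns -1 and substr then yields an empty string. The sketch below shows one possible guard. The value "Berlin" and the variable names are made up for illustration.

```r
# "Berlin" is a made-up element with no comma
x <- c("London, UK", "Paris, France", "Berlin")

# Position of the first literal comma; -1 where none is found
pos <- as.integer(regexpr(",", x, fixed = TRUE))

# Unguarded version: the comma-free element collapses to ""
naive <- substr(x, 1, pos - 1)

# Guarded version: keep the whole string when there is no comma
safe <- ifelse(pos > 0, substr(x, 1, pos - 1), x)

print(naive)
print(safe)
```

All the values in the question do contain a comma, so the guard only matters if the column is less regular than the sample shown.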