string - Get the strings before the comma with R

Question

I am a beginner with R. I have a column in a data.frame like this:

city
Kirkland,
Bethesda,
Wellington,
La Jolla,
Berkeley,
Costa, Evie KW172NJ
Miami,
Plano,
Sacramento,
Middletown,
Webster,
Houston,
Denver,
Kirkland,
Pinecrest,
Tarzana,
Boulder,
Westfield,
Fair Haven,
Royal Palm Beach, Fl
Westport,
Encino,
Oak Ridge,

I want to clean it so that only the city name before the comma is kept. How can I get that result in R? Thanks!

score 21 · Accepted Answer

You can use gsub with a bit of regexp:

cities <- gsub("^(.*?),.*", "\\1", df$city)

This works too:

cities <- gsub(",.*$", "", df$city)
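As a quick check, here is a minimal sketch that applies both patterns to a few values taken from the question. The data frame `df` is rebuilt by hand here, which is an assumption about how the real data is stored.

```r
# Hypothetical reconstruction of part of the question's data frame
df <- data.frame(
  city = c("Kirkland,", "La Jolla,", "Costa, Evie KW172NJ", "Royal Palm Beach, Fl"),
  stringsAsFactors = FALSE
)

# Pattern 1: capture everything up to the first comma, lazily
cities_a <- gsub("^(.*?),.*", "\\1", df$city)

# Pattern 2: delete from the first comma to the end of the string
cities_b <- gsub(",.*$", "", df$city)

print(cities_a)
print(identical(cities_a, cities_b))
```

Each pattern can match at most once per string, so `sub` should give the same result as `gsub` here.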

score 4 · Accepted Answer

Just for fun, you can use strsplit:

> x <- c("London, UK", "Paris, France", "New York, USA")
> sapply(strsplit(x, ","), "[", 1)
[1] "London"   "Paris"    "New York"
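The same idea can be sketched on values from the question, including entries that end in a bare comma. This variant uses `vapply` instead of `sapply` so the result is guaranteed to be a character vector, and `fixed = TRUE` so the comma is treated literally; both are stylistic choices, not part of the original answer.

```r
# A few values copied from the question
x <- c("Kirkland,", "Costa, Evie KW172NJ", "Royal Palm Beach, Fl")

# Split on the literal comma, then keep the first piece of each element
pieces <- strsplit(x, ",", fixed = TRUE)
first <- vapply(pieces, `[`, character(1), 1)

print(first)
```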

score 4 · Accepted Answer

You can use regexpr to find the position of the first comma in each element, and substr to cut each string off just before it:

x <- c("London, UK", "Paris, France", "New York, USA")

substr(x,1,regexpr(",",x)-1)
[1] "London"   "Paris"    "New York"
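One caveat with this approach: `regexpr` returns -1 for an element that contains no comma, so `substr(x, 1, -2)` yields an empty string for it. A minimal sketch of one way to guard against that follows; the input vector with a comma-free element is made up for illustration.

```r
# Hypothetical input where the second element has no comma
x <- c("London, UK", "Paris", "New York, USA")

# Position of the first comma; as.integer() drops the match attributes
pos <- as.integer(regexpr(",", x, fixed = TRUE))

# Cut before the comma where one exists, otherwise keep the string unchanged
before <- ifelse(pos > 0, substr(x, 1, pos - 1), x)

print(before)
```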

score 2 · Accepted Answer

This also works:

x <- c("London, UK", "Paris, France", "New York, USA")

library(qdap)
beg2char(x, ",")

## > beg2char(x, ",")
## [1] "London"   "Paris"    "New York"

score 2 · Accepted Answer

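If this is a column in a data frame, we can use the tidyverse.

library(dplyr)
library(tidyr)  # separate() comes from tidyr, not dplyr
x <- c("London, UK", "Paris, France", "New York, USA")
x <- as.data.frame(x)
# Split column x at the comma into columns A and B
# (B keeps the space that followed the comma)
x %>% separate(x, c("A","B"), sep = ',')
        A       B
1   London      UK
2    Paris  France
3 New York     USA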
