Does R have function startswith or endswith like p

Posted 2019-01-11 15:43

Question:

I was looking for predictors whose names start with a given substring, but could not find a function that does this.

Answer 1:

As of R 3.3.0, base R includes startsWith (and endsWith), which do exactly this.

> startsWith("what", "wha")
[1] TRUE
> startsWith("what", "ha")
[1] FALSE

https://stat.ethz.ch/R-manual/R-devel/library/base/html/startsWith.html
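
Both functions are vectorized, so they can be applied directly to a vector of names, which is closer to the use case in the question. A minimal sketch, assuming R >= 3.3.0; the predictor names are invented for illustration:

```r
# Hypothetical predictor names, made up for this example
predictors <- c("age", "income_2018", "income_2019", "height_cm")

# startsWith() and endsWith() return one logical value per element
is_income <- startsWith(predictors, "income")
is_2019 <- endsWith(predictors, "2019")

# Use the logical vector to keep only the matching names
income_predictors <- predictors[is_income]
print(income_predictors)
```

Note that both functions compare literal strings, not regular expressions.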



Answer 2:

Not inbuilt like that.

Options include grepl and substr.

x <- 'ABCDE'
grepl('^AB', x) # starts with AB?
grepl('DE$', x) # ends with DE?
substr(x, 1, 2) == 'AB' # starts with AB, without a regex
substr(x, nchar(x) - 1, nchar(x)) == 'DE' # ends with DE, without a regex
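
One caveat with the grepl approach is that the pattern is a regular expression, so characters such as "." need escaping if they are meant literally. A small sketch of the difference, using made-up strings and, for comparison, base startsWith (which assumes R >= 3.3.0):

```r
# "." in a regex matches any single character
regex_unescaped <- grepl("^a.b", "axb_total")

# Escaping the dot makes the regex match a literal "."
regex_escaped <- grepl("^a\\.b", "axb_total")

# startsWith() always compares literally
literal_miss <- startsWith("axb_total", "a.b")
literal_hit <- startsWith("a.b_total", "a.b")

print(c(regex_unescaped, regex_escaped, literal_miss, literal_hit))
```

The substr comparison above is also literal, so it is not affected by this.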


Answer 3:

The dplyr package's select function supports starts_with and ends_with. For example, this selects the columns of the iris data frame whose names start with Petal:

library(dplyr)
select(iris, starts_with("Petal"))

select supports other selection helpers too. Try ?select.



Answer 4:

The simplest way I can think of is to use the %like% operator:

library(data.table)

"foo" %like% "^f" 

evaluates as TRUE - Starting with f

"foo" %like% "o$" 

evaluates as TRUE - Ending with o

"bar" %like% "a"

evaluates as TRUE - Containing a



Answer 5:

Borrowing some code from the dplyr package, you could do something like this:

starts_with <- function(vars, match, ignore.case = TRUE) {
  if (ignore.case) match <- tolower(match)
  n <- nchar(match)

  if (ignore.case) vars <- tolower(vars)
  substr(vars, 1, n) == match
}

ends_with <- function(vars, match, ignore.case = TRUE) {
  if (ignore.case) match <- tolower(match)
  n <- nchar(match)

  if (ignore.case) vars <- tolower(vars)
  length <- nchar(vars)

  substr(vars, pmax(1, length - n + 1), length) == match
}
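
If base startsWith and endsWith are available (R >= 3.3.0), a shorter case-insensitive variant can be sketched as below. The names starts_with_ci and ends_with_ci are hypothetical and not part of dplyr or base R:

```r
# Hypothetical helpers: lower-case both sides, then compare literally
starts_with_ci <- function(vars, match) {
  startsWith(tolower(vars), tolower(match))
}

ends_with_ci <- function(vars, match) {
  endsWith(tolower(vars), tolower(match))
}

vars <- c("Petal.Length", "Petal.Width", "Sepal.Length")
starts_result <- starts_with_ci(vars, "petal")
ends_result <- ends_with_ci(vars, "LENGTH")
print(starts_result)
print(ends_result)
```

Unlike the functions above, this sketch has no ignore.case switch; it always ignores case.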


Answer 6:

This is relatively simple by using the substring function:

> strings = c("abc", "bcd", "def", "ghi", "xyzzd", "a")
> str_to_find = "de"
> substring(strings, 1, nchar(str_to_find)) == str_to_find
[1] FALSE FALSE  TRUE FALSE FALSE FALSE

substring cuts each string down to the desired length, which is the number of characters in the prefix you are looking for, and the result is then compared with that prefix.
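
The same idea can be sketched for the end of each string by starting the cut nchar(suffix) characters before the end. The variable suffix is an assumption introduced for this example:

```r
strings <- c("abc", "bcd", "def", "ghi", "xyzzd", "a")
suffix <- "zd"

# Start position of the last nchar(suffix) characters of each string;
# substring() treats start positions below 1 as 1, so short strings are safe
start <- nchar(strings) - nchar(suffix) + 1
ends_with_suffix <- substring(strings, start) == suffix
print(ends_with_suffix)
```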