The question seems trivial, but I cannot figure out why it isn't working. I simply want to replace a character value containing a "+" with the same value without the "+". For some reason, gsub() and sub() replace the number but keep the "+". Any hint on how this can be overcome?
Many thanks!
data <- c(1,2,3,4,"5+")
gsub(pattern="5+",replacement="5",x=data)
#[1] "1" "2" "3" "4" "5+"
gsub(pattern="5+",replacement="",x=data)
#[1] "1" "2" "3" "4" "+"
R 3.0.2
+ is a metacharacter, and needs to be escaped when you want to match it:
gsub(pattern="5\\+",replacement="5",x=data)
#[1] "1" "2" "3" "4" "5"
Or more generally, if you want to remove the +:
gsub(pattern="\\+",replacement="",x=data)
If unescaped, + means "The preceding item will be matched one or more times", so in your second example, the "5" element of "5+" is matched by the pattern, and replaced by "", leaving you with "+".
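To make the quantifier behaviour concrete, here is a small sketch comparing the unescaped and escaped patterns. The test vector x and the replacement "X" are made up for illustration and are not part of the original question.

```r
# Hypothetical test vector, chosen to show what each pattern matches.
x <- c("5", "55", "5+", "+")

# Unescaped: "5+" means "one or more 5s", so the literal plus sign is never matched.
unescaped <- gsub(pattern = "5+", replacement = "X", x = x)
print(unescaped)

# Escaped: "5\\+" means "a 5 followed by a literal plus sign".
escaped <- gsub(pattern = "5\\+", replacement = "X", x = x)
print(escaped)
```

With the unescaped pattern, "55" collapses to a single "X" and the "+" in "5+" is left behind. With the escaped pattern, only the "5+" element changes.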
Use the fixed=TRUE option, which treats the pattern as a literal string rather than a regular expression:
gsub(pattern="+", replacement="", fixed=TRUE, c(1,2,3,4,"5+"))
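As a rough side-by-side, the sketch below shows three ways to match a literal plus sign that should give the same result on the data from the question. The variable names (escaped, bracketed, literal, values) are only illustrative, and the final as.numeric() step assumes you want numbers rather than strings.

```r
data <- c(1, 2, 3, 4, "5+")

# 1. Escape the metacharacter with a double backslash.
escaped <- gsub("\\+", "", data)

# 2. Put it in a character class, where + has no special meaning.
bracketed <- gsub("[+]", "", data)

# 3. Disable regular-expression matching entirely.
literal <- gsub("+", "", data, fixed = TRUE)

# Optional: convert the cleaned strings to numbers.
values <- as.numeric(literal)
print(values)
```

fixed=TRUE is probably the simplest choice when the pattern contains no regex syntax at all; escaping is needed once the literal character is combined with other regex constructs.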
You can also use strsplit:
as.numeric(unlist(strsplit(data, "\\+")))
# [1] 1 2 3 4 5