I have a character vector of text strings containing smilies, and a second vector that serves as a dictionary containing only the smilies:
A <- c("This :/ :/ :) ^^","is :/ ^^", "weird^^ :)")
B <- c(":)",":/","^^")
I would like to extract every smiley match from each text string, including duplicates and in the order they appear, so my output should look like this:
[[1]]
[1] ":/" ":/" ":)" "^^"
[[2]]
[1] ":/" "^^"
[[3]]
[1] "^^" ":)"
This is what I tried so far:
library(stringr)
# Does not return duplicates, and the results follow the order of B rather than the order in the text
sapply(A, function(x) B[str_detect(x, fixed(B))], USE.NAMES = FALSE)
[[1]]
[1] ":)" ":/" "^^"
[[2]]
[1] ":/" "^^"
[[3]]
[1] ":)" "^^"
# Returns only one smiley per string: the patterns in B are paired element-wise with the strings in A
str_extract_all(A, fixed(B))
[[1]]
[1] ":)"
[[2]]
[1] ":/"
[[3]]
[1] "^^"
library(qdapRegex)
# Returns an error because the smilies are treated as regular expressions with unescaped characters
rm_default(A, pattern = B, fixed = TRUE, extract = TRUE)
Error in stringi::stri_extract_all_regex(text.var, pattern) :
Incorrectly nested parentheses in regexp pattern. (U_REGEX_MISMATCHED_PAREN)
In addition: Warning messages:
1: In if (substring(pattern, 1, 4) == "@rm_") { :
the condition has length > 1 and only the first element will be used
2: In if (substring(pattern, 1, 1) == "@") { :
the condition has length > 1 and only the first element will be used
Any help is much appreciated.
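One possible direction, offered as a minimal sketch rather than a definitive solution: collapse the dictionary into a single alternation pattern and quote each smiley literally with `\Q...\E`, so that characters such as `)` and `^` are not interpreted as regex syntax. The sketch uses base R only (`gregexpr` with `perl = TRUE` and `regmatches`). The names `pattern` and `result` are introduced here for illustration, and the sketch assumes that no smiley contains the literal sequence `\E`.

```r
A <- c("This :/ :/ :) ^^", "is :/ ^^", "weird^^ :)")
B <- c(":)", ":/", "^^")

# Wrap every smiley in \Q...\E so it is matched literally,
# then join the alternatives with "|" into one pattern.
pattern <- paste0("\\Q", B, "\\E", collapse = "|")

# gregexpr() finds all matches in each string (duplicates included);
# regmatches() extracts them in order of appearance.
result <- regmatches(A, gregexpr(pattern, A, perl = TRUE))
print(result)
```

Because all smilies are searched for in one pass over each string, duplicates are kept and the matches come back in the order they occur in the text, not in the order of `B`. If one smiley were a prefix of another, the longer one would need to appear first in the alternation.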