I have a string that looks like:
str<-"a\f\r"
I'm trying to remove the backslashes but nothing works:
gsub("\","",str, fixed=TRUE)
gsub("\\","",str)
gsub("(\)","",str)
gsub("([\])","",str)
...basically all the variations you can imagine. I have even tried the string_replace_all function. ANY HELP??
I'm using R version 3.1.1 on Mac OS X 10.7. The dput for a single string in my vector of strings gives:
dput(line)
"ud83d\ude21\ud83d\udd2b"
I imported the file using readLines from a standard .txt file. The content of the file looks something like:
got an engineer booked for this afternoon \ud83d\udc4d all now hopefully sorted\ud83d\ude0a I m going to go insane ud83d\ude21\ud83d\udd2b in utf8towcs …
Thanks.
This is the same as the accepted answer but removes less (just non-ASCII characters):
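The code for this answer is not shown here, so the following is only a sketch of the idea. It assumes "non-ASCII" means anything outside the range \x01 to \x7F; the sample string x is made up for illustration.

```r
# Hypothetical sample: plain text with one non-ASCII character (e with acute accent).
x <- "caf\u00e9 ok"

# Drop every character outside the ASCII range \x01-\x7F.
# R turns \x01 and \x7F into the actual characters before the regex engine sees them.
ascii_only <- gsub("[^\x01-\x7F]", "", x)

print(ascii_only)
```

Control characters such as \f and \r are inside the ASCII range, so this leaves them alone.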
Since there isn't any direct way of dealing with single backslashes, here's the closest solution to the problem, as provided by David Arenburg in the comments section:
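The comment itself is not reproduced here. A minimal sketch of that kind of approach, assuming it keeps only letters and digits and drops everything else (the exact pattern is an assumption):

```r
# The string from the question: "a", a form feed and a carriage return.
x <- "a\f\r"

# Keep only ASCII letters and digits; everything else is removed.
cleaned <- gsub("[^A-Za-z0-9]", "", x)

print(cleaned)
```

Be aware that a pattern like this also strips spaces and punctuation, so it is only suitable when that loss is acceptable.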
When inputting backslashes from the keyboard, always escape them.
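As a rough illustration of the escaping rule (the sample string s is made up, and the variable names are just for this sketch):

```r
# Each "\\" in the source code is ONE backslash in the resulting string.
s <- "this\\is\\my\\string"
n_chars <- nchar(s)  # counts the 3 backslashes once each

# Literal (non-regex) match: the pattern is a single backslash.
no_slash_fixed <- gsub("\\", "", s, fixed = TRUE)

# Regex match: the regex engine needs its own escape, hence four in the source.
no_slash_regex <- gsub("\\\\", "", s)

# Without escaping, "\f" and "\r" are parsed as a form feed and a
# carriage return, so there is no backslash to find.
q <- "a\f\r"
has_backslash <- grepl("\\", q, fixed = TRUE)
codes <- utf8ToInt(q)  # code points of the individual characters

print(no_slash_fixed)
print(has_backslash)
print(codes)
```

Either gsub call works once the backslashes are really in the string; the fixed = TRUE form is easier to read because it needs only one level of escaping.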
Note that if you do str <- "a\f\r" then str contains no backslashes. It consists of the 3 characters a, \f (which is not normally printable, except as \f), and \r (same).
And just to head off a possible question: if your data was read from a file, the file doesn't have to have doubled backslashes. For example, if you have a file test.txt containing a\b\c\d\e\f and you read it into str (for example with readLines), then str will contain the string a\b\c\d\e\f as you'd expect: 6 letters separated by 5 single backslashes. But you still have to type doubled backslashes if you want to work with it.
From the dput, it looks like what you've got there is UTF-16 encoded text, which probably came from a Windows machine. It appears to encode glyphs in the Supplementary Multilingual Plane, which is pretty obscure. I'll guess that you need to supply the argument encoding="UTF-16" to readLines when you read in the file.
One quite universal solution is
Thanks to the comment above.
This might be helpful :)