I'm still a little confused by regex syntax. Can you please help me with these patterns:
_A00_A1234B_
_A00_A12345B_
_A1_A12345_
My approaches so far:
vapply(strsplit(files, "[_.]"), function(files) files[nchar(files) == 7][1], character(1))
or
str_extract(str2, "[A-Z][0-9]{5}[A-Z]")
The expected outputs are
A1234B
A12345B
A12345
Thanks!
You can do this without using a regular expression ...
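As a minimal sketch of a regex-free route, one could split each string on the underscore and pick a field by position. This assumes the wanted piece is always the third field (the first is empty because every string starts with an underscore), which holds for the three samples in the question but may not hold for other inputs. The vector name `files` is borrowed from the question's first attempt.

```r
# Sample strings from the question.
files <- c("_A00_A1234B_", "_A00_A12345B_", "_A1_A12345_")

# Split on the literal underscore (fixed = TRUE means no regex is involved).
# Each string yields "", the prefix, and the wanted field, in that order.
parts <- strsplit(files, "_", fixed = TRUE)

# Assumption: the wanted field is always the third piece.
ids <- vapply(parts, function(p) p[3], character(1))
print(ids)
```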
If you insist on using a regular expression, the following will suffice.
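One possible regex version, sketched in base R with regexpr and regmatches. The pattern here is an assumption based on the three expected outputs: one capital letter, 4 or 5 digits, and an optional trailing capital letter.

```r
# Sample strings from the question.
files <- c("_A00_A1234B_", "_A00_A12345B_", "_A1_A12345_")

# Assumed pattern: capital letter, 4 or 5 digits, optional capital letter.
pattern <- "[A-Z][0-9]{4,5}[A-Z]?"

# regexpr() locates the first match in each string;
# regmatches() extracts the matched text.
ids <- regmatches(files, regexpr(pattern, files))
print(ids)
```

Note that regmatches silently drops elements with no match, so the result can be shorter than the input.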
Using rex to construct the regular expression may make it more understandable.
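A rough sketch of how the same pattern might be spelled out with rex, assuming the package is installed. The building blocks used here (upper, digit, between, maybe) are rex's own helpers; the choice of 4 to 5 digits and an optional trailing letter is an assumption taken from the expected outputs.

```r
# Sample strings from the question.
files <- c("_A00_A1234B_", "_A00_A12345B_", "_A1_A12345_")

# Only run if rex is available; it is not part of base R.
if (requireNamespace("rex", quietly = TRUE)) {
  # One uppercase letter, 4 to 5 digits, then an optional uppercase letter.
  pattern <- rex::rex(upper, between(digit, 4, 5), maybe(upper))

  # rex may emit Perl-style syntax, so perl = TRUE is the safer choice.
  ids <- regmatches(files, regexpr(pattern, files, perl = TRUE))
  print(ids)
}
```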
You can use sub and this regex:
You can try:
Here, the pattern looks for a capital letter ([A-Z]), followed by 4 or 5 digits ([0-9]{4,5}), followed by an optional capital letter ([A-Z]?).
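To make the pattern concrete, here is one way it could be combined with sub. This is only a sketch: the surrounding underscores and the capture group are assumptions about how the pattern might be embedded, not a reconstruction of the original answer.

```r
# Sample strings from the question.
files <- c("_A00_A1234B_", "_A00_A12345B_", "_A1_A12345_")

# Wrap the described pattern in a capture group between underscores,
# then replace the whole string with just the captured part (\\1).
ids <- sub(".*_([A-Z][0-9]{4,5}[A-Z]?)_.*", "\\1", files)
print(ids)
```

Unlike an extraction function, sub returns the input unchanged when nothing matches, so non-matching strings would pass through as they are.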
Or you can use stringi, which would be faster.
Or a base R option would be:
data
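For reference, the sample strings can be stored in a character vector, and the two options mentioned above might then look roughly like this. The vector name `files` and the exact pattern are assumptions; stri_extract_first_regex is a stringi function, and the stringi part only runs if the package is installed.

```r
# Sample data taken from the question.
files <- c("_A00_A1234B_", "_A00_A12345B_", "_A1_A12345_")

# Assumed pattern: capital letter, 4 or 5 digits, optional capital letter.
pattern <- "[A-Z][0-9]{4,5}[A-Z]?"

# Base R: find where the first match starts and how long it is,
# then cut that range out with substring().
m <- regexpr(pattern, files)
base_ids <- substring(files, m, m + attr(m, "match.length") - 1)
print(base_ids)

# stringi: extract the first match directly (NA where nothing matches).
if (requireNamespace("stringi", quietly = TRUE)) {
  stri_ids <- stringi::stri_extract_first_regex(files, pattern)
  print(stri_ids)
}
```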