I'm trying to remove all characters preceding the first instance of a capital letter for each string in a vector of strings:
x <- c(" its client Auto Group", "itself and Phone Company", ", client Large Bank")
I've tried:
sub('.*?[A-Z]', '', x)
But that returns:
"uto Group" "hone Company" "arge Bank"
I need it to return:
"Auto Group" "Phone Company" "Large Bank"
Any ideas?
Thanks.
You need to use a capturing group with a backreference, i.e. the pattern ^.*?([A-Z]) with \1 as the replacement.
Here,
^ - start of the string
.*? - any 0+ characters, as few as possible
([A-Z]) - capture group 1, capturing an uppercase ASCII letter that is referenced with \1 in the replacement pattern.
So, we restore what we captured in the result with a backreference.
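Putting the pieces described above together, a minimal sketch of the call could look like this. It reuses the vector x from the question; the variable name res is just an illustrative choice, and the backslash in \1 is doubled because it sits inside an R string literal.

```r
# Input vector from the question
x <- c(" its client Auto Group", "itself and Phone Company", ", client Large Bank")

# ^        anchors the match at the start of the string
# .*?      lazily consumes everything up to the first capital letter
# ([A-Z])  captures that capital letter as group 1
# "\\1"    puts the captured letter back in the replacement
res <- sub("^.*?([A-Z])", "\\1", x)
print(res)
# expected values: "Auto Group" "Phone Company" "Large Bank"
```

The original attempt dropped the capital letter because the letter was part of the match but nothing put it back; the capturing group is what keeps it.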