I am trying to parse the following string and return all digits after the last square bracket:
C9: Title of object (foo, bar) [ch1, CH12,c03,4]
So the result should be:
1,12,03,4
The string and digits will change. The important thing is to get the digits after the last '[' regardless of what characters (if any) precede them.
(I need this in Python, so no atomic groups either!)
I have tried everything I can think of including:
\[.*?(\d) = matches '1' only
\[.*(\d) = matches '4' only
\[*?(\d) = matches include the '9' from the beginning
etc
Any help is greatly appreciated!
EDIT:
I also need to do this without using str.split().
Instead, you can find all the digits in the substring after the last '[' bracket:
>>> import re
>>> s = 'C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'
>>> # Get substring after the last '['.
>>> target_string = s.rsplit('[', 1)[1]
>>>
>>> re.findall(r'\d+', target_string)
['1', '12', '03', '4']
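The same idea can be wrapped in a small function. This is only a sketch: the name digits_after_last_bracket is made up for illustration, and returning an empty list when there is no '[' is an assumption about what you want (the rsplit version above raises IndexError in that case). It uses str.rpartition rather than split.

```python
import re

def digits_after_last_bracket(s):
    """Return every run of digits that appears after the last '[' in s."""
    # rpartition returns (head, sep, tail); sep is '' when '[' is absent.
    _, sep, tail = s.rpartition('[')
    if not sep:
        return []  # assumption: no bracket means nothing to extract
    return re.findall(r'\d+', tail)

print(digits_after_last_bracket('C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'))
# ['1', '12', '03', '4']
```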
If you can't use split, then this one would work with look-ahead assertion:
>>> s = 'C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'
>>> re.findall(r'\d+(?=[^[]+$)', s)
['1', '12', '03', '4']
This matches every run of digits that is followed only by non-'[' characters up to the end of the string, i.e. the digits located after the last '['.
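One edge case worth knowing about: because the look-ahead uses +, it needs at least one character after the digits. That is fine when the string ends with ']', but if the input might end directly in a digit, a variant with * is safer. The sketch below is an illustration only; TRAILING_DIGITS and digits_after_last_bracket_regex are hypothetical names. Note also that if the string contains no '[' at all, both variants match every number in it.

```python
import re

# Same look-ahead idea, but with '*' so a digit run that ends the string
# is still matched in full. The '[' is escaped purely for readability.
TRAILING_DIGITS = re.compile(r'\d+(?=[^\[]*$)')

def digits_after_last_bracket_regex(s):
    """Return the digit runs that have no '[' anywhere after them."""
    return TRAILING_DIGITS.findall(s)

print(digits_after_last_bracket_regex('C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'))
# ['1', '12', '03', '4']

# A string that ends in digits, with no closing bracket:
print(digits_after_last_bracket_regex('x [a1, b12'))
# ['1', '12']
print(re.findall(r'\d+(?=[^[]+$)', 'x [a1, b12'))
# ['1', '1']  (the '+' version drops the final digit)
```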
It may help to use the non-greedy (lazy) modifier ?. For example:
\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]
And, here's how it works (from https://regex101.com/r/jP7hM3/1):
"\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]"
\[ matches the character [ literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
1st Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
2nd Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
3rd Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
4th Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
\] matches the character ] literally
That said, I have to agree with the others: this is a regex solution, but it's not a very Pythonic one.
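For reference, here is a quick way to try the pattern in Python. It is a sketch, with FOUR_ITEMS and four_numbers as made-up names. It also shows two limits of the approach: the pattern expects exactly four comma-separated items, and it starts from the first '[' it can match rather than the last one.

```python
import re

# The four-group pattern quoted above, compiled once for reuse.
FOUR_ITEMS = re.compile(r'\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]')

def four_numbers(s):
    """Return the four captured groups, or None when the pattern does not match."""
    m = FOUR_ITEMS.search(s)
    return m.groups() if m else None

print(four_numbers('C9: Title of object (foo, bar) [ch1, CH12,c03,4]'))
# ('1', '12', '03', '4')

# An earlier '[' shifts the groups, because the search starts at the first bracket.
print(four_numbers('C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'))
# ('', '1', '12', '4')

# Fewer than four comma-separated items: no match at all.
print(four_numbers('[ch1, CH12]'))
# None
```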