How would I go about using regex to return all characters between two brackets? Here is an example:
foobar['infoNeededHere']ddd
needs to return infoNeededHere
I found a regex to do it between curly brackets but all attempts at making it work with square brackets have failed. Here is that regex: (?<={)[^}]*(?=})
and here is my attempt to hack it
(?<=[)[^}]*(?=])
Final Solution:
import re
text = "foobar['InfoNeeded'],"
match = re.match(r"^.*\['(.*)'\].*$", text)
print(match.group(1))
If there's only one of these [.....] tokens per line, then you don't need to use regular expressions at all; plain string searching and slicing will do.
If there's more than one of these per line, then you'll need to modify Jarrod's regex ^.*\['(.*)'\].*$ to match multiple times per line, and to be non-greedy. (Use the .*? quantifier instead of the .* quantifier.)

^.*\['(.*)'\].*$ will match a line and capture what you want in a group. You have to escape the [ and ] with \
The documentation at the rubular.com proof link will explain how the expression is formed.
If you're new to regular expressions (regex), you can learn about them in the Python docs. Or, if you want a gentler introduction, you can check out the HOWTO. They use Perl-style syntax.
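
As a rough illustration of the two cases described above (one token per line versus several), here is a minimal sketch assuming Python 3. The sample strings and the variable names `line`, `multi_line`, `single`, and `tokens` are made up for the example and do not come from the original answers.

```python
import re

line = "foobar['infoNeededHere']ddd"

# One token per line: plain string searching and slicing, no regex needed.
start = line.find("['") + len("['")
end = line.find("']", start)
single = line[start:end]
print(single)  # infoNeededHere

# Several tokens per line: a non-greedy group, applied repeatedly by findall.
multi_line = "a['first']b['second']c"
tokens = re.findall(r"\['(.*?)'\]", multi_line)
print(tokens)  # ['first', 'second']
```

With a greedy `(.*)` in the second pattern, the single match would run from the first `['` to the last `']` on the line, which is why the non-greedy `.*?` matters there.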
Regex
The expression that you need is .*?\[(.*)\].* and the group that you want will be \1.
- .*? : The . matches any character but a newline. The * is a meta-character that means "repeat this 0 or more times". The ? makes the * non-greedy, i.e., the . will match as few characters as possible before hitting a '['.
- \[ : The \ escapes special meta-characters, which in this case is [. If we didn't do that, [ would start a character class instead.
- (.*) : Parentheses 'group' whatever is inside them, and you can later retrieve the groups by their numeric IDs or names (if they're given one).
- \].* : You should know enough by now to know what this means.
Implementation
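
A minimal sketch of the implementation this section walks through, assuming Python 3 and the sample string from the question. The names `text`, `pattern`, and `inside` are chosen only for illustration.

```python
import re

text = "foobar['infoNeededHere']ddd"

# Pattern from this answer: lazy prefix, literal '[', captured group, literal ']'.
pattern = r".*?\[(.*)\].*"

# re.search returns a match object, or None when nothing matches.
match = re.search(pattern, text)
inside = match.group(1) if match else None

print(inside)             # 'infoNeededHere'  (the quotes are part of the capture)
print(inside.strip("'"))  # infoNeededHere
```

Note that this pattern captures everything between the brackets, quotes included; stripping the quotes afterwards is one option, matching them explicitly in the pattern is another.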
First, import the re module (it's not a built-in) wherever you want to use the expression.
Then, use re.search(regex_pattern, string_to_be_tested) to search for the pattern in the string to be tested. This will return a MatchObject (or None if nothing matches), which you can store in a temporary variable. You should then call its group() method and pass 1 as an argument (to see the 'Group 1' we captured using parentheses earlier). Your code then amounts to an import, a re.search() call, and a group(1) lookup.
An Alternative
You can also use findall() to find all the non-overlapping matches by modifying the regex to (?<=\[).+?(?=\])
- (?<=\[) : (?<=) is called a look-behind assertion and checks for an expression preceding the actual match.
- .+? : + is just like * except that it matches one or more repetitions. It is made non-greedy by ?.
- (?=\]) : (?=) is a look-ahead assertion and checks for an expression following the match without capturing it.
Your code should now look like:
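
A minimal sketch of the findall() variant, assuming Python 3. The sample string with two bracketed tokens and the names `text`, `pattern`, and `matches` are made up for the example.

```python
import re

text = "foobar['infoNeededHere']ddd['second']"

# Look-behind for '[' and look-ahead for ']', with a non-greedy body in between.
pattern = r"(?<=\[).+?(?=\])"

# With no capturing groups, findall returns the full text of each match.
matches = re.findall(pattern, text)
print(matches)  # ["'infoNeededHere'", "'second'"]
```

Because the brackets are only asserted and not consumed, each result contains just the text between them. The question's attempt, (?<=[)[^}]*(?=]), fails because the unescaped [ is read as the start of a character class, so the pattern is not parsed as intended.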
Note: Always use raw Python strings for regex patterns by adding an 'r' before the string (e.g., r'blah blah blah').
10x for reading! I wrote this answer when there were no accepted ones yet, but by the time I finished it, 2 more came up and one got accepted. :( x<