I want to match the docstrings of a Python file, e.g.:
r""" Hello this is Foo
"""
Using only """ should be enough for the start.
>>> data = 'r""" Hello this is Foo\n """'
>>> def display(m):
...     if not m:
...         return None
...     else:
...         return '<Match: %r, groups=%r>' % (m.group(), m.groups())
...
>>> import re
>>> print(display(re.match('r?"""(.*?)"""', data, re.S)))
<Match: 'r""" Hello this is Foo\n """', groups=(' Hello this is Foo\n ',)>
>>> print(display(re.match('r?(""")(.*?)\1', data, re.S)))
None
Can someone please explain to me why the first expression matches and the other does not?
I think you might be missing the re.DOTALL or re.MULTILINE flags. In this case, re.DOTALL should allow the .*? in your regex to match newlines as well.

You are using the escape sequence '\1' (in a normal string literal, the single character '\x01') instead of the backreference \1. You can fix this by escaping the \ before the 1, i.e. writing '\\1'. You can also fix it by using a raw string for your regex, e.g. r'r?(""")(.*?)\1', so that no escape sequences are processed.
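
The points above can be checked with a short script. This is a minimal sketch rather than a definitive fix: it reuses the data string from the question, and the variable names (plain, escaped, raw) are made up for illustration. It also shows that re.S, which the question already passes, is the same flag as re.DOTALL.

```python
import re

data = 'r""" Hello this is Foo\n """'

# re.S is just the short name for re.DOTALL, so '.' already matches newlines here.
assert re.S == re.DOTALL

# In a normal (non-raw) string literal, '\1' is an octal escape: one character, '\x01'.
# The regex engine therefore looks for a literal '\x01', not a backreference.
assert '\1' == '\x01'

plain = 'r?(""")(.*?)\1'      # pattern from the question: ends in the character '\x01'
escaped = 'r?(""")(.*?)\\1'   # backslash escaped: ends in the two characters \ and 1
raw = r'r?(""")(.*?)\1'       # raw string: same pattern text as `escaped`

assert escaped == raw

no_match = re.match(plain, data, re.S)
match = re.match(raw, data, re.S)

print(no_match)        # None
print(match.groups())  # ('"""', ' Hello this is Foo\n ')
```

Using a raw string for every regex pattern is the usual habit, since it avoids having to remember which backslash sequences Python interprets before the re module ever sees them.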