Let's say I have a string:
> my_string = '{foo}/{bar}'
> my_string.format(foo='foo', bar='bar')
'foo/bar'
Right, cool. But in my case, I want to retrieve the names of the keyword arguments used in my_string. I have done:
> ATTRS_PATTERN = re.compile(r'{(?P<variable>[_a-z][_a-z0-9]*)}')
> ATTRS_PATTERN.findall(my_string)
['foo', 'bar']
It's not very sexy. Do you have a better idea?
Why reinvent the wheel? string.Formatter has a parse() method.
>>> import string
>>> [a[1] for a in string.Formatter().parse('{foo}/{bar}')]
['foo', 'bar']
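One caveat with the one-liner: when the string ends in literal text, parse() yields a final tuple whose field name is None. A minimal sketch that skips those entries (the helper name field_names is made up for illustration, not part of the string module):

```python
import string

def field_names(template):
    """Return the replacement-field names in template, in order of appearance."""
    return [
        name
        for _, name, _, _ in string.Formatter().parse(template)
        if name is not None  # None means trailing literal text with no field
    ]

print(field_names('{foo}/{bar}.txt'))  # ['foo', 'bar']
```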
You can use the string.Formatter.parse method. It splits the string into its literal text components and fields:
In [1]: import string
In [2]: formatter = string.Formatter()
In [3]: text = 'Here is some text with {replacement} fields {}'
In [4]: list(formatter.parse(text))
Out[4]:
[('Here is some text with ', 'replacement', '', None),
(' fields ', '', '', None)]
To retrieve the field names, simply iterate over the result and collect the second element of each tuple.
Note that this will include positional (both numbered and unnumbered) arguments as well.
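If only keyword arguments are wanted, the positional fields can be filtered out. The following is a rough sketch, not a definitive implementation: the helper name keyword_names is hypothetical, and it assumes that a field such as {bar.baz} or {qux[0]} should be reported by its base name, since that is the keyword that format() actually receives.

```python
import re
import string

def keyword_names(template):
    """Return the keyword-argument names used in template, without duplicates."""
    names = []
    for _, field, _, _ in string.Formatter().parse(template):
        if not field:
            # None: literal text only; '': auto-numbered positional field {}
            continue
        # Keep only the part before any attribute (.) or index ([) access.
        base = re.split(r'[.\[]', field, maxsplit=1)[0]
        if not base or base.isdigit():
            # Empty or numeric base: a positional field such as {0} or {[0]}
            continue
        if base not in names:
            names.append(base)
    return names

print(keyword_names('{foo}/{0}/{bar.baz}/{qux[0]}.txt'))  # ['foo', 'bar', 'qux']
```

Like the plain parse() loop, this sketch does not look inside format specs for nested fields.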
Note also that parse does not descend into nested arguments inside a format spec:
In [1]: import string
In [2]: formatter = string.Formatter()
In [3]: list(formatter.parse('{hello:{world}}'))
Out[3]: [('', 'hello', '{world}', None)]
If you want to get all named fields (assuming only named fields are used), you also have to parse the format spec, which is the third element of each tuple:
In [4]: def get_named_fields(text):
   ...:     formatter = string.Formatter()
   ...:     elems = formatter.parse(text)
   ...:     for _, field, spec, _ in elems:
   ...:         if field:
   ...:             yield field
   ...:         if spec:
   ...:             yield from get_named_fields(spec)
   ...:
In [5]: list(get_named_fields('{hello:{world}}'))
Out[5]: ['hello', 'world']
(This solution handles arbitrarily deep nesting in format specifiers, although a single level would be sufficient, since str.format does not allow deeper nesting.)