Turning string with embedded brackets into a dictionary

Posted 2020-08-13 05:31

What's the best way to build a dictionary from a string like the one below:

"{key1 value1} {key2 value2} {key3 {value with spaces}}"

The key is always a string with no spaces, but the value is either a plain string or, when it contains spaces, a string wrapped in curly brackets.

How would you turn it into:

{'key1': 'value1', 'key2': 'value2', 'key3': 'value with spaces'}

4 answers
虎瘦雄心在
#2 · 2020-08-13 05:40
import re
x = "{key1 value1} {key2 value2} {key3 {value with spaces}}"
print(dict(re.findall(r"\{(\S+)\s+\{*(.*?)\}+", x)))

You can try this.

Output:

{'key1': 'value1', 'key2': 'value2', 'key3': 'value with spaces'}

Here re.findall extracts each key and its value. It returns a list of (key, value) tuples, one per pair, and calling dict on that list of tuples gives the final answer.
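To make the intermediate step visible, here is a small sketch in Python 3 syntax that prints the list of tuples before converting it. The names pairs and result are only for illustration.

```python
import re

x = "{key1 value1} {key2 value2} {key3 {value with spaces}}"

# Each match yields one (key, value) tuple from the two capture groups.
pairs = re.findall(r"\{(\S+)\s+\{*(.*?)\}+", x)
print(pairs)
# [('key1', 'value1'), ('key2', 'value2'), ('key3', 'value with spaces')]

# dict() accepts any iterable of 2-item sequences.
result = dict(pairs)
print(result)
```

Note that \{* and \}+ accept any number of braces, so this pattern does not verify that the braces are balanced.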

ゆ 、 Hurt°
#3 · 2020-08-13 05:41

Assuming that you don't have anything in your string more nested than what is in your example, you could first use lookahead/lookbehind assertions to split the string into your key-value pairs, looking for the pattern } { (the end of one pair of brackets and the beginning of another.)

>>> import re
>>> s = '{key1 value1} {key2 value2} {key3 {value with spaces}}'
>>> pairs = re.split(r'(?<=})\s*(?={)', s)

This says "Match on any \s* (whitespace) that has a } before it and a { after it, but don't include those brackets in the match itself."

Then you have your key-value pairs:

>>> pairs
['{key1 value1}', '{key2 value2}', '{key3 {value with spaces}}']

which can be split on whitespace with the maxsplit parameter set to 1, to make sure that it only splits on the first space. In this example I have also used string indexing (the [1:-1]) to get rid of the curly braces that I know are at the beginning and end of each pair.

>>> simple = pairs[0] 
>>> complex = pairs[2]  
>>> simple
'{key1 value1}'
>>> complex
'{key3 {value with spaces}}'
>>> simple[1:-1]
'key1 value1'
>>> kv = re.split(r'\s+', simple[1:-1], maxsplit=1)
>>> kv
['key1', 'value1']
>>> kv3 = re.split(r'\s+', complex[1:-1], maxsplit=1)
>>> kv3
['key3', '{value with spaces}']

Then just check whether the value is enclosed in curly braces, and remove them if needed before putting the key and value into your dictionary.
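Putting those steps together, a minimal sketch could look like the following. The function name parse_pairs is hypothetical, and it assumes every pair has both a key and a non-empty value and that values are nested at most one level deep.

```python
import re

def parse_pairs(text):
    """Build a dict from '{key value} {key {value with spaces}}' style text."""
    result = {}
    # Split between a closing brace and the next opening brace.
    for pair in re.split(r'(?<=})\s*(?={)', text.strip()):
        # Drop the outer braces, then split on the first whitespace run only.
        key, value = re.split(r'\s+', pair[1:-1], maxsplit=1)
        # Remove the inner braces when the value is wrapped in them.
        if value.startswith('{') and value.endswith('}'):
            value = value[1:-1]
        result[key] = value
    return result

print(parse_pairs('{key1 value1} {key2 value2} {key3 {value with spaces}}'))
```

A pair without a value, such as '{key1}', would raise a ValueError at the unpacking step, so real input may need an extra check there.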

If it is guaranteed that the key and value in each pair will always be separated by a single space character, then you could use plain old string split instead.

>>> kv3 = complex[1:-1].split(' ', maxsplit=1)
>>> kv3
['key3', '{value with spaces}']
Deceive 欺骗
#4 · 2020-08-13 05:50

The answer by @vks doesn't check for balanced braces. Try the following:

>>> x="{key3 {value with spaces} {key4 value4}}"
>>> dict(re.findall(r"\{(\S+)\s+\{*(.*?)\}+",x))
{'key3': 'value with spaces', 'key4': 'value4'}

Try instead:

>>> dict(map(lambda x:[x[0],x[2]], re.findall(r'\{(\S+)\s+(?P<Brace>\{)?((?(Brace)[^{}]*|[^{}\s]*))(?(Brace)\})\}',x)))
{'key4': 'value4'}

that is, it matches only on the part with correct bracing.

The (?P<Brace>\{) saves the match of a {, and later (?(Brace)\}) will match } only if the first one matched, so braces must come in matching pairs. With the (?(Brace)...|...) construct, if Brace matched, the value part can contain anything except braces ([^{}]*); otherwise it cannot contain whitespace either ([^{}\s]*).

As the optional brace is captured in the regexp, and thus returned in each result, we need to extract elements 0 and 2 from each tuple with the map() function.

Regexps easily get messy.
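One way to keep such a pattern readable is re.VERBOSE together with named groups, which also removes the need for map(). This is only a sketch of the same conditional-group idea; the names PAIR, parse, key, brace and value are made up for illustration.

```python
import re

PAIR = re.compile(r"""
    \{ (?P<key>\S+) \s+        # opening brace, key, whitespace
    (?P<brace>\{)?             # optional inner opening brace
    (?P<value>
        (?(brace) [^{}]*       # braced value: anything except braces
                | [^{}\s]* )   # bare value: no braces, no whitespace
    )
    (?(brace) \} )             # inner closing brace only if one was opened
    \}                         # outer closing brace
""", re.VERBOSE)

def parse(text):
    # Named groups let us pick key and value directly from each match.
    return {m.group("key"): m.group("value") for m in PAIR.finditer(text)}

print(parse("{key1 value1} {key2 value2} {key3 {value with spaces}}"))
print(parse("{key3 {value with spaces} {key4 value4}}"))
```

The second call should again return only the correctly braced key4 pair.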

Rolldiameter
#5 · 2020-08-13 06:04

I can't make it more elegant:

text = "{key1 value1} {key2 value2} {key3 {value with spaces}}"
x = text.split("} {")              # one chunk per key-value pair
y = [i.split(" {") for i in x]     # separates braced values from their keys
# split each piece into words, removing the remaining braces
z = [[i.replace("{", "").replace("}", "").split() for i in j] for j in y]

fin = {}
for i in z:
    words = [w for part in i for w in part]   # flatten: key, then value words
    fin[words[0]] = " ".join(words[1:])       # first word is the key

It's very hacky, but it should do the job.
