Question:
I have a malformed string:
a = '(a,1.0),(b,6.0),(c,10.0)'
I need a dict:
d = {'a':1.0, 'b':6.0, 'c':10.0}
I tried:
import ast
print(ast.literal_eval(a))
#ValueError: malformed node or string: <_ast.Name object at 0x000000000F67E828>
Then I tried using replace to turn it into a dict-like string, but it is ugly and does not work:
b = (a.replace(',(', '|{').replace(',', ' : ')
      .replace('|', ', ').replace('(', '{').replace(')', '}'))
print(b)
{a : 1.0}, {b : 6.0}, {c : 10.0}
print (ast.literal_eval(b))
#ValueError: malformed node or string: <_ast.Name object at 0x000000000C2EA588>
What would you do? Am I missing something? Is it possible to use a regex?
Answer 1:
Given that the string has the format stated above, you could use regex substitution with backreferences:
import ast
import re
a = '(a,1.0),(b,6.0),(c,10.0)'
a_fix = re.sub(r'\((\w+),', r"('\1',", a)
So you look for a pattern (x, (with x a sequence of \w characters) and replace it with ('x',. The result is then:
# result
a_fix == "('a',1.0),('b',6.0),('c',10.0)"
and then parse a_fix and convert it to a dict:
result = dict(ast.literal_eval(a_fix))
The result is then:
>>> dict(ast.literal_eval(a_fix))
{'b': 6.0, 'c': 10.0, 'a': 1.0}
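The steps above can be collected into one small helper. This is only a sketch: the function name parse_pairs is hypothetical, and wrapping the quoted string in square brackets is an addition not in the original answer. It is there so that an input with a single pair still evaluates to a list of tuples rather than one bare tuple.

```python
import ast
import re


def parse_pairs(text):
    """Turn '(a,1.0),(b,6.0)' into {'a': 1.0, 'b': 6.0}.

    Assumes every key is a run of \\w characters directly after '('.
    """
    # Quote each bare key: "(a," becomes "('a',"
    quoted = re.sub(r'\((\w+),', r"('\1',", text)
    # Wrap in brackets (an assumption beyond the original answer) so that
    # a single pair is still parsed as a list containing one tuple.
    return dict(ast.literal_eval('[' + quoted + ']'))


print(parse_pairs('(a,1.0),(b,6.0),(c,10.0)'))
```

Because ast.literal_eval only accepts Python literals, the values keep whatever type their text denotes (floats here).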
Answer 2:
No need for regexes if your string is in this format.
>>> a = '(a,1.0),(b,6.0),(c,10.0)'
>>> d = dict([x.split(',') for x in a[1:-1].split('),(')])
>>> print(d)
{'c': '10.0', 'a': '1.0', 'b': '6.0'}
We remove the first opening parenthesis and the last closing parenthesis, then split on ),( to get the key-value pairs. Each pair can then be split on the comma.
To cast to float, the list comprehension gets a little longer:
d = dict([(k, float(v)) for (k, v) in [x.split(',') for x in a[1:-1].split('),(')]])
Answer 3:
If there are always 2 comma-separated values inside parentheses and the second is of a float type, you may use
import re
s = '(a,1.0),(b,6.0),(c,10.0)'
# Tuple-unpacking lambdas are Python 2 only, so unpack in a generator expression instead
print(dict((w, float(m)) for w, m in re.findall(r'\(([^),]+),([^)]*)\)', s)))
See the Python demo and the (quite generic) regex demo. This pattern matches a (, then 1+ chars other than a comma and ) (captured into Group 1), then a comma, then any 0+ chars other than ) (captured into Group 2), and a closing ).
The pattern above is suitable when you have pre-validated data; for your current data, the regex can be restricted to
r'\((\w+),(\d*\.?\d+)\)'
See the regex demo
Details:
\( - a literal (
(\w+) - Capturing group 1: one or more word (letter/digit/_) chars
, - a comma
(\d*\.?\d+) - Capturing group 2: a common integer/float regex: zero or more digits, an optional . (decimal separator) and 1+ digits
\) - a literal closing parenthesis.
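As a sketch of how the restricted pattern could be applied, the snippet below feeds it to re.findall and converts the second group to float. The names PAIR_RE and pairs_to_dict are hypothetical, chosen only for this illustration.

```python
import re

# The restricted pattern: a word key, a comma, and an unsigned integer/float value
PAIR_RE = re.compile(r'\((\w+),(\d*\.?\d+)\)')


def pairs_to_dict(s):
    # findall returns one (key, value) tuple of strings per matched pair
    return {key: float(value) for key, value in PAIR_RE.findall(s)}


print(pairs_to_dict('(a,1.0),(b,6.0),(c,10.0)'))
```

Note that pairs which do not fit the pattern, such as a negative value like (a,-1.0), are skipped without raising an error, so this stricter regex also acts as a silent filter.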
Answer 4:
The reason eval() does not work is that a, b, and c are not defined. We can define each of them as its own string form, and eval will then use those strings:
In [11]: text = '(a,1.0),(b,6.0),(c,10.0)'
In [12]: a, b, c = 'a', 'b', 'c'
In [13]: eval(text)
Out[13]: (('a', 1.0), ('b', 6.0), ('c', 10.0))
In [14]: dict(eval(text))
Out[14]: {'a': 1.0, 'b': 6.0, 'c': 10.0}
To do this the regex way:
In [21]: re.sub(r'\((.+?),', r'("\1",', text)
Out[21]: '("a",1.0),("b",6.0),("c",10.0)'
In [22]: eval(_)
Out[22]: (('a', 1.0), ('b', 6.0), ('c', 10.0))
In [23]: dict(_)
Out[23]: {'a': 1.0, 'b': 6.0, 'c': 10.0}
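If the key names are not known in advance, the "define each name as its own string" idea can be generalized. This is a variation, not part of the original answer: the NameAsString class is hypothetical, and the sketch relies on eval accepting any mapping as its locals argument. Like any use of eval, it assumes the input string is trusted.

```python
class NameAsString(dict):
    """Hypothetical helper: any name that is not defined resolves to itself."""

    def __missing__(self, key):
        # Called for every unknown name eval looks up, e.g. a -> 'a'
        return key


text = '(a,1.0),(b,6.0),(c,10.0)'

# Empty builtins so names resolve only through the mapping.
# This does not make eval safe for untrusted input.
pairs = eval(text, {'__builtins__': {}}, NameAsString())
result = dict(pairs)
print(result)
```

As with the original snippet, a string containing only one pair evaluates to a single tuple rather than a tuple of tuples, so dict() would need the input wrapped in a list first.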