Generate a pandas dataframe from ordereddict?

Posted 2020-02-05 02:06

I am trying to create a pandas DataFrame from a list of OrderedDicts so that the key order is preserved as the column order. But for some reason, after creating the DataFrame, the columns are out of order again.

Here's the list of OrderedDicts:

[
    OrderedDict([
        ('key_a', 'value_a'),
        ('key_b', 'value_b'),
    ]),
    OrderedDict([
        ('key_a', 'value_c'),
        ('key_b', 'value_d'),
    ]),
]

How should I create a pandas DataFrame from these? What I am looking for is something like this (the important part is the order of the column names key_a, key_b, etc.):

  key_a    key_b
0 value_a  value_b
1 value_c  value_d

I have tried:

pd.DataFrame.from_records(orderedDictList)
pd.DataFrame.from_dict(orderedDictList)

Feel free to ask any additional questions.

4 answers
ら.Afraid
#2 · 2020-02-05 02:37

Following this answer, you need to explicitly specify your column order:

df = pd.DataFrame(orderedDictList, columns=orderedDictList.keys())

Of course, first you need to ensure that you have an ordered dictionary.


#3 · 2020-02-05 02:40

You can do something like this using Counter from the collections module:

from collections import OrderedDict, Counter
import pandas as pd

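a = {'key_a': 'value_a', 'key_b': 'value_b'}
b = {'key_a': 'value_c', 'key_b': 'value_d'}

# Note: OrderedDict(a) only keeps whatever order the plain dict a already has.
ord_list = [OrderedDict(a), OrderedDict(b)]

# Collect the union of keys across all dicts (the Counter's values are not used).
col = Counter()
for k in ord_list:
    col.update(k)

# Build rows from the values and label them with the collected keys.
df = pd.DataFrame([list(k.values()) for k in ord_list], columns=list(col.keys()))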

print(df)

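Output (as originally posted; the column order shown comes from a Python version in which the plain dicts a and b did not preserve insertion order):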

     key_b    key_a
0  value_b  value_a
1  value_d  value_c
女痞
#4 · 2020-02-05 02:58

Following @AdeelAhmad's answer, I needed to take the columns from one of the OrderedDicts rather than from the list:

df = pd.DataFrame(orderedDictList, columns=orderedDictList[0].keys())

Hope it helps
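
For anyone who wants to try this, here is a minimal sketch of the explicit-columns approach. The sample data mirrors the question; the names ordered_dict_list, first_keys and df_reversed are made up for illustration. It assumes every record has the same keys as the first one, since keys that appear only in later records would be left out.

```python
from collections import OrderedDict

import pandas as pd

# Sample data mirroring the question.
ordered_dict_list = [
    OrderedDict([("key_a", "value_a"), ("key_b", "value_b")]),
    OrderedDict([("key_a", "value_c"), ("key_b", "value_d")]),
]

# Take the column order from the first record (a list has no .keys()).
first_keys = list(ordered_dict_list[0].keys())
df = pd.DataFrame(ordered_dict_list, columns=first_keys)

# The same argument can impose any other order, e.g. reversed.
df_reversed = pd.DataFrame(ordered_dict_list, columns=first_keys[::-1])

print(list(df.columns))
print(list(df_reversed.columns))
```

The columns argument is mainly useful when you want an order that differs from the one stored in the dicts.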

等我变得足够好
#5 · 2020-02-05 03:02

df = pd.DataFrame(orderedDictList, columns=orderedDictList.keys())

This does not work because orderedDictList is a list, and lists have no keys() method.

You do not need to specify the column order, because pd.DataFrame takes the column order from the OrderedDicts themselves. The following will work:

df = pd.DataFrame(orderedDictList)
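
A quick way to check this yourself, as a small sketch: the keys below are inserted in a deliberately non-alphabetical order (key_b before key_a, unlike the question) so that any sorting by pandas would be visible. The variable names are illustrative only.

```python
from collections import OrderedDict

import pandas as pd

# Keys inserted as key_b, key_a so that alphabetical sorting would show up.
ordered_dict_list = [
    OrderedDict([("key_b", "value_b"), ("key_a", "value_a")]),
    OrderedDict([("key_b", "value_d"), ("key_a", "value_c")]),
]

# No columns argument: the order comes from the OrderedDicts.
df = pd.DataFrame(ordered_dict_list)
columns = list(df.columns)
print(columns)
```

If the printed list starts with key_b, the insertion order was kept.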