
Generate a pandas DataFrame from an OrderedDict?

Posted 2020-02-05 02:45

Question:

I am trying to create a pandas DataFrame from OrderedDicts so that the order of the values is preserved. But for some reason, after creating the DataFrame, the fields are out of order again.

Here's the list of OrderedDicts:

[OrderedDict([
    ('key_a', 'value_a'),
    ('key_b', 'value_b'),
]),
OrderedDict([
    ('key_a', 'value_c'),
    ('key_b', 'value_d'),
])]

Now how should I create a pandas DataFrame from these? What I am looking for is something like this (the important thing is the order of the column names key_a, key_b, etc.):

  key_a    key_b
0 value_a  value_b
1 value_c  value_d

I have tried:

pd.DataFrame.from_records(orderedDictList)
pd.DataFrame.from_dict(orderedDictList)

Feel free to ask any additional questions.

Answer 1:

Following this answer, you need to explicitly specify your column order:

df = pd.DataFrame(orderedDictList, columns=orderedDictList.keys())

Of course, first you need to ensure that you have an ordered dictionary.



Answer 2:

You can do something like this using Counter from the collections module:

from collections import OrderedDict, Counter
import pandas as pd

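# Note: a and b are plain dicts, so on Python older than 3.7 their key order
# is not guaranteed, and OrderedDict(a) inherits whatever order they have.
a = {'key_a': 'value_a', 'key_b': 'value_b'}
b = {'key_a': 'value_c', 'key_b': 'value_d'}

ord_list = [OrderedDict(a), OrderedDict(b)]

# Collect every key that appears in any of the dicts.
col = Counter()
for k in ord_list:
    col.update(k)

# Build each row from the dict values and label the columns with the collected keys.
df = pd.DataFrame([list(k.values()) for k in ord_list], columns=list(col.keys()))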

print(df)

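Output (as originally posted; on Python 3.7+, where plain dicts keep insertion order, the columns come out as key_a, key_b):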

     key_b    key_a
0  value_b  value_a
1  value_d  value_c
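
If the goal is the union of all keys in first-seen order, an OrderedDict can serve as an ordered set instead of a Counter. The following is only a sketch under assumptions: the sample records (including the extra key_c) are hypothetical, and the dicts are built from lists of pairs so that their order does not depend on the Python version.

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical sample data; the second record carries an extra key.
records = [
    OrderedDict([("key_a", "value_a"), ("key_b", "value_b")]),
    OrderedDict([("key_a", "value_c"), ("key_b", "value_d"), ("key_c", "value_e")]),
]

# Use an OrderedDict as an ordered set: each key is stored once,
# at the position where it was first seen.
columns = OrderedDict()
for record in records:
    for key in record:
        columns[key] = None

# Passing the dicts themselves (not just their values) lets pandas match
# each value to its column by key and fill missing entries with NaN.
df = pd.DataFrame(records, columns=list(columns))
print(df)
```

Because the rows are matched by key, a record that lacks a column gets NaN there rather than shifting its values into the wrong column.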


Answer 3:

Following @AdeelAhmad's answer, I needed to take the columns from one of the OrderedDicts:

df = pd.DataFrame(orderedDictList, columns=orderedDictList[0].keys())

Hope it helps
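
For reference, a self-contained version of that line might look like the sketch below. It reuses the data from the question and assumes that every record has the same keys as the first one; keys that appear only in later records would be dropped.

```python
from collections import OrderedDict

import pandas as pd

# Data from the question, built from pairs so the key order is explicit.
orderedDictList = [
    OrderedDict([("key_a", "value_a"), ("key_b", "value_b")]),
    OrderedDict([("key_a", "value_c"), ("key_b", "value_d")]),
]

# Take the column order from the first record (assumes all records share its keys).
column_order = list(orderedDictList[0].keys())

df = pd.DataFrame(orderedDictList, columns=column_order)
print(df)
```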



Answer 4:

df = pd.DataFrame(orderedDictList, columns=orderedDictList.keys())

This does not work because orderedDictList is a list, and a list has no keys() method; only the dicts inside it have keys.

You do not need to specify the column order, because the OrderedDicts already carry it and pandas uses it for the columns. The following will work:

df = pd.DataFrame(orderedDictList)
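
A quick way to check this on your own installation is sketched below. The fallback branch is an assumption added for illustration: it only runs if the installed pandas version returns the columns in a different order, in which case the columns are reselected in the order of the first record.

```python
from collections import OrderedDict

import pandas as pd

# Data from the question.
orderedDictList = [
    OrderedDict([("key_a", "value_a"), ("key_b", "value_b")]),
    OrderedDict([("key_a", "value_c"), ("key_b", "value_d")]),
]

expected = list(orderedDictList[0].keys())

df = pd.DataFrame(orderedDictList)

# Safety net: if the columns came back in a different order,
# reselect them in the order of the first record.
if list(df.columns) != expected:
    df = df[expected]

print(df)
```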