I have a two different list of tuples that are converted into a pandas dataframe:
ls1 = [(1,"A",2),(1,"B",1),(1,"C",3),(2,"A",4),(2,"B",4,),(2,"C",5)]
ls2 = [(1,"A",2),(1,"C",3),(1,"B",1),(1,"D",6),(2,"A",4),(2,"C",5),(2,"B",4,)]
df1 = pandas.DataFrame(ls1, columns=['ID', 'Class', 'count'])
df2 = pandas.DataFrame(ls2, columns=['ID', 'Class', 'count'])
Now I'd like to make a pivot table from both dataframes with the column names for the classes "A", "B", "C", "D". So, all four column names (if possible also in a specified order) should exist in the resulting pivot table. If there are no counts for a ID-Class combination these should be filled with e.g. NaN.
dfpivot1 = df1.pivot(index='ID', columns='Class', values='count')
dfpivot2 = df2.pivot(index='ID', columns='Class', values='count')
>>> dfpivot1
Class A B C
ID
1 2 1 3
2 4 4 5
>>>
Using the .pivot
from pandas provides a pivot table for df1
, but with only three Class-columns ("A","B","C"). So, dfpivot1
needs to be modified to have the columns "A","B","C" AND "D" and thus, exactly match the columns of dfpivot2
. As I am starting from a list of tuples also other approaches (without using pandas
) might be interesting.