Reshape Long Format Multivalue Dataframes with Pan

2019-04-01 23:29发布

I would like to turn:

DateTime                     ColumnName        Min      Avg      Max                                                                                      
2012-10-14 11:29:23.810000   Percent_Used       24       24       24
2012-10-14 11:29:23.810000   Current_Count  254503   254503   254503
2012-10-14 11:29:23.810000   Max           1048576  1048576  1048576
2012-10-14 11:34:23.813000   Percent_Used       24       24       24
2012-10-14 11:34:23.813000   Current_Count  254116   254116   254116
2012-10-14 11:34:23.813000   Max           1048576  1048576  1048576

Into a dataframe where the the DateTimes are unique (an index) and the columns are:

DataTime, Percent_Used_Min, Percent_Used_Avg, Percent_Used_Max, Current_Count_Min, Current_Count_Avg, Current_Count_Max, Max_Min, Max_Avg, Max_Max

Basically, I want to mimic R's melt/cast without getting into hierarchical indexing or stacked dataframes. I can't seem to to get exactly the above playing with stack/unstack, melt, or pivot/pivot_table -- Is there a good way to do this?

As An example, in R it would be something like:

dynamic_melt = melt(dynamic, id = c("DateTime", "ColumnName"))
recast = data.frame(cast(dynamic_melt, DateTime ~ ...))

The above data will be variable (i.e. the values of ColumnName won't always be the same thing, there might be more or less of them, and different names).

1条回答
贪生不怕死
2楼-- · 2019-04-01 23:53

There is a melt in pandas.core.reshape:

In [52]: melted = reshape.melt(df, id_vars=['DateTime', 'ColumnName'])

In [53]: melted.set_index(['DateTime', 'ColumnName', 'variable']).value.unstack([1, 2])
Out[53]: 
ColumnName                  Percent_Used  Current_Count      Max  Percent_Used  Current_Count      Max  Percent_Used  Current_Count      Max
variable                             Min            Min      Min           Avg            Avg      Avg           Max            Max      Max
DateTime                                                                                                                                    
2012-10-14 11:29:23.810000            24         254503  1048576            24         254503  1048576            24         254503  1048576
2012-10-14 11:34:23.813000            24         254116  1048576            24         254116  1048576            24         254116  1048576

The columns end up being a MultiIndex, but if that's a deal breaker for you just concat the names and make it a regular Index.

查看更多
登录 后发表回答