I've created a DataFrame in my desired date order; however, when I put it into a pivot table, the order changes.
I want to sort the pivot table based on the newest date of any of the rows within a given level.
import pandas as pd

data = [['yellow', 1, '02/01/2015'],
        ['yellow', 2, '04/01/2015'],
        ['green', 3, '03/01/2015'],
        ['red', 4, '01/01/2015']]
df = pd.DataFrame(data, columns=['colour', 'number', 'date'])
# Index on colour and date so that number becomes the value column
df.pivot_table(index=['colour', 'date'])
The result is:
                   number
colour date
green  03/01/2015       3
red    01/01/2015       4
yellow 02/01/2015       1
       04/01/2015       2
I want the end result to list the colours with the newest dates at the top, that is, sorted on the newest date within each colour (the dates marked with asterisks). So the result would be:
                     number
colour date
yellow 02/01/2015         1
       *04/01/2015*       2
green  *03/01/2015*       3
red    *01/01/2015*       4
I can think of three solutions, but I can't work them out:
a) get pivot_table to keep the original order
b) do a sort on the pivot_table using a func along the lines of latest_date_in_rows
c) create an extra column containing the latest date against each colour
I'm not sure which is the right route to take in the world of pandas, but at the moment I'm stuck :(
You can remember the old MultiIndex before pivoting and then reindex the output DataFrame by the old MultiIndex.
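A minimal sketch of this idea, assuming the question's data and assuming the original row order is already the order wanted. The names old_index, pivoted and result are illustrative and not from the original post.

```python
import pandas as pd

data = [['yellow', 1, '02/01/2015'],
        ['yellow', 2, '04/01/2015'],
        ['green', 3, '03/01/2015'],
        ['red', 4, '01/01/2015']]
df = pd.DataFrame(data, columns=['colour', 'number', 'date'])

# Remember the (colour, date) pairs in their original row order.
old_index = df.set_index(['colour', 'date']).index

# pivot_table sorts the resulting MultiIndex, so the original order is lost here.
pivoted = df.pivot_table(index=['colour', 'date'], values='number')

# Reindexing by the remembered MultiIndex restores the original order.
result = pivoted.reindex(old_index)
print(result)
```

This relies on every (colour, date) pair being unique; with duplicates, pivot_table aggregates them and the remembered index would need de-duplicating first.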
EDIT:
I think the problem is that the original DataFrame isn't sorted: its (colour, date) pairs are in the order the rows were entered, whereas the output DataFrame has a MultiIndex sorted by colour. You can sort by the date level, but that still does not give the desired order. So the solution is to reindex by the original MultiIndex.
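If the original row order cannot be relied on, option c) from the question is another route. The following is only a sketch under assumptions: the dates are taken to be day/month/year, and the helper columns parsed and latest, as well as the names flat and sorted_pivot, are made up for illustration.

```python
import pandas as pd

data = [['yellow', 1, '02/01/2015'],
        ['yellow', 2, '04/01/2015'],
        ['green', 3, '03/01/2015'],
        ['red', 4, '01/01/2015']]
df = pd.DataFrame(data, columns=['colour', 'number', 'date'])
pivoted = df.pivot_table(index=['colour', 'date'], values='number')

# Flatten the pivot so the index levels can be used as ordinary columns.
flat = pivoted.reset_index()

# Assumption: dates are day/month/year; parse them so they compare as dates.
flat['parsed'] = pd.to_datetime(flat['date'], format='%d/%m/%Y')

# Newest date within each colour, repeated on every row of that colour.
flat['latest'] = flat.groupby('colour')['parsed'].transform('max')

# Colours with the newest date first; dates ascending within a colour.
flat = flat.sort_values(['latest', 'colour', 'parsed'],
                        ascending=[False, True, True])

sorted_pivot = flat.set_index(['colour', 'date'])[['number']]
print(sorted_pivot)
```

Including colour as the second sort key keeps rows of the same colour together when two colours share the same newest date.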