I'm new to pandas (0.16.1) and want a custom sort order in a MultiIndex, so I use Categoricals. Part of my MultiIndex:
Part Defect Own
Кузов 504 ИП
Кузов 504 Итого
Кузов 504 ПС
Кузов 505 ПС
Кузов 506 ПС
Кузов 507 ПС
Кузов 530 ИП
Кузов 530 Итого
Кузов 530 ПС
I create a pivot table with MultiIndex levels [Defect, Own]. Then I make "Own" Categorical (see the P.S. at the end of the question) so that it sorts as [ИП, ПС, Итого]. But when I prepend a "Part" level, which is also Categorical and derived from the "Defect" level, and sort the index with
    pvt.sortlevel(0, inplace=True)
the "Own" level is sorted in alphabetical order: [ИП, Итого, ПС]. How can I custom-sort two levels in a MultiIndex?
P.S. I convert the "Own" level to Categorical with the following code: it creates a new column and replaces the index level with it. Is that OK?
    import pandas as pd

    def makeLevelCategorical(pdf, pname, cats):
        names = pdf.index.names
        namei = names.index(pname)
        pdf["tmp"] = pd.Categorical(pdf.index.get_level_values(pname), categories=cats)  # New temp column
        pdf.set_index("tmp", append=True, inplace=True)  # Append column to index
        pdf = pdf.reset_index(pname, drop=True)  # Remove /pname/ level
        names2 = list(names)
        names2[namei] = "tmp"
        pdf = pdf.reorder_levels(names2)  # Put "tmp" level in /pname/'s position (returns a new frame)
        pdf.index.names = names  # Rename "tmp" level to /pname/
        return pdf
Sorting a MultiIndex can be done using the DataFrame.sort_index function.
Here is a small example:
Outputs:
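A minimal sketch of such an example, assuming a recent pandas version. The frame, the "count" column and the index values are made up for illustration and only loosely follow the index in the question:

```python
import pandas as pd

# Hypothetical data: a three-level MultiIndex in deliberately shuffled order.
index = pd.MultiIndex.from_tuples(
    [("b", 505, "y"), ("a", 530, "x"), ("a", 504, "y"), ("a", 504, "x")],
    names=["Part", "Defect", "Own"],
)
df = pd.DataFrame({"count": [1, 2, 3, 4]}, index=index)

# sort_index sorts by all index levels, outermost level first.
sorted_df = df.sort_index()
print(sorted_df)
```

After the call, the rows are ordered by "Part", then "Defect", then "Own".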
If you want to change the sort order on a per-column basis, the DataFrame.sort_index function takes an argument ascending=, which can be given a list of [True, False] values corresponding to the columns in order. Categorical is a new shiny dtype in pandas and it should be used, but it is not needed for this operation per se.
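A small sketch of a mixed sort direction, assuming a recent pandas version in which sort_index accepts a level list together with a matching ascending list. The level names "outer" and "inner" and the data are hypothetical:

```python
import pandas as pd

# Hypothetical two-level index in shuffled order.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("b", "y"), ("a", "y"), ("b", "x")],
    names=["outer", "inner"],
)
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=index)

# One flag per level: "outer" ascending, "inner" descending.
result = df.sort_index(level=["outer", "inner"], ascending=[True, False])
print(result)
```

The same ascending list pattern should also work when sorting by ordinary columns with sort_values(by=[...], ascending=[...]).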
Edit due to comment:
Sort will always sort alphabetically or in reverse order. If you want a custom sort, you need to create a new column that can be sorted alphabetically (or numerically) and is derived from the column that determines the order. Do this using Series.map, as in this example, which sorts the data with vowels first:
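A sketch of that idea, assuming a recent pandas version (sort_values does not exist in 0.16.1). The column names "letter", "value" and "sortby" and the vowel rule are illustrative assumptions:

```python
import pandas as pd

df = pd.DataFrame({"letter": ["b", "a", "d", "e", "c"],
                   "value": [1, 2, 3, 4, 5]})

# Helper key: vowels map to 0, everything else to 1, so vowels sort first.
vowels = set("aeiou")
df["sortby"] = df["letter"].map(lambda s: 0 if s in vowels else 1)

# Sort by the helper key first, then alphabetically within each group.
result = df.sort_values(by=["sortby", "letter"])
print(result)
```

For the ordering in the question, a dict such as {"ИП": 0, "ПС": 1, "Итого": 2} could be passed to Series.map in the same way.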
If you do not want the sortby column after that, you can simply delete it, like this:
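A minimal sketch of removing the helper column after sorting, assuming a recent pandas version. The frame and the "sortby" column name are hypothetical:

```python
import pandas as pd

# Hypothetical frame that already carries a helper sort key.
df = pd.DataFrame({"letter": ["b", "a", "c"], "sortby": [1, 0, 1]})

# Sort by the helper key, then drop it from the result.
result = df.sort_values(by=["sortby", "letter"]).drop(columns="sortby")
print(result)
```

Writing del df["sortby"] on the sorted frame would have the same effect in place.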