Using pandas cut I can define bins by providing the edges and pandas creates bins like (a, b]
.
My question is how can I sort the bins (from the lowest to the highest)?
import numpy as np
import pandas as pd
y = pd.Series(np.random.randn(100))
x1 = pd.Series(np.sign(np.random.randn(100)))
x2 = pd.cut(pd.Series(np.random.randn(100)), bins = [-3, -0.5, 0, 0.5, 3])
model = pd.concat([y, x1, x2], axis = 1, keys = ['Y', 'X1', 'X2'])
I have an intermediate result where the order of the bins is preserved
int_output = model.groupby(['X1', 'X2']).mean().unstack()
int_output.columns = int_output.columns.get_level_values(1)
X2 (-3, -0.5] (-0.5, 0] (0, 0.5] (0.5, 3]
X1
-1.0 0.101475 -0.344419 -0.482992 -0.015179
1.0 0.249961 0.484757 -0.066383 -0.249414
But then I do other operations that arbitrarily changes the order of the bins:
output = pd.concat(int_output.to_dict('series'), axis = 1)
(-0.5, 0] (-3, -0.5] (0, 0.5] (0.5, 3]
X1
-1.0 -0.344419 0.101475 -0.482992 -0.015179
1.0 0.484757 0.249961 -0.066383 -0.249414
Now I would like to plot the data in a bar chart, but I want the bins to be sorted from the lowest (-3, -0.5] to the highest (0.5, 3].
I think I can achieve this by manipulating the string, using a split on "," and then cleaning brackets, but I would like to know if there is a better way.
There is main problem losing
ordered
CategoricalIndex
.One possible solution is
extract
first number fromoutput.columns
, create helper Series and sort it. Lastreindex
original columns:An easy fix to the problem you've highlighted above is to simply reorder the columns:
I made a function to do so.