I am dealing with a dataset that describes relationships between pairs of points, such as bus stops. For example, we have bus stops A, B, C, and D.
I want to make a box plot that shows, for each bus stop, how long it takes to get to the other 3 bus stops.
Obviously, there is no travel time from A to A, so that column should be blank.
When I plot it, the first row shows B, C, D, the second row shows A, C, D, and so on. The columns are misaligned, and a given color does not represent the same stop in each row.
If I add sharex=True, it simply removes the x labels from all but the bottom axis. That's not what I want to see here.
I would instead like to see 4 columns in the order of A, B, C, D. When it's A to A, it should just be blank, and the colors should be consistent.
Does anyone know how to accomplish this?
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
time = np.random.randn(1000)
point1 = ['A', 'B', 'C', 'D'] * 250
point2 = ['A'] * 250 + ['B'] * 250 + ['C'] * 250 + ['D'] * 250
df_time = pd.DataFrame({
    'point1': point1,
    'point2': point2,
    'time': time,
})
# drop rows where the origin and destination are the same stop
df_time = df_time[df_time['point1'] != df_time['point2']]
fig, ax = plt.subplots(nrows=4, sharey=True)
fig.set_size_inches(12, 16)
for point1i, axi in zip(point1, ax.ravel()):
    sns.boxplot(data=df_time[df_time['point1'] == point1i], x='point2', y='time', ax=axi)
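As seen from the documentation, sns.boxplot has an argument order, which sets the order in which the categorical levels are plotted.
Passing the full list of stops to it in every call, e.g. order=['A', 'B', 'C', 'D'], would give you the desired plot.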
Complete code:
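What follows is a minimal sketch of what the complete code might look like, reusing the question's data setup. The stops list, the fixed random seed, and the subplot titles are my own additions for illustration. The only change that matters is passing order=stops to sns.boxplot, so that every subplot reserves a slot for all four stops and leaves the slot for the stop itself empty.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Full list of categories, in the order they should appear on every x axis.
# (The name `stops` is introduced here for illustration.)
stops = ['A', 'B', 'C', 'D']

# Same synthetic data as in the question; the seed is only for reproducibility.
rng = np.random.default_rng(0)
df_time = pd.DataFrame({
    'point1': stops * 250,
    'point2': ['A'] * 250 + ['B'] * 250 + ['C'] * 250 + ['D'] * 250,
    'time': rng.standard_normal(1000),
})
# Drop rows where the origin and destination are the same stop.
df_time = df_time[df_time['point1'] != df_time['point2']]

fig, axes = plt.subplots(nrows=4, sharey=True, figsize=(12, 16))
for stop, ax in zip(stops, axes.ravel()):
    sns.boxplot(
        data=df_time[df_time['point1'] == stop],
        x='point2',
        y='time',
        order=stops,  # same category order in every subplot
        ax=ax,
    )
    ax.set_title(f'From {stop}')
```

Because each call gets the same order, a given stop always lands in the same x position. In seaborn versions that color boxes per category, it should also get the same color in every row.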