Drawing points with median lines in seaborn

Posted 2019-06-03 16:05

I have the following plot in seaborn:

import pandas
import matplotlib.pyplot as plt
import seaborn as sns

df = pandas.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                       "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                       "rep": ["a", "b", "c", "a", "b", "c"]})
plt.figure()
ax = sns.stripplot(x="sample", y="value", edgecolor="none",
                   hue="sample", palette="Set1", data=df)

# how to plot median line?
plt.show()

It plots the points in grayscale instead of using Set1, and shows only X in the legend, not Y:


I also want to add a horizontal line at the median for X and Y. How can this be done? factorplot doesn't appear to have a horizontal-line option.

2 answers
Lonely孤独者°
#2 · 2019-06-03 16:22

You can plot the lines with matplotlib and let pandas calculate the median values for your dataset. I use seaborn 0.7.0 in this example:

from pandas import DataFrame
import matplotlib.pyplot as plt
import seaborn as sns

df = DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                "rep": ["a", "b", "c", "a", "b", "c"]})
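# compute the median of the numeric "value" column for each sample
# (selecting the column first avoids taking the median of string columns)
xmed = df.loc[df["sample"] == "X", "value"].median()
ymed = df.loc[df["sample"] == "Y", "value"].median()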

sns.stripplot(x="sample", y="value", edgecolor="none",
              hue="sample", palette="Set1", data=df)

# current x-axis limits, used as the endpoints of each median line
x = plt.gca().get_xlim()

# draw a horizontal line at each median across the full width of the plot
plt.plot(x, [xmed, xmed], color=sns.xkcd_rgb["pale red"])
plt.plot(x, [ymed, ymed], color=sns.xkcd_rgb["denim blue"])
plt.show()

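The two per-sample filters in the code above can also be collapsed into a single `groupby`, which would scale to any number of samples. This is only a minimal sketch using the same data as the answer; the name `medians` is introduced here for illustration and does not appear in the original code.

```python
import pandas as pd

df = pd.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                   "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                   "rep": ["a", "b", "c", "a", "b", "c"]})

# one median per sample, computed only on the numeric "value" column
# ("medians" is an illustrative name, not part of the original answer)
medians = df.groupby("sample")["value"].median()

# look up a single sample's median by its label
print(medians["X"], medians["Y"])
```

Each entry of `medians` can then be passed to `plt.plot` in place of `xmed` and `ymed`.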

放我归山
#3 · 2019-06-03 16:26

We can limit the width of each median line to its respective column by looping through the Axes ticks and tick labels after generating the stripplot. This also lets the code work independently of the number of samples (columns) to be plotted.


    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    df = pd.DataFrame({"sample": ["X", "X", "X", "Y", "Y", "Y"],
                       "value": [0.2, 0.3, 0.4, 0.7, 0.75, 0.8],
                       "rep": ["a", "b", "c", "a", "b", "c"]})

    ax = sns.stripplot(x="sample", y="value", data=df, palette="Set1", s=8)

    # width of each median line as a fraction of the "X" or "Y" stripplot column, in this case 40%
    median_width = 0.4

    for tick, text in zip(ax.get_xticks(), ax.get_xticklabels()):
        sample_name = text.get_text()  # "X" or "Y"

        # calculate the median value for all replicates of either X or Y
        median_val = df[df['sample']==sample_name].value.median()

        # plot horizontal lines across the column, centered on the tick
        ax.plot([tick-median_width/2, tick+median_width/2], [median_val, median_val],
                lw=4, color='k')

    plt.show()

[Figure: seaborn stripplot with median lines drawn]
