Labelling a matplotlib histogram bin with an arrow

Posted 2019-04-06 22:23

Question:

I have a histogram plot which can be reproduced with the MWE below:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

pd.Series(np.random.normal(0, 100, 1000)).plot(kind='hist', bins=50)

Which creates a plot like this:

How would I then label the bin that contains a given integer with an arrow?

For example see below, where an arrow labels the bin containing the integer 300.

EDIT: I should add that, ideally, the y coordinates of the arrow should be set automatically from the height of the bar it is labelling, if possible!

Answer 1:

You can use annotate to add an arrow:

import pandas as pd
import matplotlib.pyplot as plt
#import seaborn as sns
import numpy as np

fig, ax = plt.subplots()
series = pd.Series(np.random.normal(0, 100, 1000))
series.plot(kind='hist', bins=50, ax=ax)
ax.annotate("",
            xy=(300, 5), xycoords='data',
            xytext=(300, 20), textcoords='data',
            arrowprops=dict(arrowstyle="->",
                            connectionstyle="arc3"),
            )

In this example, I added an arrow that goes from coordinates (300, 20) to (300, 5).
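
Because both ends of that arrow are given in data units, its on-screen length changes with the y-axis scale. One possible variation, sketched here, places the tail with textcoords='offset points' so the arrow keeps a fixed length. The helper name add_down_arrow and the 30-point default are illustrative assumptions, not part of the original answer.

```python
def add_down_arrow(ax, x, y, length_pts=30):
    """Draw a downward arrow whose tip sits at data point (x, y).

    `add_down_arrow` and `length_pts` are hypothetical names for this sketch.
    The tail is placed `length_pts` points above the tip, so the arrow
    length does not depend on the y-axis scale.
    """
    return ax.annotate(
        "",
        xy=(x, y), xycoords="data",      # arrow tip, in data coordinates
        xytext=(0, length_pts),          # tail position, as an offset from the tip
        textcoords="offset points",      # the offset is measured in points
        arrowprops=dict(arrowstyle="->", connectionstyle="arc3"),
    )
```

With the ax from the code above, add_down_arrow(ax, 300, 5) would draw an arrow ending at the same point as the annotate call shown there.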

To set the arrow's height automatically from the bar it labels, plot the histogram with matplotlib's ax.hist, which returns the bin counts and the bin edges, and then use np.where to find the bin that contains the desired position.

import pandas as pd
import matplotlib.pyplot as plt
#import seaborn as sns
import numpy as np

nbins = 50
labeled_bin = 200

fig, ax = plt.subplots()

series = pd.Series(np.random.normal(0, 100, 1000))

## plot the histogram and return the bin position and values
ybins, xbins, _ = ax.hist(series, bins=nbins)

## find the bin that contains the position you want to label
ind_bin = np.where(xbins >= labeled_bin)[0]
if len(ind_bin) > 0 and ind_bin[0] > 0:
    ## get position and value of the bin
    x_bin = xbins[ind_bin[0]-1]/2. + xbins[ind_bin[0]]/2.
    y_bin = ybins[ind_bin[0]-1]
    ## add the arrow
    ax.annotate("",
                xy=(x_bin, y_bin + 5), xycoords='data',
                xytext=(x_bin, y_bin + 20), textcoords='data',
                arrowprops=dict(arrowstyle="->",
                                connectionstyle="arc3"),
                )
else:
    print("Labeled bin is outside range")
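
The bin lookup can also be wrapped in a small function. The following is a minimal sketch, not the answer's own method: locate_bin is a hypothetical helper, and it uses bisect.bisect_right from the standard library (np.searchsorted(edges, x, side="right") gives the same index on a NumPy array). Unlike the np.where test above, a value lying exactly on an edge is assigned to the bin on its right, which follows the half-open bin convention of np.histogram.

```python
from bisect import bisect_right

def locate_bin(edges, counts, x):
    """Return (center, height) of the histogram bin containing x, or None.

    `locate_bin` is a hypothetical helper for this sketch. `edges` and
    `counts` are the sorted bin edges and the bar heights, e.g. the
    xbins and ybins returned by ax.hist (len(edges) == len(counts) + 1).
    """
    if x < edges[0] or x > edges[-1]:
        return None  # x lies outside the histogram range
    # Index of the last edge <= x; clamp so that x == edges[-1]
    # falls into the last bin, which is closed on the right.
    i = min(bisect_right(edges, x) - 1, len(counts) - 1)
    center = 0.5 * (edges[i] + edges[i + 1])
    return center, counts[i]
```

With the names used above, locate_bin(xbins, ybins, labeled_bin) would return the x_bin and y_bin to pass to ax.annotate, or None when the position is outside the plotted range.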


Answer 2:

@Julien Spronck showed the best way, I think. Alternatively, you can also use arrow; the example code can be found below. The y-coordinate is determined automatically by counting how many elements lie within a tolerance window around the target value (±5% in the code below, which you can change yourself). Note that this count approximates the bar height but does not have to equal it, because the window is not the same as the bin. You can play with the parameters (length of the arrow head, length of the arrow). Here is the code:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

mySer = pd.Series(np.random.normal(0, 100, 1000))
mySer.plot(kind='hist', bins=50)

# that is where you want to add the arrow
ind = 200
# determine how many elements you have in the bin (with a certain tolerance)
n = len(mySer[(mySer > ind*0.95) & (mySer <  ind*1.05)])

# define length of the arrow
lenArrow = 10
lenHead = 2
wiArrow = 5
plt.arrow(ind, n+lenArrow+lenHead, 0, -lenArrow, head_width=wiArrow+3, head_length=lenHead, width=wiArrow, fc='k', ec='k')

plt.show()

This gives you the following output (for 200 instead of 300 as in your example):
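
As a possible refinement of the tolerance count, the bar height can be read back from the plot itself. This is a sketch under assumptions: bar_height_at is a hypothetical helper, and it assumes that every entry in ax.patches is one of the histogram's rectangular bars, which should be the case when the axes contain only a plain Series.plot(kind='hist').

```python
def bar_height_at(ax, x):
    """Return the height of the bar in `ax` that spans x, or None.

    Hypothetical helper for this sketch; assumes every patch in
    ax.patches is a histogram bar exposing get_x, get_width, get_height.
    """
    for bar in ax.patches:
        left = bar.get_x()
        # First bar whose horizontal extent contains x
        if left <= x <= left + bar.get_width():
            return bar.get_height()
    return None  # no bar covers x
```

Series.plot returns the axes it drew on, so with ax = mySer.plot(kind='hist', bins=50) the value n could be replaced by bar_height_at(ax, ind), after checking that the result is not None.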