I have a Python array listing all occurrences of string labels. Let's call it labels_array.
Using seaborn as sns, I'd like to show a countplot of this array:
sns.countplot(labels_array)
This works, but as there are too many different labels in my array, the output doesn't look good.
Is there a way to display only the n most frequent labels?
Although countplot should in principle know the counts and hence allow showing only part of them, this is not the case. Therefore, using countplot may not make much sense here.
Instead, just use a normal pandas plot. E.g. to show the 5 most frequent items in the list:
pandas.Series(labels_array).value_counts()[:5].plot(kind="bar")
Complete example:
import string
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
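# Draw 400 random lowercase letters with unequal probabilities
l = list(string.ascii_lowercase)
n = np.random.rand(len(l))
a = np.random.choice(l, p=n/n.sum(), size=400)
# Count occurrences and plot only the 5 most frequent letters
s = pd.Series(a)
s.value_counts()[:5].plot(kind="bar")
plt.show()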
I came across the same problem (and this question) and found that it has already been answered elsewhere.
The countplot function has the parameter order, where you can specify the values for which you want to plot the counts.
The most frequent values can be obtained, as previously stated, with the value_counts function.
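A minimal sketch of how the two pieces might fit together, assuming a reasonably recent seaborn where the data is passed as x=. The sample data, n and top_labels are made up for illustration, and the Agg backend is only set so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view the plot on screen
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample data standing in for labels_array
labels_array = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + ["d"]

n = 2  # number of most frequent labels to show (illustrative value)

# value_counts() sorts by descending count, so the first n index entries
# are the n most frequent labels
top_labels = pd.Series(labels_array).value_counts().index[:n].tolist()

# order restricts the plot to the given labels and fixes their order
ax = sns.countplot(x=labels_array, order=top_labels)
plt.show()
```

Compared with the plain pandas bar plot, this keeps the seaborn styling while still limiting the plot to the top n labels.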
See:
limit the number of groups shown in seaborn countplot?