The histogram's colors and labels are inconsistent

I'm trying to analyze the wine-quality data, which comes as two datasets: red wine and white wine. I combine them into wine_df and plot the counts. I want the bars for red wine to be red and the bars for white wine to be white, but for some bars the label and the color don't match. For example, the fourth bar is labeled (4, white), yet it is drawn in red. What should I do? Thanks for your answer!

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

red_wine = pd.read_csv('https://raw.githubusercontent.com/nishanthgandhidoss/Wine-Quality/master/data/winequality-red.csv',
                      sep = ';')
white_wine = pd.read_csv('https://raw.githubusercontent.com/nishanthgandhidoss/Wine-Quality/master/data/winequality-white.csv', 
                        sep = ';')

## Add a column to each data to identify the wine color 
red_wine['color'] = 'red'
white_wine['color'] = 'white'

## Combine the two dataframes    
wine_df = pd.concat([red_wine, white_wine])

colors = ['red','white']
plt.style.use('ggplot')
counts = wine_df.groupby(['quality', 'color']).count()['pH']
counts.plot(kind='bar', title='Counts by Wine Color and quality', color=colors, alpha=.7)
plt.xlabel('Quality and Color', fontsize=18)
plt.ylabel('Count', fontsize=18)
plt.show()

Tags: python pandas matplotlib seaborn

1 answer

爷、活的狠高调

#2 · 2019-07-24 22:01

The colors are a level of your index, so use that to specify colors. Change your line of code to:

counts.plot(kind='bar', title='Counts by Wine Color and quality', 
            color=counts.index.get_level_values(1), alpha=.7)

This works here only because the values in your index happen to be names that matplotlib can interpret as colors. In general, you would map the unique values to valid colors, for instance:

color = counts.index.get_level_values(1).map({'red': 'green', 'white': 'black'})
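The same idea can be sketched end to end on a tiny stand-in for wine_df. The rows in `toy` and the `palette` dictionary below are made up for illustration and are not part of the original data or answer; the point is only that one color is looked up per bar from the `color` level of the index.

```python
import pandas as pd

# Toy stand-in for wine_df (made-up rows, not the real dataset).
toy = pd.DataFrame({
    "quality": [3, 3, 4, 4, 4, 9],
    "color": ["red", "white", "red", "white", "white", "white"],
    "pH": [3.1, 3.2, 3.3, 3.0, 3.4, 3.5],
})

# Same grouping as in the question: one count per (quality, color) pair.
counts = toy.groupby(["quality", "color"])["pH"].count()

# Hypothetical palette: one display color per wine type.
palette = {"red": "firebrick", "white": "gold"}

# One color per bar, looked up from the 'color' level of the index,
# so a missing (quality, color) pair cannot shift the colors.
bar_colors = [palette[c] for c in counts.index.get_level_values("color")]
```

Passing `color=bar_colors` to `counts.plot(kind='bar', ...)` should then color each bar according to its own label.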

pandas is doing something with the plotting order, but you can always fall back to matplotlib to cycle the colors more reliably. The trick here is to convert color to a categorical variable, so that every (quality, color) pair is represented after the groupby, even when it has no rows. The bars then strictly alternate between red and white, which lets you specify only the list ['red', 'white']:

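import numpy as np
import matplotlib.pyplot as plt

# A categorical 'color' makes groupby emit every (quality, color) pair.
wine_df['color'] = wine_df.color.astype('category')
# observed=False keeps the empty pairs; it is spelled out because newer pandas versions may not default to it.
counts = wine_df.groupby(['quality', 'color'], observed=False).count()['pH'].fillna(0)

ind = np.arange(len(counts))
# The two-element list is cycled, so bars alternate red, white, red, white, ...
plt.bar(ind, height=counts.values, color=['red', 'white'])
_ = plt.xticks(ind, counts.index.values, rotation=90)
plt.ylim(0,150)  # So we can see (9, white)
plt.show()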
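To see why the categorical conversion matters, here is a minimal check on made-up rows (the `toy` frame is hypothetical, not the wine data). It assumes a pandas version whose `groupby` accepts the `observed` keyword. Without the conversion, pairs with no rows are simply absent, so a cycled two-color list can drift out of step with the labels; with it, the missing pairs come back with a count of zero.

```python
import pandas as pd

# Made-up rows: (4, white) and (9, red) never occur.
toy = pd.DataFrame({
    "quality": [3, 3, 4, 9],
    "color": ["red", "white", "red", "white"],
    "pH": [3.1, 3.2, 3.3, 3.5],
})

# Plain string column: only the pairs that actually occur are returned.
plain = toy.groupby(["quality", "color"])["pH"].count()

# Categorical column: every (quality, color) pair is returned.
toy["color"] = toy["color"].astype("category")
full = (
    toy.groupby(["quality", "color"], observed=False)["pH"]
    .count()
    .fillna(0)
)

# The color level now strictly alternates red, white, red, white, ...
color_order = list(full.index.get_level_values("color"))
```

Because `color_order` alternates without gaps, a cycled `['red', 'white']` list lines up with the labels on every bar.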
