Python - Using pandas to format excel cell

2020-02-26 02:13发布

问题:

I have a pandas dataframe, which is something like shown below.

I would like to format the column "Pass/Fail" as if Fail --> red background, else green background, like:

I have tried to use Pandas to do the formatting, but it fails to add color to the excel. Following is the code:

writer = pandas.ExcelWriter(destination,engine = 'xlsxwriter')
color = Answer.style.applymap(lambda x: 'color: red' if x == "Fail" else 'color: green',subset= pandas.IndexSlice[:,['Pass/Fail']])
color.to_excel(writer,'sheet1')

I tried StyleFrame which failed to install. Seems that StyleFrame does not comply with my python version 3.6.

How can I format the excel as I want?

回答1:

You can use conditional_format:

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':[1,2,3]})
print (df)
  Pass/Fail  expect
0      Pass       1
1      Fail       2
2      Fail       3

writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook  = writer.book
worksheet = writer.sheets['Sheet1']
red_format = workbook.add_format({'bg_color':'red'})
green_format = workbook.add_format({'bg_color':'green'})

worksheet.conditional_format('B2:B4', {'type': 'text',
                                      'criteria': 'containing',
                                       'value':     'Fail',
                                       'format': red_format})

worksheet.conditional_format('B2:B4', {'type': 'text',
                                      'criteria': 'containing',
                                       'value':   'Pass',
                                       'format':  green_format})
writer.save()

More dynamic solution with get_loc for position of column and mapping with dictionary:

import string

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':[1,2,3]})
print (df)
  Pass/Fail  expect
0      Pass       1
1      Fail       2
2      Fail       3

writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook  = writer.book
worksheet = writer.sheets['Sheet1']
red_format = workbook.add_format({'bg_color':'red'})
green_format = workbook.add_format({'bg_color':'green'})

#dict for map excel header, first A is index, so omit it
d = dict(zip(range(25), list(string.ascii_uppercase)[1:]))
print (d)
{0: 'B', 1: 'C', 2: 'D', 3: 'E', 4: 'F', 5: 'G', 6: 'H', 7: 'I', 8: 'J',
 9: 'K', 10: 'L', 11: 'M', 12: 'N', 13: 'O', 14: 'P', 15: 'Q', 16: 'R', 
 17: 'S', 18: 'T', 19: 'U', 20: 'V', 21: 'W', 22: 'X', 23: 'Y', 24: 'Z'}

#set column for formatting
col = 'Pass/Fail'
excel_header = str(d[df.columns.get_loc(col)])
#get length of df
len_df = str(len(df.index) + 1)
rng = excel_header + '2:' + excel_header + len_df
print (rng)
B2:B4

worksheet.conditional_format(rng, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':     'Fail',
                                       'format': red_format})

worksheet.conditional_format(rng, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':   'Pass',
                                       'format':  green_format})
writer.save()

EDIT1:

Thank you jmcnamara for comment and for XlsxWriter

col = 'Pass/Fail'
loc = df.columns.get_loc(col) + 1
len_df = len(df.index) + 1

worksheet.conditional_format(1,loc,len_df,loc, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':     'Fail',
                                       'format': red_format})

worksheet.conditional_format(1,loc,len_df,loc, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':   'Pass',
                                       'format':  green_format})
writer.save()

EDIT:

Another solution with last version of pandas (0.20.1) and styles:

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':['d','f','g']})
print (df)
  Pass/Fail expect
0      Pass      d
1      Fail      f
2      Fail      g

def f(x):
    col = 'Pass/Fail'
    r = 'background-color: red'
    g = 'background-color: green'
    c = np.where(x[col] == 'Pass', g, r)
    y = pd.DataFrame('', index=x.index, columns=x.columns)
    y[col] = c
    return y

styled = df.style.apply(f, axis=None)
styled.to_excel('styled.xlsx', engine='openpyxl')


回答2:

Disclaimer: I wrote the following library

I'd like to suggest using StyleFrame:

import pandas as pd
from StyleFrame import StyleFrame, Styler

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':[1,2,3]})

sf = StyleFrame(df)

sf.apply_style_by_indexes(sf[sf['Pass/Fail'] == 'Pass'], cols_to_style='Pass/Fail',
                          styler_obj=Styler(bg_color='green'))
sf.apply_style_by_indexes(sf[sf['Pass/Fail'] == 'Fail'], cols_to_style='Pass/Fail',
                          styler_obj=Styler(bg_color='red'))

sf.to_excel('test.xlsx').save()

Since it bridges the gap between pandas and openpyxl, the styling is done on the dataframe level instead of the worksheet level (so for example you don't need to know the relevant cell range is B2:B4 or mess with indexes.

The code above outputs the following:

EDIT: Just saw you mentioned you've tried to install but got an error. Can you edit your question and include the error?



回答3:

If have one or more columns and more than two values to format, and want to apply multiple format rules at once then you can do the following:

def fmt(data, fmt_dict):
    return data.replace(fmt_dict)

styled = df.style.apply(fmt, fmt_dict=fmt_dict, subset=['Test_1', 'Test_2' ])
styled.to_excel('styled.xlsx', engine='openpyxl')

Above, fm_dict is a dictionary with the values mapped to the corresponding format:

fmt_dict = {
    'Pass': 'background-color: green',
    'Fail': 'background-color: red',
    'Pending': 'background-color: yellow; border-style: solid; border-color: blue'; color: red,
}

Notice that for the 'Pending' value, you can also specify multiple format rules (e.g. border, background color, foreground color)

(Requires: openpyxl and jinja2)


Here is a full running example:

import pandas as pd

df = pd.DataFrame({'Test_1':['Pass','Fail', 'Pending', 'Fail'],
                   'expect':['d','f','g', 'h'],
                   'Test_2':['Pass','Pending', 'Pass', 'Fail'],
                  })

fmt_dict = {
    'Pass': 'background-color: green',
    'Fail': 'background-color: red',
    'Pending': 'background-color: yellow; border-style: solid; border-color: blue; color:red',
}

def fmt(data, fmt_dict):
    return data.replace(fmt_dict)

styled = df.style.apply(fmt, fmt_dict=fmt_dict, subset=['Test_1', 'Test_2' ])
styled.to_excel('styled.xlsx', engine='openpyxl')