How to save a pandas DataFrame table as a png

Posted 2020-01-23 15:30

I constructed a pandas DataFrame of results. This data frame acts as a table. The columns are MultiIndexed and each row represents a name, i.e. index=['name1','name2',...] when creating the DataFrame. I would like to display this table and save it as a png (or any graphic format really). At the moment, the closest I can get is converting it to html, but I would like a png. It looks like similar questions have been asked, such as How to save the Pandas dataframe/series data as a figure?

However, the marked solution converts the dataframe into a line plot (not a table), and the other solution relies on PySide, which I would like to stay away from simply because I cannot pip install it on linux. I would like this code to be easily portable. I really was expecting table creation to png to be easy with python. All help is appreciated.

Tags: python pandas
7 answers
老娘就宠你
#2 · 2020-01-23 16:01

Pandas allows you to plot tables using matplotlib (details here). Usually this plots the table directly onto a plot (with axes and everything) which is not what you want. However, these can be removed first:

import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import table  # on old pandas versions: pandas.tools.plotting (see the deprecation note below)

ax = plt.subplot(111, frame_on=False) # no visible frame
ax.xaxis.set_visible(False)  # hide the x axis
ax.yaxis.set_visible(False)  # hide the y axis

table(ax, df)  # where df is your data frame

plt.savefig('mytable.png')

The output might not be the prettiest but you can find additional arguments for the table() function here. Also thanks to this post for info on how to remove axes in matplotlib.
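
For reference, here is a minimal end-to-end sketch of the approach above that should run without a display. The sample data, the " / " separator used to flatten the MultiIndex columns, the figure size and the dpi are my own assumptions for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import table

# Hypothetical example data: MultiIndex columns and name-like row labels.
columns = pd.MultiIndex.from_product([["run1", "run2"], ["mean", "std"]])
df = pd.DataFrame(
    [[1.0, 0.1, 1.2, 0.2], [2.0, 0.3, 2.1, 0.1]],
    index=["name1", "name2"],
    columns=columns,
)

# Flatten the two column levels into plain strings (an assumption made here
# so that each header cell shows readable text rather than a tuple).
df.columns = [" / ".join(levels) for levels in df.columns]

fig, ax = plt.subplots(figsize=(6, 1.5))
ax.axis("off")  # hide axes, ticks and frame in one call
table(ax, df, loc="center")

# bbox_inches="tight" trims most of the empty space around the table.
fig.savefig("mytable.png", dpi=200, bbox_inches="tight")
plt.close(fig)
```

If the text looks cramped, increasing figsize or dpi is usually the first thing to try.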


EDIT:

Here is an (admittedly quite hacky) way of simulating multi-indexes when plotting using the method above. If you have a multi-index data frame called df that looks like:

first  second
bar    one       1.991802
       two       0.403415
baz    one      -1.024986
       two      -0.522366
foo    one       0.350297
       two      -0.444106
qux    one      -0.472536
       two       0.999393
dtype: float64

First reset the indexes so they become normal columns:

df = df.reset_index() 
df
    first second       0
0   bar    one  1.991802
1   bar    two  0.403415
2   baz    one -1.024986
3   baz    two -0.522366
4   foo    one  0.350297
5   foo    two -0.444106
6   qux    one -0.472536
7   qux    two  0.999393

Remove all duplicates from the higher order multi-index columns by setting them to an empty string (in my example I only have duplicate indexes in "first"):

df.loc[df.duplicated('first'), 'first'] = ''  # .ix was removed in pandas 1.0; use .loc
df
  first second         0
0   bar    one  1.991802
1          two  0.403415
2   baz    one -1.024986
3          two -0.522366
4   foo    one  0.350297
5          two -0.444106
6   qux    one -0.472536
7          two  0.999393

Change the column names over your "indexes" to the empty string:

new_cols = list(df.columns)  # work on a copy so the existing Index is not mutated in place
new_cols[:2] = '',''  # since my index columns are the two left-most on the table
df.columns = new_cols

Now call the table function but set all the row labels in the table to the empty string (this makes sure the actual indexes of your plot are not displayed):

table(ax, df, rowLabels=['']*df.shape[0], loc='center')

et voila:

[image: the rendered multi-index table]

Your not-so-pretty but totally functional multi-indexed table.
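
Put together, the steps above might look like the sketch below. Note two substitutions on my part: the sample Series is made up, and the last step calls matplotlib's own Axes.table instead of the pandas wrapper, so the row labels are simply left out rather than blanked (I am not sure every pandas version forwards rowLabels the same way).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical multi-indexed Series, shaped like the example above.
index = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
s = pd.Series([1.99, 0.40, -1.02, -0.52], index=index)

flat = s.reset_index()  # index levels become the columns 'first' and 'second'
flat.loc[flat.duplicated("first"), "first"] = ""  # blank repeated outer labels

# Cell contents as strings; blank headers over the two former index columns.
cell_text = flat.astype(str).values.tolist()
col_labels = ["", ""] + [str(c) for c in flat.columns[2:]]

fig, ax = plt.subplots(figsize=(4, 2))
ax.axis("off")
# Axes.table draws no row labels unless rowLabels is passed.
ax.table(cellText=cell_text, colLabels=col_labels, loc="center")
fig.savefig("multiindex_table.png", dpi=200, bbox_inches="tight")
plt.close(fig)
```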

EDIT: DEPRECATION WARNINGS

As pointed out in the comments, the import statement for table:

from pandas.tools.plotting import table

is now deprecated in newer versions of pandas in favour of:

from pandas.plotting import table 
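
If the same script has to run on both old and new pandas installs, one option (a small sketch, not something from the comments) is to try the new location first and fall back to the old one:

```python
try:
    from pandas.plotting import table  # current location
except ImportError:
    # Fallback for very old pandas releases that predate pandas.plotting.
    from pandas.tools.plotting import table
```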
Deceive 欺骗
#3 · 2020-01-23 16:13

Although I am not sure if this is the result you expect, you can save your DataFrame as a png by plotting it as a Seaborn heatmap with annotations on, like this:

http://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.heatmap.html#seaborn.heatmap

Example of Seaborn heatmap with annotations on

It works right away with a Pandas Dataframe. You can look at this example: Efficiently ploting a table in csv format using Python

You might want to change the colormap so it displays a white background only.

Hope this helps.
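
A rough sketch of what I mean, assuming seaborn is installed and the frame holds only numbers. The sample data, the single-colour white colormap and the number format are my own choices for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical numeric data; heatmap annotations need numeric cells.
df = pd.DataFrame(
    np.arange(6, dtype=float).reshape(2, 3),
    index=["name1", "name2"],
    columns=list("ABC"),
)

fig, ax = plt.subplots(figsize=(4, 1.5))
sns.heatmap(
    df,
    annot=True,                       # write each value into its cell
    fmt=".1f",
    cmap=ListedColormap(["white"]),   # one colour, so every cell is white
    cbar=False,                       # no colour bar next to the "table"
    linewidths=0.5,
    linecolor="black",                # cell borders
    ax=ax,
)
ax.xaxis.tick_top()  # put the column labels above the cells, like a header row
fig.savefig("heatmap_table.png", dpi=200, bbox_inches="tight")
plt.close(fig)
```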

家丑人穷心不美
#4 · 2020-01-23 16:15

If you're okay with the formatting as it appears when you call the DataFrame in your coding environment, then the absolute easiest way is to just use print screen and crop the image using basic image editing software.

Here's how it turned out for me using Jupyter Notebook, and Pinta Image Editor (Ubuntu freeware).

Explosion°爆炸
#5 · 2020-01-23 16:15

The following would need extensive customisation to format the table correctly, but the bones of it work:

import numpy as np
from PIL import Image, ImageDraw, ImageFont
import pandas as pd

df = pd.DataFrame({ 'A' : 1.,
                     'B' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'C' : np.array([3] * 4,dtype='int32'),
                     'D' : pd.Categorical(["test","train","test","train"]),
                     'E' : 'foo' })


class DrawTable():
    def __init__(self,_df):
        self.rows,self.cols = _df.shape
        img_size = (300,200)
        self.border = 50
        self.bg_col = (255,255,255)
        self.div_w = 1
        self.div_col = (128,128,128)
        self.head_w = 2
        self.head_col = (0,0,0)
        self.image = Image.new("RGBA", img_size,self.bg_col)
        self.draw = ImageDraw.Draw(self.image)
        self.draw_grid()
        self.populate(_df)
        self.image.show()
    def draw_grid(self):
        width,height = self.image.size
        row_step = (height-self.border*2)/(self.rows)
        col_step = (width-self.border*2)/(self.cols)
        # horizontal divider under each row
        for row in range(1,self.rows+1):
            self.draw.line((self.border-row_step//2,self.border+row_step*row,width-self.border,self.border+row_step*row),fill=self.div_col,width=self.div_w)
        # vertical divider after each column (moved out of the row loop so each line is drawn once)
        for col in range(1,self.cols+1):
            self.draw.line((self.border+col_step*col,self.border-col_step//2,self.border+col_step*col,height-self.border),fill=self.div_col,width=self.div_w)
        self.draw.line((self.border-row_step//2,self.border,width-self.border,self.border),fill=self.head_col,width=self.head_w)
        self.draw.line((self.border,self.border-col_step//2,self.border,height-self.border),fill=self.head_col,width=self.head_w)
        self.row_step = row_step
        self.col_step = col_step
    def populate(self,_df2):
        font = ImageFont.load_default()
        for row in range(self.rows):
            # row label (index value) to the left of the first column
            self.draw.text((self.border-self.row_step//2,self.border+self.row_step*row),str(_df2.index[row]),font=font,fill=(0,0,128))
            for col in range(self.cols):
                text = str(_df2.iloc[row,col])
                # draw.textlength (Pillow 8.0+) replaces font.getsize, which was removed in Pillow 10
                text_w = self.draw.textlength(text,font=font)
                x_pos = self.border+self.col_step*(col+1)-text_w  # right-align within the cell
                y_pos = self.border+self.row_step*row
                self.draw.text((x_pos,y_pos),text,font=font,fill=(0,0,128))
        # column headers, drawn above the heavy header line
        for col in range(self.cols):
            text = str(_df2.columns[col])
            text_w = self.draw.textlength(text,font=font)
            x_pos = self.border+self.col_step*(col+1)-text_w
            y_pos = self.border - self.row_step//2
            self.draw.text((x_pos,y_pos),text,font=font,fill=(0,0,128))
    def save(self,filename):
        try:
            self.image.save(filename,mode='RGBA')
            print(filename," Saved.")
        except (OSError, ValueError):  # write failure, or a file extension Pillow cannot map to a format
            print("Error saving:",filename)




table1 = DrawTable(df)
table1.save('C:/Users/user/Pictures/table1.png')

The output looks like this:

[image: the table drawn with PIL]

我只想做你的唯一
#6 · 2020-01-23 16:21

The best solution to your problem is probably:

import subprocess

df.to_html('table.html')
# pass the arguments as a list, so no shell is needed
subprocess.call(
    ['wkhtmltoimage', '-f', 'png', '--width', '0', 'table.html', 'table.png'])

but you would need to get wkhtmltoimage/wkhtmltopdf yourself. There is also a Python package, pdfkit, to get you through this, but I do not see much advantage over running the command yourself.

I wish seaborn were more customizable (or maybe just easier to customize: I could not figure out a proper way to embellish the heatmap over the past 30 min).

In my case, the results were pretty neat, e.g.:

[image: the table rendered by wkhtmltoimage]

and you could customize even further with CSS if you'd like to.
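
As a sketch of the CSS route: the style rules, the sample data and the file names below are made up for illustration, and the conversion step only runs if a wkhtmltoimage binary is actually found on the PATH.

```python
import shutil
import subprocess

import pandas as pd

# Hypothetical example data with MultiIndex columns, as in the question.
columns = pd.MultiIndex.from_product([["run1", "run2"], ["mean", "std"]])
df = pd.DataFrame(
    [[1.0, 0.1, 1.2, 0.2], [2.0, 0.3, 2.1, 0.1]],
    index=["name1", "name2"],
    columns=columns,
)

# Illustrative styling only; adjust to taste.
css = """<style>
table { border-collapse: collapse; font-family: sans-serif; }
th, td { border: 1px solid #999; padding: 4px 10px; text-align: right; }
th { background: #eee; }
</style>"""

# to_html() without a path returns the table markup as a string.
html = (
    "<html><head><meta charset='utf-8'>" + css + "</head><body>"
    + df.to_html()
    + "</body></html>"
)
with open("table.html", "w", encoding="utf-8") as fh:
    fh.write(html)

exe = shutil.which("wkhtmltoimage")  # None if the tool is not installed
if exe is not None:
    # Same flags as the command above, passed as a list (no shell).
    subprocess.run(
        [exe, "-f", "png", "--width", "0", "table.html", "table.png"],
        check=False,
    )
else:
    print("wkhtmltoimage not found on PATH; wrote table.html only")
```

The nice part is that to_html keeps the MultiIndex header rows, so the two column levels show up as a proper two-row header.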

老娘就宠你
#7 · 2020-01-23 16:25

The solution of @bunji works for me, but the default options don't always give a good result. I added some useful parameters to tweak the appearance of the table.

import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import table  # pandas.tools.plotting on old pandas versions
import numpy as np

dates = pd.date_range('20130101',periods=6)
df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))

df.index = [item.strftime('%Y-%m-%d') for item in df.index] # Format date

fig, ax = plt.subplots(figsize=(12, 2)) # set size frame
ax.xaxis.set_visible(False)  # hide the x axis
ax.yaxis.set_visible(False)  # hide the y axis
ax.set_frame_on(False)  # no visible frame; comment this out to see the frame while tuning figsize
tabla = table(ax, df, loc='upper right', colWidths=[0.17]*len(df.columns))  # where df is your data frame
tabla.auto_set_font_size(False) # turn off automatic sizing so the font size can be set manually
tabla.set_fontsize(12) # if you increase the font size, increase colWidths as well
tabla.scale(1.2, 1.2) # scale the table's width and height
plt.savefig('table.png', transparent=True)

The result: Table
