Pretty-print an entire Pandas Series / DataFrame

Posted 2019-01-01 09:50

Question:

I work with Series and DataFrames on the terminal a lot. The default __repr__ for a Series returns a reduced sample, with some head and tail values, but the rest missing.

Is there a built-in way to pretty-print the entire Series / DataFrame? Ideally, it would support proper alignment, perhaps borders between columns, and maybe even color-coding for the different columns.

Answer 1:

You can also use the option_context, with one or more options:

with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    print(df)

When the with block exits, the options are automatically restored to the values they had before.

If you are working in a Jupyter notebook, using display instead of print will use Jupyter's rich display logic.
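To check that option_context really is temporary, here is a minimal sketch. The DataFrame and the variable names (before, inside, after, full_text) are made up for illustration; only pd.option_context and pd.get_option come from pandas.

```python
import pandas as pd

# Illustrative data: 100 rows, more than pandas shows by default.
df = pd.DataFrame({"n": range(100)})

# Remember the current setting so we can compare afterwards.
before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", None, "display.max_columns", None):
    inside = pd.get_option("display.max_rows")  # None means "no limit"
    full_text = repr(df)                        # rendered without row truncation
    print(full_text)

# Outside the block the previous value is back in place.
after = pd.get_option("display.max_rows")
print(before == after)
```

Because the change is scoped to the with block, other code in the same session keeps its usual truncated output.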



Answer 2:

No need to hack settings. There is a simple way:

print(df.to_string())
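As a rough illustration of the difference, the sketch below compares the default repr with to_string() on a made-up 200-row DataFrame (the column names and variable names are assumptions for the example). The same call also works on a Series.

```python
import pandas as pd

# Illustrative data: 200 rows, well past the default row limit.
df = pd.DataFrame({"n": range(200), "square": [i * i for i in range(200)]})

short_text = repr(df)       # what print(df) shows; truncated under default settings
full_text = df.to_string()  # every row, regardless of the display options

# Compare how many lines each rendering produces.
print(len(short_text.splitlines()), len(full_text.splitlines()))
```

Since to_string() returns a plain str, it can also be written to a file or passed to a pager instead of printed.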


Answer 3:

Sure, if this comes up a lot, make a function like this one. You can even configure it to load every time you start IPython: https://ipython.org/ipython-doc/1/config/overview.html

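def print_full(x):
    # Raise the row limit so every row of x is shown.
    pd.set_option('display.max_rows', len(x))
    try:
        print(x)
    finally:
        # Put the option back to its default even if printing fails.
        pd.reset_option('display.max_rows')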

As for coloring, getting too elaborate with colors sounds counterproductive to me, but I agree something like Bootstrap's .table-striped would be nice. You could always create an issue to suggest this feature.



Answer 4:

After importing pandas, as an alternative to using the context manager, set such options for displaying entire dataframes:

pd.set_option('display.max_columns', None)  # or 1000
pd.set_option('display.max_rows', None)  # or 1000
pd.set_option('display.max_colwidth', None)  # or 199; older pandas used -1 here, which has since been deprecated

For the full list of useful options, see:

pd.describe_option('display')
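Unlike the context manager, set_option changes stay in effect for the rest of the session. A minimal sketch of inspecting and undoing one of them follows; the variable names changed and restored are illustrative only.

```python
import pandas as pd

# Lift the row limit for the whole session.
pd.set_option("display.max_rows", None)
changed = pd.get_option("display.max_rows")   # None means "no limit"

# Print the documentation and current value for a single option.
pd.describe_option("display.max_rows")

# Return this one option to its built-in default.
pd.reset_option("display.max_rows")
restored = pd.get_option("display.max_rows")
print(changed, restored)
```

Resetting per option keeps any other display settings you changed untouched.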


Answer 5:

Use the tabulate package:

pip install tabulate

And consider the following example usage:

import pandas as pd
from io import StringIO
from tabulate import tabulate

c = """Chromosome Start End
chr1 3 6
chr1 5 7
chr1 8 9"""

df = pd.read_table(StringIO(c), sep=r"\s+", header=0)

print(tabulate(df, headers='keys', tablefmt='psql'))

+----+--------------+---------+-------+
|    | Chromosome   |   Start |   End |
|----+--------------+---------+-------|
|  0 | chr1         |       3 |     6 |
|  1 | chr1         |       5 |     7 |
|  2 | chr1         |       8 |     9 |
+----+--------------+---------+-------+


Answer 6:

Try this:

# Note: the old 'display.height' option no longer exists in current pandas, so it is not set here.
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)


Answer 7:

If you are using IPython Notebook (Jupyter), you can render the DataFrame as HTML:

from IPython.display import HTML, display
display(HTML(df.to_html()))


Answer 8:

You can achieve this with the method below: pass the total number of columns in the DataFrame as the value of 'display.max_columns'.

For example:

df = pd.DataFrame(..)
with pd.option_context('display.max_rows', None, 'display.max_columns', df.shape[1]):
    print(df)


Answer 9:

This answer is a variation of the prior answer by lucidyan. It makes the code more readable by avoiding the use of set_option.

After importing pandas, as an alternative to using the context manager, set such options for displaying large dataframes:

def set_pandas_options() -> None:
    pd.options.display.max_columns = 1000
    pd.options.display.max_rows = 1000
    pd.options.display.max_colwidth = 199
    pd.options.display.width = None
    # pd.options.display.precision = 2  # set as needed

set_pandas_options()

After this, you can use display(df) or just df in a notebook; otherwise use print(df).