How can I left justify text in a pandas DataFrame

Posted 2019-01-28 01:11

Question:

I am trying to format the output in an IPython notebook. I tried using the to_string function, and this neatly lets me eliminate the index column. But the textual data is right justified.

In [10]:

import pandas as pd
a = pd.DataFrame({'Text': ['abcdef', 'x'], 'Value': [12.34, 4.2]})
print(a.to_string(index=False))

   Text  Value
 abcdef  12.34
      x   4.20

The same is true when just printing the dataframe.

In [12]:

print(a)

     Text  Value
0  abcdef  12.34
1       x   4.20

The justify argument in the to_string function, surprisingly, only justifies the column heading.

In [13]:

import pandas as pd
a = pd.DataFrame({'Text': ['abcdef', 'x'], 'Value': [12.34, 4.2]})
print(a.to_string(justify='left', index=False))

Text     Value
 abcdef  12.34
      x   4.20

How can I control the justification settings for individual columns?

Answer 1:

If you're willing to use another library, tabulate will do this:

$ pip install tabulate

and then

import pandas as pd
from tabulate import tabulate

df = pd.DataFrame({'Text': ['abcdef', 'x'], 'Value': [12.34, 4.2]})
print(tabulate(df, showindex=False, headers=df.columns))

Text      Value
------  -------
abcdef    12.34
x          4.2

tabulate also supports various other output formats.



Answer 2:

You could use a['Text'].str.len().max() to compute the length of the longest string in a['Text'], and use that number, N, in a left-justified formatter '{:<Ns}'.format:

In [211]: print(a.to_string(formatters={'Text':'{{:<{}s}}'.format(a['Text'].str.len().max()).format}, index=False))
   Text  Value
 abcdef  12.34
 x        4.20
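
To see why the doubled braces are needed, it may help to run the two format calls separately. This is only a small sketch: it uses a hard-coded width of 6 (the length of 'abcdef') as a stand-in for a['Text'].str.len().max().

```python
# Step 1: build the format spec. '{{' and '}}' are escaped literal braces,
# while the inner '{}' is replaced by the width.
width = 6  # assumed: length of the longest string, 'abcdef'
spec = '{{:<{}s}}'.format(width)
print(spec)  # {:<6s}

# Step 2: the bound method spec.format serves as the per-cell formatter.
formatter = spec.format
print(repr(formatter('x')))       # 'x     '
print(repr(formatter('abcdef')))  # 'abcdef'
```

Because of the s type code, this formatter accepts only strings; passing a float raises a ValueError.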


Answer 3:

I converted @unutbu's approach to a function so I could left-justify my dataframes.

import functools
import pandas as pd

my_df = pd.DataFrame({'StringVals': ["Text string One", "Text string Two", "Text string Three"]})

def left_justified(df):
    formatters = {}
    for col in df.columns:
        # Width of the longest string in this column
        width = df[col].str.len().max()
        form = "{{:<{}s}}".format(width)
        # Bind the format string so the formatter takes a single value
        formatters[col] = functools.partial(str.format, form)
    return df.to_string(formatters=formatters, index=False)

So now this:

print(my_df.to_string())

          StringVals
0    Text string One
1    Text string Two
2  Text string Three

becomes this:

print(left_justified(my_df))

StringVals
Text string One  
Text string Two  
Text string Three

Note, however, any non-string values in your dataframe will give you errors:

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

You'll have to pass different format strings to .to_string() if you want it to work with non-string values:

my_df2 = pd.DataFrame({'Booleans'  : [False, True, True],
                       'Floats'    : [1.0, 0.4, 1.5],
                       'StringVals': ["Text string One", "Text string Two", "Text string Three"]})

FLOAT_COLUMNS = ('Floats',)
BOOLEAN_COLUMNS = ('Booleans',)

def left_justified2(df):
    formatters = {}

    # Pick a format pattern based on the type of data in each column
    for col in df.columns:
        if col in FLOAT_COLUMNS:
            form = "{!s:<5}"
        elif col in BOOLEAN_COLUMNS:
            form = "{!s:<8}"
        else:
            width = df[col].str.len().max()
            form = "{{:<{}s}}".format(width)
        formatters[col] = functools.partial(str.format, form)
    return df.to_string(formatters=formatters, index=False)

With floats and booleans:

print(left_justified2(my_df2))

Booleans Floats         StringVals
False     1.0    Text string One  
True      0.4    Text string Two  
True      1.5    Text string Three

Note that this approach is a bit of a hack. Not only do you have to maintain the column names in separate lists, but you also have to guess at the data widths. Perhaps someone with better Pandas-Fu can demonstrate how to generate the formats automatically from the dataframe info.
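
One possible way to automate this, offered as a sketch rather than a definitive answer: convert every value with str(), measure the widest result per column, and pad with str.ljust. The helper names left_formatters and left_justified_auto are made up for this example. Note that str() drops pandas' own float formatting, so 4.2 prints as 4.2 rather than 4.20.

```python
def left_formatters(table):
    """Build one left-justifying formatter per column.

    `table` can be a pandas DataFrame or a plain dict of lists: iterating
    over it must yield column names and `table[name]` must yield the values.
    """
    formatters = {}
    for name in table:
        # Widest cell after str() conversion; the header counts too,
        # so cells are never narrower than it.
        width = max([len(str(name))] + [len(str(value)) for value in table[name]])
        # Bind `width` as a default argument so each column keeps its own value.
        formatters[name] = lambda value, width=width: str(value).ljust(width)
    return formatters


def left_justified_auto(df):
    # justify='left' left-aligns the headers; the formatters pad the cells.
    return df.to_string(formatters=left_formatters(df), index=False, justify='left')


# Works on a plain dict too, which makes the padding easy to inspect.
fmt = left_formatters({'Text': ['abcdef', 'x'], 'Value': [12.34, 4.2]})
print(repr(fmt['Text']('x')))    # 'x     '
print(repr(fmt['Value'](4.2)))   # '4.2  '
```

With a dataframe such as my_df2, print(left_justified_auto(my_df2)) should need no per-type column lists, since every value goes through str() regardless of dtype.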