I am trying to format the output in an IPython notebook. I tried using the to_string function, which neatly lets me eliminate the index column, but the textual data is right-justified.
In [10]:
import pandas as pd
columns = ['Text', 'Value']
a = pd.DataFrame ({'Text': ['abcdef', 'x'], 'Value': [12.34, 4.2]})
print (a.to_string (index=False))
   Text  Value
 abcdef  12.34
      x   4.20
The same is true when just printing the dataframe.
In [12]:
print (a)
     Text  Value
0  abcdef  12.34
1       x   4.20
The justify argument in the to_string function, surprisingly, only justifies the column heading.
In [13]:
import pandas as pd
columns = ['Text', 'Value']
a = pd.DataFrame ({'Text': ['abcdef', 'x'], 'Value': [12.34, 4.2]})
print (a.to_string (justify='left', index=False))
Text Value
abcdef 12.34
x 4.20
How can I control the justification settings for individual columns?
If you're willing to use another library, tabulate will do this:
$ pip install tabulate
and then
from tabulate import tabulate
df = pd.DataFrame ({'Text': ['abcdef', 'x'], 'Value': [12.34, 4.2]})
print(tabulate(df, showindex=False, headers=df.columns))
Text      Value
------  -------
abcdef    12.34
x          4.2
It supports various other output formats as well.
You could use a['Text'].str.len().max() to compute the length of the longest string in a['Text'], and use that number, N, in a left-justified formatter '{:<Ns}'.format:
In [211]: print(a.to_string(formatters={'Text':'{{:<{}s}}'.format(a['Text'].str.len().max()).format}, index=False))
Text Value
abcdef 12.34
x 4.20
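The nested braces in that one-liner are easy to misread, so here is a minimal sketch of the same trick in plain Python, without pandas. The names texts, width, template and left_justify are only illustrative; in the answer above the width comes from a['Text'].str.len().max().

```python
# Stand-in for the 'Text' column; the width is the length of the longest string.
texts = ['abcdef', 'x']
width = max(len(t) for t in texts)

# The outer .format() fills in the width. Doubled braces are escapes,
# so they survive as single braces in the resulting template.
template = '{{:<{}s}}'.format(width)
print(template)  # {:<6s}

# The bound method template.format is the kind of callable that
# to_string(formatters=...) expects: it takes one value, returns a string.
left_justify = template.format
padded = [left_justify(t) for t in texts]
print(padded)  # ['abcdef', 'x     ']
```

Each value is padded on the right to the same width, so any justification applied afterwards has nothing left to shift.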
I converted @unutbu's approach to a function so I could left-justify my dataframes.
import functools
import pandas as pd

my_df = pd.DataFrame({'StringVals': ["Text string One", "Text string Two", "Text string Three"]})
def left_justified(df):
    # Build one left-justifying formatter per column,
    # padded to the longest string in that column
    formatters = {}
    for li in list(df.columns):
        width = df[li].str.len().max()
        form = "{{:<{}s}}".format(width)
        formatters[li] = functools.partial(str.format, form)
    return df.to_string(formatters=formatters, index=False)
So now this:
print(my_df.to_string())
          StringVals
0    Text string One
1    Text string Two
2  Text string Three
becomes this:
print(left_justified(my_df))
StringVals
Text string One
Text string Two
Text string Three
Note, however, that any non-string values in your dataframe will raise an error:
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas
You'll have to pass different format strings to .to_string() if you want it to work with non-string values:
import functools  # needed for functools.partial below

my_df2 = pd.DataFrame({'Booleans' : [False, True, True],
                       'Floats' : [1.0, 0.4, 1.5],
                       'StringVals': ["Text string One", "Text string Two", "Text string Three"]})
FLOAT_COLUMNS = ('Floats',)
BOOLEAN_COLUMNS = ('Booleans',)
def left_justified2(df):
    formatters = {}
    # Pass a custom pattern to format(), based on
    # type of data
    for li in list(df.columns):
        if li in FLOAT_COLUMNS:
            # !s converts the value with str() before padding
            form = "{!s:<5}"
        elif li in BOOLEAN_COLUMNS:
            form = "{!s:<8}"
        else:
            width = df[li].str.len().max()
            form = "{{:<{}s}}".format(width)
        formatters[li] = functools.partial(str.format, form)
    return df.to_string(formatters=formatters, index=False)
With floats and booleans:
print(left_justified2(my_df2))
Booleans Floats StringVals
False 1.0 Text string One
True 0.4 Text string Two
True 1.5 Text string Three
Note that this approach is a bit of a hack. Not only do you have to maintain the column names in separate lists, but you also have to guess the data widths. Perhaps someone with better Pandas-Fu can demonstrate how to parse the dataframe info and generate the formats automatically.
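One possible way to avoid the hard-coded column lists and guessed widths is to convert every value with str() and measure the result. This is only a sketch, not a tested replacement for the functions above: left_justified_auto is a made-up name, it assumes a non-empty dataframe, and it shows floats the way str() renders them rather than using pandas' own float formatting.

```python
import pandas as pd

def left_justified_auto(df):
    """Left-justify every column, whatever its dtype (hypothetical helper)."""
    formatters = {}
    for col in df.columns:
        # Widest value once rendered with str(), or the header if that is longer
        as_text = df[col].astype(str)
        width = max(int(as_text.str.len().max()), len(str(col)))
        # Bind the width per column through a default argument
        formatters[col] = lambda value, w=width: str(value).ljust(w)
    # justify='left' only affects the headers; the formatters pad the values
    return df.to_string(formatters=formatters, index=False, justify='left')

my_df3 = pd.DataFrame({'Booleans': [False, True, True],
                       'Floats': [1.0, 0.4, 1.5],
                       'StringVals': ["Text string One", "Text string Two", "Text string Three"]})
print(left_justified_auto(my_df3))
```

Because every cell in a column is padded to the same width, the exact leading whitespace may still differ between pandas versions, but the values within each column line up on the left.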