Python textwrap Library - How to Preserve Line Bre

Posted 2019-03-15 13:31

Question:

When using Python's textwrap library, how can I turn this:

short line,

long line xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

into this:

short line,

long line xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxx

I tried:

w = textwrap.TextWrapper(width=90,break_long_words=False)
body = '\n'.join(w.wrap(body))

But I get:

short line, long line xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

(spacing not exact in my examples)

Answer 1:

Try:

w = textwrap.TextWrapper(width=90,break_long_words=False,replace_whitespace=False)

That seemed to fix the problem for me.

I worked that out from what I read here (I've never used textwrap before).
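
One caveat worth checking: with replace_whitespace=False the original newlines stay in the text, but the wrapper still counts them as ordinary whitespace, so a newline can end up in the middle of a wrapped line. The textwrap documentation suggests splitting the text into paragraphs first for this reason. Here is a small sketch of the difference; the sample string and the narrow width are made up for illustration:

```python
import textwrap

# Sample text and width are assumptions chosen to make the effect visible.
body = "short line,\nlong line aaa bbb ccc ddd eee"
w = textwrap.TextWrapper(width=20, break_long_words=False,
                         replace_whitespace=False)

# Wrapping the whole string: the '\n' survives, but inside a wrapped line.
whole = w.wrap(body)

# Wrapping each original line separately keeps every output line clean.
per_line = [piece for line in body.splitlines() for piece in w.wrap(line)]

print(whole)
print(per_line)
```

If the first list looks odd in your output, wrapping line by line (as in the second list) is probably the safer route.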



Answer 2:

body = '\n'.join(['\n'.join(textwrap.wrap(line, 90,
                 break_long_words=False, replace_whitespace=False))
                 for line in body.splitlines() if line.strip() != ''])
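
Note that the `if line.strip() != ''` filter drops blank lines, so paragraph gaps disappear. If the blank lines should survive, a variant along these lines might work. This is only a sketch: `wrap_keep_blank_lines` is a hypothetical helper name, not part of textwrap.

```python
import textwrap

def wrap_keep_blank_lines(body, width=90):
    """Wrap each line separately, passing blank lines through unchanged."""
    wrapped = []
    for line in body.splitlines():
        if line.strip() == '':
            # Keep the blank line instead of filtering it out.
            wrapped.append('')
        else:
            wrapped.extend(textwrap.wrap(line, width,
                                         break_long_words=False,
                                         replace_whitespace=False))
    return '\n'.join(wrapped)

# A narrow width is used here only to keep the example short.
print(wrap_keep_blank_lines("short,\n\naaa bbb ccc ddd", width=8))
```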


Answer 3:

How about wrapping only the lines longer than 90 characters?

new_body = ""
lines = body.split("\n")

for line in lines:
    if len(line) > 90:
        w = textwrap.TextWrapper(width=90, break_long_words=False)
        line = '\n'.join(w.wrap(line))

    new_body += line + "\n"


Answer 4:

It looks like it doesn't support that. This code will extend it to do what I need though:

http://code.activestate.com/recipes/358228/



Answer 5:

lines = text.split("\n")
# wrap each line on its own; "pieces" avoids shadowing the built-in list
wrapped = (textwrap.TextWrapper(width=90,break_long_words=False).wrap(line) for line in lines)
body  = "\n".join("\n".join(pieces) for pieces in wrapped)


Answer 6:

I had a similar problem formatting dynamically generated docstrings. I wanted to preserve the newlines put in place by hand and split any lines over a certain length. Reworking the answer by @far a bit, this solution worked for me. I only include it here for posterity:

import textwrap

wrapArgs = {'width': 90, 'break_long_words': True, 'replace_whitespace': False}
fold = lambda line, wrapArgs: textwrap.fill(line, **wrapArgs)
body = '\n'.join([fold(line, wrapArgs) for line in body.splitlines()])


Answer 7:

TextWrapper is not designed to handle text that already has newlines in it.

There are two things you may want to do when your document already has newlines:

1) Keep old newlines, and only wrap lines that are longer than the limit.

You can subclass TextWrapper as follows:

class DocumentWrapper(textwrap.TextWrapper):

    def wrap(self, text):
        split_text = text.split('\n')
        lines = [line for para in split_text for line in textwrap.TextWrapper.wrap(self, para)]
        return lines

Then use it the same way as textwrap:

d = DocumentWrapper(width=90)
wrapped_str = d.fill(original_str)

Gives you:

short line,
long line xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx

2) Remove the old newlines and wrap everything.

# str.replace returns a new string, so the result must be assigned back
original_str = original_str.replace('\n', ' ')
wrapped_str = textwrap.fill(original_str, width=90)

Gives you

short line,  long line xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

(On its own, TextWrapper does neither of these cleanly: by default it turns each existing newline into a space and wraps the result, which can lead to oddly formatted output.)
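
To see what the default handling of newlines looks like, here is a minimal sketch using a shortened version of the sample text (the string itself is an assumption for illustration). Each newline becomes one space, so a blank line between paragraphs turns into a doubled space:

```python
import textwrap

original_str = "short line,\n\nlong line"

# Default settings: every whitespace character, including '\n', becomes a space.
default_result = textwrap.fill(original_str, width=90)
print(repr(default_result))

# Collapsing runs of whitespace first avoids the doubled space.
collapsed = textwrap.fill(" ".join(original_str.split()), width=90)
print(repr(collapsed))
```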



Answer 8:

Here is a little module that can wrap text, break lines, handle extra indents (e.g. a bulleted list), and replace characters/words with markdown!

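import textwrap

class TextWrap_Test:
    def __init__(self):
        # Words to surround with $...$ once the text has been wrapped.
        self.Replace={'Sphagnum':'$Sphagnum$','Equisetum':'$Equisetum$','Carex':'$Carex$',
                      'Salix':'$Salix$','Eriophorum':'$Eriophorum$'}
    def Wrap(self,Text_to_format,Width):
        Text = []
        for line in Text_to_format.splitlines():
            if line.startswith('-'):
                # Top-level bullet: indent continuation lines under the text.
                wrapped_line = textwrap.fill(line,Width,subsequent_indent='  ')
            elif line.startswith('*'):
                # Nested bullet: indent the bullet and its continuation lines.
                wrapped_line = textwrap.fill(line,Width,initial_indent='  ',subsequent_indent='    ')
            else:
                # Any other line (including an empty one) gets no extra indent.
                wrapped_line = textwrap.fill(line,Width)
            Text.append(wrapped_line)
        Text = '\n\n'.join(Text)

        # Substitute after wrapping so the added $ signs do not affect line widths.
        for rep in self.Replace:
            Text = Text.replace(rep,self.Replace[rep])
        return Text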


Par1 = "- Fish Island is a low center polygonal peatland on the transition"+\
" between the Mackenzie River Delta and the Tuktoyaktuk Coastal Plain.\n* It"+\
" is underlain by continuous permafrost, peat deposits exceede the annual"+\
" thaw depth.\n* Sphagnum dominates the polygon centers with a caonpy of Equisetum and sparse"+\
" Carex.  Dwarf Salix grows allong the polygon rims.  Eriophorum and carex fill collapsed ice wedges."
Text_W = 80  # assumed wrap width; the original post does not show how Text_W is defined
TW=TextWrap_Test()
print(TW.Wrap(Par1,Text_W))

Will output:

  • Fish Island is a low center polygonal peatland on the transition between the Mackenzie River Delta and the Tuktoyaktuk Coastal Plain.

    • It is underlain by continuous permafrost, peat deposits exceede the annual thaw depth.

    • $Sphagnum$ dominates the polygon centers with a caonpy of $Equisetum$ and sparse $Carex$. Dwarf $Salix$ grows allong the polygon rims. $Eriophorum$ and carex fill collapsed ice wedges.

Text between the $ signs is rendered in italics if you are working in matplotlib, for instance, and the $ signs don't count toward the line width because they are added after the wrapping is done!
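
The ordering is the key point here. A stripped-down sketch of the wrap-then-substitute idea, with a made-up sentence and width (neither comes from the module above), shows that the line breaks are chosen from the plain text only:

```python
import textwrap

width = 30  # assumed width for illustration
text = "Sphagnum dominates the polygon centers with Equisetum."
replace = {'Sphagnum': '$Sphagnum$', 'Equisetum': '$Equisetum$'}

# Wrap first: line breaks are decided before any markers exist.
wrapped_plain = textwrap.wrap(text, width)

# Add the markers afterwards. A line may now exceed `width` in raw
# characters, but the text without the $ signs still fits.
marked = []
for line in wrapped_plain:
    for word, repl in replace.items():
        line = line.replace(word, repl)
    marked.append(line)

print('\n'.join(marked))
```

Doing the replacement before wrapping would instead let the $ signs push words onto the next line earlier than necessary.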

So if you did:

import matplotlib.pyplot as plt

fig,ax = plt.subplots(1,1,figsize = (10,7))
ax.text(.05,.9,TW.Wrap(Par1,Text_W),fontsize = 18,verticalalignment='top')

ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)

You'd get the wrapped text drawn inside the figure, with the names between $ signs in italics.