How do I trim whitespace?

Posted 2019-01-01 06:13

Question:

Is there a Python function that will trim whitespace (spaces and tabs) from a string?

Example: "\t example string\t" should become "example string"

Answer 1:

Whitespace on both sides:

s = "  \t a string example\t  "
s = s.strip()

Whitespace on the right side:

s = s.rstrip()

Whitespace on the left side:

s = s.lstrip()

As thedz points out, you can pass an argument to any of these functions to strip arbitrary characters, like this:

s = s.strip(' \t\n\r')

This will strip any space, \t, \n, or \r characters from the left-hand side, right-hand side, or both sides of the string, depending on which function you call.
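One detail worth illustrating is that the argument is treated as a set of characters, not as a prefix or suffix. The following is a minimal sketch; the sample strings are made up for the demonstration.

```python
# The argument to strip() is a set of characters, not a prefix or suffix.
s = "xxyhello worldyxx"
trimmed = s.strip("xy")
print(trimmed)  # hello world

# The order of characters in the argument does not matter.
assert s.strip("yx") == trimmed

# Characters in the middle of the string are never touched.
padded = "\t  a\tb  \n"
print(repr(padded.strip(" \t\n\r")))  # 'a\tb'
```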

The examples above only remove characters from the left-hand and right-hand sides of strings. If you also want to remove characters from the middle of a string, try re.sub:

import re
print(re.sub(r'\s+', '', s))

That should print out:

astringexample
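A small pitfall with this kind of pattern: inside square brackets, + is a literal plus sign rather than a quantifier, so a character class such as [\s+] also deletes "+" characters. The sketch below compares the two forms; the "1 + 2" input is just an illustrative value.

```python
import re

s = "  \t a string example\t  "
no_ws = re.sub(r"\s+", "", s)
print(no_ws)  # astringexample

# Inside a character class, + is literal, so [\s+] removes plus signs too.
with_class = re.sub(r"[\s+]", "", "1 + 2")
without_class = re.sub(r"\s+", "", "1 + 2")
print(with_class)     # 12
print(without_class)  # 1+2
```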


Answer 2:

Python's trim method is called strip:

str.strip()   # trim
str.lstrip()  # ltrim
str.rstrip()  # rtrim


Answer 3:

For leading and trailing whitespace:

s = '   foo    \t   '
print(s.strip())  # prints "foo"

Otherwise, a regular expression works:

import re
pat = re.compile(r'\s+')
s = '  \t  foo   \t   bar \t  '
print(pat.sub('', s))  # prints "foobar"


Answer 4:

You can also use a very simple, basic function, str.replace(). It works on spaces and on tabs, as long as you pass the character you want removed:

>>> whitespaces = "   abcd ef gh ijkl       "
>>> tabs = "\tabcde\tfgh\tijkl"

>>> print(whitespaces.replace(" ", ""))
abcdefghijkl
>>> print(tabs.replace("\t", ""))
abcdefghijkl

Simple and easy.



Answer 5:

# How to trim a multi-line string or a file

s = """ line one
\tline two\t
line three """

# Line 1 starts with a space, line 2 starts and ends with a tab, line 3 ends with a space.

s1 = s.splitlines()
print(s1)
[' line one', '\tline two\t', 'line three ']

print([i.strip() for i in s1])
['line one', 'line two', 'line three']




# More details:

# We could also have used a for loop from the beginning
# (process() stands for whatever you do with each line):
for line in s.splitlines():
    line = line.strip()
    process(line)

# We could also be reading a file line by line, e.g. my_file = open(filename), or with open(filename) as my_file:
for line in my_file:
    line = line.strip()
    process(line)

# Moot point: note that splitlines() removed the newline characters. We can keep them by passing True,
# although strip() will then remove them anyway:
s2 = s.splitlines(True)
print(s2)
[' line one\n', '\tline two\t\n', 'line three ']
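When reading a file, you sometimes want to drop only the line ending and keep the indentation. Here is a minimal sketch of the difference, assuming an in-memory io.StringIO object as a stand-in for a real open file:

```python
import io

# io.StringIO stands in for an open file here (an assumption for the demo).
my_file = io.StringIO(" line one\n\tline two\t\nline three \n")

stripped = []
newline_only = []
for line in my_file:
    stripped.append(line.strip())           # drops all surrounding whitespace
    newline_only.append(line.rstrip("\n"))  # drops only the trailing newline

print(stripped)      # ['line one', 'line two', 'line three']
print(newline_only)  # [' line one', '\tline two\t', 'line three ']
```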


Answer 6:

No one has posted these regex solutions yet.

Matching:

>>> import re
>>> p = re.compile(r'\s*(.*\S)?\s*')

>>> m = p.match('  \t blah ')
>>> m.group(1)
'blah'

>>> m = p.match('  \tbl ah  \t ')
>>> m.group(1)
'bl ah'

>>> m = p.match('  \t  ')
>>> print(m.group(1))
None

Searching (you have to handle the "only spaces" input case differently):

>>> p1 = re.compile(r'\S.*\S')

>>> m = p1.search('  \tblah  \t ')
>>> m.group()
'blah'

>>> m = p1.search('  \tbl ah  \t ')
>>> m.group()
'bl ah'

>>> m = p1.search('  \t  ')
>>> m.group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

If you use re.sub, you may remove inner whitespace, which could be undesirable.
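The searching approach can be wrapped so that the "only spaces" case returns an empty string instead of raising. The helper below is a sketch, and regex_trim is a hypothetical name. It also makes the second \S optional, because a pattern of the form \S.*\S needs at least two non-space characters and so finds nothing in an input like " a ".

```python
import re

# \S, optionally followed by anything that ends in another \S.
# re.DOTALL lets "." cross newlines so multi-line input is kept whole.
_CORE = re.compile(r"\S(?:.*\S)?", re.DOTALL)

def regex_trim(text):
    """Return text without leading/trailing whitespace ('' if all blank)."""
    m = _CORE.search(text)
    return m.group() if m else ""

print(repr(regex_trim("  \tbl ah  \t ")))  # 'bl ah'
print(repr(regex_trim(" a ")))             # 'a'
print(repr(regex_trim("  \t  ")))          # ''
```

For plain trimming, str.strip() remains the simpler choice; the regex form is mainly useful when the trim is one step in a larger pattern.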



Answer 7:

Whitespace includes spaces, tabs, and CRLF. An elegant one-liner string method we can use is translate (Python 3 form shown; in Python 2 the call was translate(None, chars)):

' hello apple'.translate(str.maketrans('', '', ' \n\t\r'))

Or, if you want to be thorough:

import string
' hello  apple'.translate(str.maketrans('', '', string.whitespace))


Answer 8:

(re.sub(' +', ' ', my_str.replace('\n', ' '))).strip()

This removes the unwanted spaces and newline characters: each newline becomes a space, runs of spaces collapse to one, and the ends are stripped. For example:

import re
my_str = '   a     b \n c   '
formatted_str = re.sub(' +', ' ', my_str.replace('\n', ' ')).strip()

This changes '   a     b \n c   ' to 'a b c'.
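The expression above only handles spaces and newlines. If tabs or other whitespace may appear, one possible generalization is to collapse any whitespace run with \s+. The sample string below is an assumption chosen for the demonstration.

```python
import re

my_str = "   a     b \n c \t d  "

# Collapse every run of whitespace (spaces, tabs, newlines) to one space,
# then trim the ends.
collapsed = re.sub(r"\s+", " ", my_str).strip()
print(repr(collapsed))  # 'a b c d'

# split() with no arguments splits on whitespace runs, so this is equivalent.
assert collapsed == " ".join(my_str.split())
```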



Answer 9:

    something = "\t  please_     \t remove_  all_    \n\n\n\nwhitespaces\n\t  "

    something = "".join(something.split())

output: please_remove_all_whitespaces



Answer 10:

Try translate:

>>> import string
>>> print('\t\r\n  hello \r\n world \t\r\n')

  hello 
 world  
>>> tr = str.maketrans(string.whitespace, ' ' * len(string.whitespace))
>>> '\t\r\n  hello \r\n world \t\r\n'.translate(tr)
'     hello    world    '
>>> '\t\r\n  hello \r\n world \t\r\n'.translate(tr).replace(' ', '')
'helloworld'


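Answer 11:

If you are using Python 3, you can pass sep="" to print. That suppresses the space print normally inserts between its arguments.

EXAMPLE:

txt = "potatoes"
print("I love ", txt, "", sep="")

This will print: I love potatoes

Instead of: I love  potatoes (with a doubled space in the middle and a trailing space).

Note that sep only controls what goes between the arguments; it does not remove whitespace such as \t that is already inside a string. For that, use strip or replace.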
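To see exactly what sep changes, the output can be captured and inspected with repr. This sketch writes to an io.StringIO buffer instead of the terminal, which is an assumption made only so the result can be examined.

```python
import io

buf = io.StringIO()
txt = "potatoes"

print("I love ", txt, "", file=buf)          # default sep=" "
print("I love ", txt, "", sep="", file=buf)  # no separator inserted

lines = buf.getvalue().splitlines()
print(lines)  # ['I love  potatoes ', 'I love potatoes']
```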



Answer 12:

I generally use the following method:

>>> import re
>>> myStr = "Hi\n Stack Over \r flow!"
>>> charList = [r"\n", r"\r", r"\t"]
>>> for i in charList:
...     myStr = re.sub(i, "", myStr)
...
>>> myStr
'Hi Stack Over  flow!'

Note: This only removes "\n", "\r" and "\t". It does not remove extra spaces.
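The loop can also be collapsed into a single substitution with a character class. This is a sketch of that alternative rather than the answer's own method.

```python
import re

my_str = "Hi\n Stack Over \r flow!"

# One pass: delete every newline, carriage return, and tab.
cleaned = re.sub(r"[\n\r\t]", "", my_str)
print(repr(cleaned))  # 'Hi Stack Over  flow!'
```

As with the loop, ordinary spaces are left alone, which is why two spaces remain before "flow!".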



Answer 13:

To remove whitespace from the middle of a string (note that this example is Perl, not Python):

$p = "ATGCGAC ACGATCGACC";
$p =~ s/\s//g;
print $p;

output: ATGCGACACGATCGACC
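Since the question asks about Python, here is a rough Python equivalent of the substitution above, using the same sample string:

```python
import re

p = "ATGCGAC ACGATCGACC"
# Same idea as the s/\s//g substitution: delete every whitespace character.
p = re.sub(r"\s", "", p)
print(p)  # ATGCGACACGATCGACC
```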



Answer 14:

This will remove all whitespace and newlines from both the beginning and end of a string:

>>> import re
>>> s = "  \n\t  \n   some \n text \n     "
>>> re.sub(r"^\s+|\s+$", "", s)
'some \n text'
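For this input the anchored pattern gives the same result as s.strip(). Where the regex form becomes useful is with re.MULTILINE, which trims every line in one call. The following is a sketch, and the multi-line sample text is an assumption.

```python
import re

s = "  \n\t  \n   some \n text \n     "
# Without flags, ^ and $ anchor to the ends of the whole string.
whole = re.sub(r"^\s+|\s+$", "", s)
print(repr(whole))  # 'some \n text'
assert whole == s.strip()

# With re.MULTILINE, ^ and $ also match at line boundaries,
# so leading and trailing spaces or tabs are removed from every line.
text = " line one\n\tline two\t\nline three "
per_line = re.sub(r"^[ \t]+|[ \t]+$", "", text, flags=re.MULTILINE)
print(repr(per_line))  # 'line one\nline two\nline three'
```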