How to replace multiple spaces with a single space

Posted 2020-02-11 08:27

Question:

I'd like to use bash to replace multiple adjacent spaces in a string with a single space. Example:

Original string:

"too         many       spaces."

Transformed string:

"too many spaces."

I've tried things like "${str//*( )/.}" or awk '{gsub(/[:blank:]/," ")}1' but I can't get it right.

Note: I was able to make it work with <CMD_THAT_GENERATES_THE_INPUT_STRING> | perl -lpe 's/\s+/ /g', but that requires perl. I'd like to use bash's built-in syntax instead of calling an external program, if that is possible.

Answer 1:

Using tr:

$ echo "too         many       spaces." | tr -s ' '
too many spaces.

man tr:

-s, --squeeze-repeats
       replace each sequence of a repeated character that is listed in
       the last specified SET, with a single occurrence of that character
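
If the input may also contain tabs, tr can translate and squeeze in one pass. The following is a minimal sketch, assuming a tr that pads the shorter second set (GNU and BSD tr both do); squeeze_ws is a hypothetical helper name, not part of the answer above.

```shell
#!/usr/bin/env bash

# squeeze_ws (hypothetical name): map every blank (space or tab) to a space,
# then squeeze runs of spaces into one. Reads stdin, writes stdout.
squeeze_ws() {
    tr -s '[:blank:]' ' '
}

printf 'too \t\t many\t spaces.\n' | squeeze_ws
```

With two sets, -s squeezes the characters of the second set after translation, so mixed runs of spaces and tabs end up as a single space.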

Edit: Oh, by the way, an unquoted expansion already collapses the spaces through word splitting:

$ s="foo      bar"
$ echo $s
foo bar
$ echo "$s"
foo      bar

Edit 2: On the performance:

$ shopt -s extglob
$ s=$(for i in {1..100} ; do echo -n "word   " ; done) # 100 times: word   word   word...
$ time echo "${s//+([[:blank:]])/ }" > /dev/null

real    0m7.296s
user    0m7.292s
sys     0m0.000s
$ time echo "$s" | tr -s ' ' >/dev/null

real    0m0.002s
user    0m0.000s
sys     0m0.000s

Over 7 seconds?! How is that even possible? Granted, this mini laptop is from 2014, but still. Then again, matching only spaces instead of [[:blank:]] is noticeably faster:

$ time echo "${s//+( )/ }" > /dev/null

real    0m1.198s
user    0m1.192s
sys     0m0.000s
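
Since the extglob substitutions are the slow part in the timings above, a loop with a plain (non-extglob) pattern may be worth trying when the work has to stay inside bash. This is only a sketch and has not been benchmarked here; it handles spaces only, not tabs.

```shell
#!/usr/bin/env bash

s="too         many       spaces."

# Replace every pair of spaces with one space until no pair is left.
# Each pass roughly halves the length of every run.
while [[ $s == *"  "* ]]; do
    s=${s//"  "/ }
done

printf '%s\n' "$s"
```

No shopt setting is needed, because the pattern is a literal two-space string.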


Answer 2:

Here is a way to do this using pure bash and extglob:

s="too         many       spaces."

shopt -s extglob
echo "${s//+([[:blank:]])/ }"

too many spaces.
  • Bracket expression [[:blank:]] matches a space or tab character
  • +([[:blank:]]) matches one or more of the bracket expression (requires extglob)
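
The same substitution can be wrapped in a function for reuse. A minimal sketch, assuming extglob is enabled at the top of the script; squeeze_blanks is a hypothetical name chosen for illustration.

```shell
#!/usr/bin/env bash
shopt -s extglob   # required for the +( ... ) pattern below

# squeeze_blanks (hypothetical name): print the first argument with each run
# of spaces and/or tabs replaced by a single space.
squeeze_blanks() {
    local s=$1
    s=${s//+([[:blank:]])/ }
    printf '%s\n' "$s"
}

squeeze_blanks $'too \t\t  many       spaces.'
```

Leading or trailing blanks are reduced to one space as well, not removed.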


Answer 3:

Another simple sed expression using BRE is:

sed 's/[ ][ ]*/ /g'

For example:

$ echo "too         many       spaces." | sed 's/[ ][ ]*/ /g'
too many spaces.

There are a number of ways to skin the cat.

If the enclosed whitespace could consist of mixed spaces and tabs, then you could use the following (note that \s is a GNU sed extension):

sed 's/\s\s*/ /g'
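
Where \s is not available, a POSIX bracket expression can stand in for it. A minimal sketch of the same BRE using [[:space:]]; the sample input is made up for illustration.

```shell
#!/usr/bin/env bash

# One whitespace character followed by zero or more: the BRE way to say "one or more".
printf 'too \t\t many\t   spaces.\n' | sed 's/[[:space:]][[:space:]]*/ /g'
```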

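And if you simply want to have bash word-splitting handle it, just echo your string without quotes (keep in mind that an unquoted expansion also undergoes pathname expansion, so characters such as * may be expanded), e.g.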

$ echo "too         many       spaces." | while read line; do echo $line; done
too many spaces.

Continuing with that same thought, if your string with spaces is already stored in a variable, you can simply use echo unquoted within command substitution to have bash remove the additional whitespace for you, e.g.

$ foo="too         many       spaces."; bar=$(echo $foo); echo "$bar"
too many spaces.