How to remove trailing whitespace of all files recursively

Posted 2019-01-16 00:46

Question:

How can you remove all of the trailing whitespace of an entire project? Starting at a root directory, and removing the trailing whitespace from all files in all folders.

Also, I want to be able to modify the files directly, and not just print everything to stdout.

Answer 1:

Here is an OS X >= 10.6 Snow Leopard solution.

It ignores .git and .svn folders and their contents, and it won't leave a backup file.

export LC_CTYPE=C
export LANG=C
find . -not \( -name .svn -prune -o -name .git -prune \) -type f -print0 | xargs -0 sed -i '' -E "s/[[:space:]]*$//"
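
Before running an in-place edit like the one above, it can help to preview which files would be touched. The following is a minimal sketch, not part of the original answer. It assumes a grep that supports -r, -I and --exclude-dir (GNU grep and recent BSD grep do), and the function name list_trailing_ws is hypothetical.

```shell
# List files under a directory (default: current directory) that contain
# at least one line ending in a space or tab.
list_trailing_ws() {
  # -r  recurse into subdirectories
  # -l  print only the names of matching files
  # -I  skip files that grep considers binary
  # -E  extended regular expressions
  grep -rlIE --exclude-dir=.git --exclude-dir=.svn '[[:blank:]]+$' "${1:-.}"
}
```

Calling list_trailing_ws with no argument scans the current directory; the command itself changes nothing on disk.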


Answer 2:

Use:

find . -type f -print0 | xargs -0 perl -pi.bak -e 's/ +$//'

If you don't want the ".bak" files generated:

find . -type f -print0 | xargs -0 perl -pi -e 's/ +$//'

As a zsh user, you can omit the call to find, and instead use:

perl -pi -e 's/ +$//' **/*

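Note: To avoid damaging the .git directory, try adding -not -iwholename '*.git*' to the find command. Be aware that this pattern skips every path containing ".git", including files such as .gitignore.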



Answer 3:

Two alternative approaches which also work with DOS newlines (CR/LF) and do a pretty good job at avoiding binary files:

Generic solution which checks that the MIME type starts with text/:

while IFS= read -r -d '' -u 9
do
    if [[ "$(file -bs --mime-type -- "$REPLY")" = text/* ]]
    then
        sed -i 's/[ \t]\+\(\r\?\)$/\1/' -- "$REPLY"
    else
        echo "Skipping $REPLY" >&2
    fi
done 9< <(find . -type f -print0)

Git repository-specific solution by Mat which uses the -I option of git grep to skip files which Git considers to be binary:

git grep -I --name-only -z -e '' | xargs -0 sed -i 's/[ \t]\+\(\r\?\)$/\1/'


Answer 4:

In Bash:

find dir -type f -exec sed -i 's/ *$//' '{}' ';'

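Note: If you're working in a Git repository, try adding: -not -iwholename '*.git*'. (A bare '.git' pattern never matches, because find tests the whole path, e.g. dir/.git/config.)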



Answer 5:

This worked for me in OSX 10.5 Leopard, which does not use GNU sed or xargs.

find dir -type f -print0 | xargs -0 sed -i.bak -E "s/[[:space:]]*$//"

Just be careful with this if you have files that need to be excluded (I did)!

You can use -prune or -not -path to ignore certain directories or files. For Python files in a git repository, you could use something like:

find dir -not -path '*/.git/*' -iname '*.py'


Answer 6:

Ack was made for this kind of task.

It works just like grep, but knows not to descend into places like .svn, .git, .cvs, etc.

ack --print0 -l '[ \t]+$' | xargs -0 -n1 perl -pi -e 's/[ \t]+$//'

Much easier than jumping through hoops with find/grep.

Ack is available via most package managers (as either ack or ack-grep).

It's just a Perl program, so it's also available in a single-file version that you can just download and run. See: Ack Install



Answer 7:

ex

Try using Ex editor (part of Vim):

$ ex +'bufdo!%s/\s\+$//e' -cxa **/*.*

Note: For recursion we use the recursive glob **/*.*, which matches only names that contain a dot. zsh supports it out of the box; in Bash 4 or later, enable it first with shopt -s globstar.

You may add the following function into your .bash_profile:

# Strip trailing whitespace.
# Usage: trim *.*
# See: https://stackoverflow.com/q/10711051/55075
trim() {
  # "$@" keeps file names containing spaces intact (unquoted $* would split them).
  ex +'bufdo!%s/\s\+$//e' -cxa "$@"
}

sed

For using sed, check: How to remove trailing whitespaces with sed?

find

Here is a script (e.g. remove_trail_spaces.sh) that removes trailing whitespace from files:

#!/bin/sh
# Script to remove trailing whitespace of all files recursively
# See: https://stackoverflow.com/questions/149057/how-to-remove-trailing-whitespace-of-all-files-recursively

case "$OSTYPE" in
  darwin*) # OSX 10.5 Leopard, which does not use GNU sed or xargs.
    find . -type f -not -iwholename '*.git*' -print0  | xargs -0 sed -i .bak -E "s/[[:space:]]*$//"
    find . -type f -name \*.bak -print0 | xargs -0 rm -v
    ;;
  *)
    find . -type f -not -iwholename '*.git*' -print0 | xargs -0 perl -pi -e 's/ +$//'
esac

Run this script from the directory you want to scan. On OS X it finishes by deleting all files ending in .bak, including any that existed before the script ran.

Or just:

find . -type f -name "*.java" -exec perl -p -i -e "s/[ \t]$//g" {} \;

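which is the way recommended by the Spring Framework Code Style. Note that [ \t]$ removes only a single trailing character per line on each run; use [ \t]+$ to strip a whole run of spaces and tabs at once.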



Answer 8:

I ended up not using find and not creating backup files.

sed -i '' 's/[[:space:]]*$//g' **/*.*

Depending on the depth of the file tree, this (shorter version) may be sufficient for your needs.

Note: this also processes binary files, which it may corrupt.



Answer 9:

Instead of excluding files, here is a variation of the above that explicitly whitelists the files you want to strip, based on file extension. Feel free to season to taste:

find . \( -name '*.rb' -or -name '*.html' -or -name '*.js' -or -name '*.coffee' -or \
-name '*.css' -or -name '*.scss' -or -name '*.erb' -or -name '*.yml' -or -name '*.ru' \) \
-print0 | xargs -0 sed -i '' -E "s/[[:space:]]*$//"

The patterns are quoted so that the shell passes them to find unchanged instead of expanding them against the current directory.


Answer 10:

I ended up running this, which is a mix of pojo's and adams' versions.

It cleans both trailing whitespace and another form of trailing whitespace, the carriage return:

find . -not \( -name .svn -prune -o -name .git -prune \) -type f \
  -exec sed -i -E 's/[[:space:]]+$//' {} \; \
  -exec sed -i 's/\r$//' {} \;

It won't touch the .git folder if there is one.

Edit: Made it a bit safer after the comment, so that it no longer descends into directories named ".git" or ".svn". But beware, it will touch binary files if you've got some. Use -iname "*.py" -or -iname "*.php" (wrapped in \( ... \)) after -type f if you only want it to touch e.g. .py and .php files.

Update 2: It now replaces all kinds of spaces at end of line (which means tabs as well)
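
One detail worth spelling out: sed processes each line with its trailing newline already removed, so a pattern has to target the carriage return itself rather than the \r\n pair. The following is a minimal sketch of stripping spaces, tabs and carriage returns in a single pass. It is a stdin-to-stdout filter, the name strip_line_ends is hypothetical, and it builds the tab and CR characters with printf so it does not depend on GNU sed's \t and \r escapes.

```shell
# Filter: remove any run of spaces, tabs and carriage returns at the end
# of every line read from stdin, writing the result to stdout.
strip_line_ends() {
  cr=$(printf '\r')    # literal carriage return
  tab=$(printf '\t')   # literal tab
  sed "s/[ ${tab}${cr}]*\$//"
}
```

For example, strip_line_ends < input.txt > output.txt writes a cleaned copy with Unix line endings and leaves the original file unmodified.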



Answer 11:

This works well. Add or remove --include options for specific file types:

grep -rlE '[[:blank:]]$' --include='*.c' . | xargs sed -i 's/\s\+$//'


Answer 12:

Ruby:

irb
Dir['lib/**/*.rb'].each{|f| x = File.read(f); File.write(f, x.gsub(/[ \t]+$/,"")) }


Answer 13:

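1) Many other answers use -E. I am not sure why, as in GNU sed that was an undocumented BSD compatibility option; -r is the documented GNU spelling. (BSD/macOS sed, on the other hand, uses -E.)

2) Other answers use -i ''. With GNU sed that should be just -i (or -i'' if preferred), because GNU sed expects the suffix attached directly to -i. BSD/macOS sed is the opposite and requires the separate '' argument.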

3) Git specific solution:

git config --global alias.check-whitespace \
'!git diff-tree --check $(git hash-object -t tree /dev/null) HEAD'

git check-whitespace | grep trailing | cut -d: -f1 | sort -u | tr '\n' '\0' | xargs -0 sed --in-place -r -e 's/[ \t]+$//'

The first one registers a git alias check-whitespace which lists the files with trailing whitespace. The leading ! makes Git run the alias as a shell command, which the $(...) substitution needs. The second one runs sed on each listed file once.

I only use [ \t] rather than [[:space:]] as I don't typically see vertical tabs, form feeds and non-breakable spaces. Your mileage may vary.
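
Given the -i differences between GNU and BSD sed noted in point 2, one way to sidestep the option entirely is to write to a temporary file and copy the result back. This is a minimal sketch under that assumption, not something taken from the answers here; the function name strip_in_place is hypothetical, and it only handles spaces and tabs.

```shell
# Strip trailing spaces and tabs from each file named on the command line,
# without using sed -i.
strip_in_place() {
  tab=$(printf '\t')   # literal tab, so no \t escape support is needed
  for f in "$@"; do
    tmp=$(mktemp) || return 1
    # Write the cleaned copy to a temp file, then copy its contents back
    # over the original so the original's permissions are kept.
    if sed "s/[ ${tab}]*\$//" "$f" > "$tmp"; then
      cat "$tmp" > "$f"
    fi
    rm -f "$tmp"
  done
}
```

A call such as strip_in_place src/*.c would then behave the same way with either sed implementation.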



Answer 14:

This is what works for me (Mac OS X 10.8, GNU sed installed by Homebrew):

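find . -path ./vendor -prune -o \
  \( -name '*.java' -o -name '*.xml' -o -name '*.css' \) \
  -exec gsed -i -E 's/\t/    /g' {} \; \
  -exec gsed -i -E 's/[[:space:]]*$//' {} \; \
  -exec gsed -i -E 's/\r$//' {} \;

This removes trailing spaces, replaces tabs with spaces, and converts Windows CRLF line endings to Unix \n.

Note the g flag on the tab substitution. Without it, only the first tab on each line is replaced per run, which is most likely why I originally had to run this 3-4 times before all files were fixed. Similarly, sed never sees the newline at the end of a line, so the carriage return has to be matched as \r$ rather than \r\n.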