Find extra space / new line after a closing ?>

Posted 2020-02-10 17:36

I have a space or newline after a closing ?> (PHP tag) that is breaking my application.

How can I find it easily? The app has thousands of files and some 100,000 lines of code.

Ideally I'm after a regex combined with find and grep that I can run on a Unix box.

9 answers
Summer. ? 凉城
#2 · 2020-02-10 17:59

This is possible with regular grep (GNU grep, for the -P option):

grep -RlPz '\?>[\s]+$' .

This searches every file under the current directory and lists those that have a ?> followed by whitespace at the end of the file.

  • -P interprets the pattern as a Perl-compatible regular expression (PCRE).
  • -z treats the input as NUL-separated rather than newline-separated, so an ordinary text file is read as one long line. This is part of what makes it work.
  • [\s]+ matches at least one whitespace character, including newlines.

If you want to match PHP files only:

find . -name '*.php' -print0 | xargs -0 grep -Pzl '\?>[\s]+$'

To search for whitespace at the beginning of the file, before the opening <? tag:

find . -name '*.php' -print0 | xargs -0 grep -Pzl '^[\s]+<\?'
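
Where grep has no -P support (the stock macOS grep, for instance), a rougher check for leading whitespace is still possible with head. The sketch below is not equivalent to the pattern above: it only tests whether the very first byte of a file is whitespace, and starts_with_whitespace is a made-up helper name.

```shell
# Hypothetical helper: succeeds when the first byte of FILE is whitespace.
starts_with_whitespace() {
    # The sentinel "x" stops command substitution from stripping a newline
    # when that newline is the byte we are trying to inspect.
    first=$(head -c 1 "$1"; printf x)
    first=${first%x}
    case $first in
        [[:space:]]) return 0 ;;   # space, tab, newline, ...
        *) return 1 ;;             # anything else, including an empty file
    esac
}

# Example use (lists matching PHP files under the current directory):
#   find . -name '*.php' | while IFS= read -r f; do
#       starts_with_whitespace "$f" && printf '%s\n' "$f"
#   done
```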
乱世女痞
#3 · 2020-02-10 17:59

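This works on my box:

find . -name '*.php' | while IFS= read -r i; do (printf '%s: ' "$i"; tail -c 3 "$i") | grep -v '[?]>'; done

The idea is to take just the last 3 characters of each file with tail, then discard the files where those still contain ?> (typically '?', '>' and a newline). If there is a space before the newline, or an extra newline, the '?' falls outside the last 3 characters and the file is printed. Two caveats: files that have no closing ?> at all are printed too, and a single trailing space or tab with no newline after it is missed, because the '?' is still within the last 3 characters.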
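
A slightly stricter take on the same tail idea might look like the sketch below. It is only an illustration under a few assumptions: the function name is made up, only the last 64 bytes are inspected, and a lone "\n" after the tag is treated as harmless because PHP itself swallows a single newline directly after ?>. Files using \r\n line endings would need the comparison adjusted.

```shell
# Hypothetical helper: succeeds when FILE ends with "?>" followed by
# whitespace other than a single newline. Only the last 64 bytes are read.
has_whitespace_after_close_tag() {
    # The sentinel "x" keeps command substitution from eating trailing newlines.
    tail_bytes=$(tail -c 64 "$1"; printf x)
    tail_bytes=${tail_bytes%x}
    # trailing = the run of whitespace at the very end (possibly empty).
    trailing=${tail_bytes##*[![:space:]]}
    # body = everything before that run.
    body=${tail_bytes%"$trailing"}
    case $body in
        *'?>') ;;         # the last non-whitespace text is the closing tag
        *) return 1 ;;    # file does not end with a closing tag: skip it
    esac
    nl='
'
    # Flag the file unless the trailing run is empty or exactly one newline.
    [ -n "$trailing" ] && [ "$trailing" != "$nl" ]
}

# Example use:
#   find . -name '*.php' | while IFS= read -r f; do
#       has_whitespace_after_close_tag "$f" && printf '%s\n' "$f"
#   done
```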

萌系小妹纸
#4 · 2020-02-10 18:05

The problem here is that plain grep matches one line at a time. So I would install pcregrep and try the following command:

pcregrep -rMl '\?>[\s\n]+\z' *

This searches all files in the folder and its subfolders (the -r part) using a PCRE multiline match (the -M part), and lists only their filenames (the -l part).

The pattern matches ?> followed by one or more whitespace characters (\s already includes newlines, so [\s\n] is equivalent to \s), followed by the end of the file, \z. When I ran this on my folder, though, I found that many of the PHP files do in fact end with a single newline. To skip those, change the regex to '\?>[\s\n]+\n\z', which only matches files that have whitespace over and above the single terminating \n.

Lastly, you can always use od -c filename to print an unambiguous representation of the file if you need to check the exact character sequence at its end.
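
For a long file, dumping everything with od -c is noisy, so one option is to feed it only the tail. The helper below is a small sketch; its name and the default of 8 bytes are arbitrary choices.

```shell
# Hypothetical helper: dump the last N bytes of FILE (default 8) with od -c,
# so spaces, tabs (\t) and newlines (\n) show up explicitly.
show_file_ending() {
    tail -c "${2:-8}" "$1" | od -c
}

# Example use (index.php is a placeholder file name):
#   show_file_ending index.php
#   show_file_ending index.php 16
```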

混吃等死
#5 · 2020-02-10 18:05

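Use Perl:

perl -0777 -i -pe 's/\s*$//s' *.php
  • -0777 slurps the whole file (-0 will work too)
  • -i edits in place, so each file is replaced with the result
  • -p loops over the input and prints it after running the expression
  • -e supplies the Perl expression

s/\s*$//s - with the whole file read as a single string, this replaces any whitespace at the end of the file with nothing. The /s flag only changes how . matches, so it has no effect on this pattern. Note that this fixes the files rather than just finding them: it also removes the final newline, and it does so for every file, whether or not it ends with ?>.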

▲ chillily
#6 · 2020-02-10 18:11

This worked for me to find whitespace after the closing tag in PHP files:

find -name '*.php' | xargs grep -Pz '\?>[\s]+$' -l
一夜七次
#7 · 2020-02-10 18:17

How about a plain grep?

grep '?> ' *.php

Of course, it may not be a space; it could be a line break or a tab, so you may want to try other characters.
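
To cover spaces and tabs in one go, a bracket class can stand in for the literal space. This is only a sketch with a made-up function name. Being line-based, it cannot see an extra blank line after the tag; the multi-line approaches in the other answers are needed for that.

```shell
# Hypothetical helper: print lines (with line numbers) where "?>" is followed
# by one or more spaces or tabs up to the end of the line.
# In a basic regular expression an unescaped "?" is an ordinary character.
find_blank_after_close_tag() {
    grep -n '?>[[:blank:]]\{1,\}$' -- "$@"
}

# Example use:
#   find_blank_after_close_tag *.php
```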
