Bash and filenames with spaces

The following is a simple Bash command line:

grep -li 'regex' "filename with spaces" "filename"

No problems. Also the following works just fine:

grep -li 'regex' $(<listOfFiles.txt)

where listOfFiles.txt contains a list of filenames to be grepped, one filename per line.

The problem occurs when listOfFiles.txt contains filenames with embedded spaces. In all cases I've tried (see below), Bash splits the filenames at the spaces so, for example, a line in listOfFiles.txt containing a name like ./this is a file.xml ends up trying to run grep on each piece (./this, is, a and file.xml).

I thought I was a relatively advanced Bash user, but I cannot find a simple magic incantation to get this to work. Here are the things I've tried.

grep -li 'regex' `cat listOfFiles.txt`

Fails as described above (I didn't really expect this to work), so I thought I'd put quotes around each filename:

grep -li 'regex' `sed -e 's/.*/"&"/' listOfFiles.txt`

Bash interprets the quotes as part of the filename and gives "No such file or directory" for each file (and still splits the filenames at the blanks).

for i in $(<listOfFiles.txt); do grep -li 'regex' "$i"; done

This fails in the same way as the original attempt (that is, it behaves as if the quotes are ignored) and is very slow, since it has to launch one 'grep' process per file instead of processing all files in one invocation.

The following works, but requires some careful double-escaping if the regular expression contains shell metacharacters:

eval grep -li 'regex' `sed -e 's/.*/"&"/' listOfFiles.txt`

Is this the only way to construct the command line so it will correctly handle filenames with spaces?

Tags: bash command-line

6 Answers

三岁会撩人

Reply #2 · 2019-01-08 13:07

cat listOfFiles.txt |tr '\n' '\0' |xargs -0 grep -li 'regex'

The -0 option on xargs tells xargs to use a null character rather than white space as a filename terminator. The tr command converts the incoming newlines to a null character.

This meets the OP's requirement that grep not be invoked multiple times. It has been my experience that for a large number of files avoiding the multiple invocations of grep improves performance considerably.

This scheme also avoids a bug in the OP's original method: that approach breaks when listOfFiles.txt contains so many files that the expanded command line exceeds the system's maximum argument length. xargs knows about that limit and will invoke grep multiple times to stay under it.

A related problem with xargs and grep is that grep prefixes each output line with the filename only when it is invoked with more than one file. Because xargs usually passes several files, the output is normally prefixed, but not when listOfFiles.txt contains a single file, or when xargs splits the list into several invocations and the last one receives only one filename. To get consistent output, add /dev/null to the grep command:

cat listOfFiles.txt |tr '\n' '\0' |xargs -0 grep -i 'regex' /dev/null

Note that this was not an issue for the OP because he was using the -l option on grep; however, it is likely to be an issue for others.
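As a sketch, the same pipeline can be tried on a scratch directory. The directory, the file names and the pattern needle below are made up for the demo, and tr reads the list directly instead of going through cat:

```shell
# Hypothetical demo data: two files whose names contain spaces.
demo_dir=$(mktemp -d)
printf 'needle\n' > "$demo_dir/this is a file.xml"
printf 'hay\n'    > "$demo_dir/another file.xml"
printf '%s\n' "$demo_dir/this is a file.xml" "$demo_dir/another file.xml" \
  > "$demo_dir/listOfFiles.txt"

# Newlines become NUL bytes, so xargs -0 hands each line to grep
# as exactly one argument, spaces included.
matches=$(tr '\n' '\0' < "$demo_dir/listOfFiles.txt" | xargs -0 grep -li 'needle')

# Prints only the path of "this is a file.xml".
printf '%s\n' "$matches"
```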

看我几分像从前

Reply #3 · 2019-01-08 13:13

Try this:

(IFS=$'\n'; grep -li 'regex' $(<listOfFiles.txt))

IFS is the Internal Field Separator. Setting it to $'\n' tells Bash to use the newline character to delimit filenames. Its default value is $' \t\n' and can be printed using cat -etv <<<"$IFS".

Enclosing the command in parentheses starts a subshell, so that only commands within the parentheses are affected by the custom IFS value.
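One caveat: an unquoted expansion is also subject to pathname expansion, so a listed name containing *, ? or [ could still be rewritten by the shell. A minimal sketch that also turns globbing off with set -f (the scratch directory, file names and the pattern needle are made up for the demo):

```shell
#!/usr/bin/env bash
# Hypothetical demo data: two matching files with spaces in their names.
demo_dir=$(mktemp -d)
printf 'needle\n' > "$demo_dir/a b.txt"
printf 'needle\n' > "$demo_dir/c d.txt"
printf '%s\n' "$demo_dir/a b.txt" "$demo_dir/c d.txt" > "$demo_dir/listOfFiles.txt"

# The command substitution runs in a subshell, so the IFS change and
# set -f do not leak into the calling shell.
matches=$(
  IFS=$'\n'   # split the list on newlines only
  set -f      # disable pathname expansion (globbing)
  grep -li 'needle' $(<"$demo_dir/listOfFiles.txt")
)

printf '%s\n' "$matches"
```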

混吃等死

Reply #4 · 2019-01-08 13:15

Do note that if your list somehow ended up in a file with Windows line endings (\r\n), none of the notes above about the field separator $IFS (and quoting the argument) will work, so make sure the line endings are plain \n. (I use SciTE to show the line endings and to convert them from one style to the other.)

Also, cat piped into while read file ... seems to work (apparently without needing to set separators):
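If converting the file in an editor is not convenient, one option is to delete the carriage returns on the fly with tr -d '\r'. A minimal sketch, combined here with the xargs -0 pipeline from the earlier answer (the scratch directory, file name and the pattern needle are made up for the demo):

```shell
# Hypothetical demo data: a list file written with Windows (\r\n) line endings.
demo_dir=$(mktemp -d)
printf 'needle\n' > "$demo_dir/a b.txt"
printf '%s\r\n' "$demo_dir/a b.txt" > "$demo_dir/listOfFiles.txt"

# Drop every carriage return, then turn newlines into NUL separators.
matches=$(tr -d '\r' < "$demo_dir/listOfFiles.txt" \
  | tr '\n' '\0' \
  | xargs -0 grep -li 'needle')

printf '%s\n' "$matches"
```

Without the tr -d '\r' step, grep would be asked to open a name ending in a carriage return, which does not exist.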

cat <(echo -e "AA AA\nBB BB") | while read -r file; do echo "$file"; done

... although for me it was more relevant for a "grep" through a directory with spaces in filenames:

grep -rlI 'search' "My Dir"/ | while read -r file; do echo "$file"; grep 'search\|else' "$file"; done

唯我独甜

Reply #5 · 2019-01-08 13:17

Though it may overmatch, this is my favorite solution:

grep -i 'regex' $(cat listOfFiles.txt | sed -e "s/ /?/g")
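The overmatching comes from ? being a glob that matches any single character, not only a space. A small sketch of the effect, using made-up file names in a scratch directory (it assumes the temporary directory path itself contains no spaces or glob characters):

```shell
#!/usr/bin/env bash
# Hypothetical demo data: one name with a space, one with a dash.
demo_dir=$(mktemp -d)
touch "$demo_dir/a b.txt" "$demo_dir/a-b.txt"

# Same substitution as in the answer: every space becomes "?".
pattern=$(printf '%s\n' "$demo_dir/a b.txt" | sed -e 's/ /?/g')

# Left unquoted on purpose so that the shell expands the glob.
expanded=( $pattern )

# Both files match "a?b.txt", so this prints 2.
echo "${#expanded[@]}"
```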


ら.Afraid

Reply #6 · 2019-01-08 13:22

This works:

while read file; do grep -li dtw "$file"; done < listOfFiles.txt
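A plain read still trims leading and trailing whitespace and treats backslashes as escape characters. A slightly more defensive sketch of the same loop uses IFS= read -r so that each line is taken verbatim (the scratch directory, file name and the pattern needle are made up for the demo). It still starts one grep per file:

```shell
# Hypothetical demo data: a file whose name starts with a space.
demo_dir=$(mktemp -d)
printf 'needle\n' > "$demo_dir/ leading space.txt"
printf '%s\n' ' leading space.txt' > "$demo_dir/listOfFiles.txt"

matches=$(
  cd "$demo_dir" || exit 1
  # IFS= keeps surrounding whitespace, -r keeps backslashes literal,
  # and -- stops grep from treating a name starting with "-" as an option.
  while IFS= read -r file; do
    grep -li 'needle' -- "$file"
  done < listOfFiles.txt
)

printf '%s\n' "$matches"
```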


爱情/是我丢掉的垃圾

Reply #7 · 2019-01-08 13:23

With Bash 4, you can also use the builtin mapfile command to fill an array with one element per line and then iterate over this array:

$ tree
.
├── a
│   ├── a 1
│   └── a 2
├── b
│   ├── b 1
│   └── b 2
└── c
    ├── c 1
    └── c 2

3 directories, 6 files
$ mapfile -t files < <(find . -type f)
$ for file in "${files[@]}"; do
> echo "file: $file"
> done
file: ./a/a 2
file: ./a/a 1
file: ./b/b 2
file: ./b/b 1
file: ./c/c 2
file: ./c/c 1
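Applied to the original question, the array can be filled from the list file and handed to a single grep. This is only a sketch: the scratch directory, file names and the pattern needle are made up for the demo, and unlike xargs it does not protect against a list so long that it exceeds the maximum command-line length.

```shell
#!/usr/bin/env bash
# Hypothetical demo data: two files with spaces in their names.
demo_dir=$(mktemp -d)
printf 'needle\n' > "$demo_dir/a 1"
printf 'hay\n'    > "$demo_dir/a 2"
printf '%s\n' "$demo_dir/a 1" "$demo_dir/a 2" > "$demo_dir/listOfFiles.txt"

# One array element per line; -t strips the trailing newline (Bash 4+).
mapfile -t files < "$demo_dir/listOfFiles.txt"

# "${files[@]}" expands to one argument per element, so grep runs once
# and every filename keeps its spaces.
matches=$(grep -li 'needle' -- "${files[@]}")

printf '%s\n' "$matches"
```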
