When I want to grep all the html files in some directory, I do the following
grep --include="*.html" pattern -R /some/path
which works well. But how do I grep all the html, htm, and php files in some directory?
From the question "Use grep --exclude/--include syntax to not grep through certain files", it seems that I can do the following:
grep --include="*.{html,php,htm}" pattern -R /some/path
But sadly, it does not work for me.
FYI, my grep version is 2.5.1.
You can use multiple --include flags. This works for me:
grep -r --include=*.html --include=*.php --include=*.htm "pattern" /some/path/
However, you can also do as Deruijter suggested. This works for me:
grep -r --include=*.{html,php,htm} "pattern" /some/path/
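To check that the repeated --include flags and the brace-expansion form select the same files, here is a small sketch against a throwaway directory tree. The file names and the search word needle are made up for illustration, and it assumes bash plus a grep that supports --include (GNU grep does).

```shell
#!/usr/bin/env bash
# Build a scratch tree (hypothetical file names, for illustration only).
dir=$(mktemp -d)
mkdir -p "$dir/sub"
echo 'needle' > "$dir/a.html"
echo 'needle' > "$dir/sub/b.php"
echo 'needle' > "$dir/sub/c.htm"
echo 'needle' > "$dir/d.txt"      # should be skipped by both commands

# Form 1: one --include per extension; -l lists matching file names only.
multi=$(grep -rl --include='*.html' --include='*.php' --include='*.htm' needle "$dir" | sort)

# Form 2: unquoted brace expansion; the shell turns it into three --include options.
brace=$(grep -rl --include=*.{html,php,htm} needle "$dir" | sort)

echo "$multi"
[ "$multi" = "$brace" ] && echo "both forms matched the same files"
rm -rf "$dir"
```

Both commands should list the three html/php/htm files and leave d.txt out.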
Don't forget that you can use find and xargs for this sort of thing too (-print0 and -0 keep file names containing spaces intact):
find /some/path/ \( -name "*.htm*" -o -name "*.php" \) -print0 | xargs -0 grep "pattern"
HTH
Using {html,php,htm} can only work as a brace expansion, which is a nonstandard (not POSIX-compliant) feature of bash, ksh, and zsh.
For a brace expansion to be recognized, it must be an unquoted (part of a) token on the command line.
A brace expansion expands to multiple arguments, so in the case at hand grep ends up seeing multiple --include=... options, just as if you had passed them individually.
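The difference between the quoted and unquoted forms is easy to see by letting printf show the words the shell would hand to grep. This is a minimal sketch that assumes bash, the default (off) nullglob setting, and a current directory with no file whose name happens to match --include=*.html and friends.

```shell
#!/usr/bin/env bash
# Quoted: the braces stay literal, so grep would receive ONE odd-looking option.
quoted=$(printf '%s\n' --include="*.{html,php,htm}")

# Unquoted: bash expands the braces into three separate words.
unquoted=$(printf '%s\n' --include=*.{html,php,htm})

printf 'quoted:\n%s\n\nunquoted:\n%s\n' "$quoted" "$unquoted"
```

The quoted variant prints a single line containing the literal braces, which is why the double-quoted command in the question matches no files; the unquoted variant prints one --include=... line per extension.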
The results of a brace expansion are subject to globbing (filename expansion), which has pitfalls:
Each resulting argument could further be expanded to matching filenames if it happens to contain unquoted globbing metacharacters such as *. While this is unlikely with tokens such as --include=*.html (e.g., you'd have to have a file literally named something like --include=foo.html for something to match), it is worth keeping in mind in general.
If the nullglob shell option happens to be turned on (shopt -s nullglob) and globbing matches nothing, the argument will be discarded.
Therefore, for a fully robust solution, use the following:
grep -R '--include=*.'{html,php,htm} pattern /some/path
'--include=*.' is treated as a literal, due to being single-quoted; this prevents inadvertent interpretation of * as a globbing character.
{html,php,htm}, the (necessarily unquoted) brace expansion[1], expands to 3 arguments. Because {...} directly follows the '...' token, each of them includes that token as a prefix.
Therefore, after quote removal by the shell, the following 3 literal arguments are ultimately passed to grep:
--include=*.html
--include=*.php
--include=*.htm
[1] More accurately, only the syntax-relevant parts of the brace expansion (the braces and commas) must be unquoted. The list elements may still be individually quoted, and must be if they contain globbing metacharacters that could result in unwanted globbing after the brace expansion. While not necessary in this case, the above could be written as
'--include=*.'{'html','php','htm'}
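The nullglob pitfall described above, and how the single-quoted prefix sidesteps it, can be reproduced by counting the words the shell produces. This is only a sketch: it assumes bash, runs in an empty scratch directory so that the globs cannot match anything, and the variable names are made up.

```shell
#!/usr/bin/env bash
# Run in an empty scratch directory so the globs cannot match anything.
cd "$(mktemp -d)" || exit 1
shopt -s nullglob

# Unquoted '*': each expanded word is a glob that matches nothing, so
# nullglob removes it and no --include option is left.
set -- --include=*.{html,php,htm}
unquoted_count=$#

# Quoted prefix: '*' is literal, so all three words survive.
set -- '--include=*.'{html,php,htm}
quoted_count=$#

echo "unquoted: $unquoted_count argument(s), quoted prefix: $quoted_count argument(s)"
shopt -u nullglob
```

With nullglob on, the unquoted form leaves grep with no --include option at all, so it would silently search every file; the quoted-prefix form still yields three options.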
Try removing the double quotes:
grep --include=*.{html,php,htm} pattern -R /some/path
Is this not working? (Note that without -R it only searches files directly in /some/path, not in subdirectories.)
grep pattern /some/path/*.{html,php,htm}
Try this.
-r will do a recursive search.
-s will suppress file not found errors.
-n will show you the line number of the file where the pattern is found.
grep "pattern" <path> -r -s -n --include=*.{c,cpp,C,h}
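Use grep with the find command. The -name tests must be grouped with \( ... \); otherwise -type f and -exec apply only to the last -name:
find /some/path -type f \( -name '*.html' -o -name '*.htm' -o -name '*.php' \) -exec grep PATTERN {} +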
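When several -name tests are joined with -o, find's implicit "and" binds more tightly than -o, so grouping with escaped parentheses matters. The sketch below illustrates the grouped form on a scratch tree; the file names, the directory deliberately named archive.html, and the search word needle are all made up, and bash is assumed.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
mkdir "$dir/archive.html"                 # a DIRECTORY whose name ends in .html
echo 'needle' > "$dir/index.html"
echo 'needle' > "$dir/archive.html/page.php"
echo 'needle' > "$dir/readme.txt"

# Grouped: -type f and -exec apply to all three name tests.
hits=$(find "$dir" -type f \( -name '*.html' -o -name '*.htm' -o -name '*.php' \) \
         -exec grep -l needle {} + | sort)
echo "$hits"
rm -rf "$dir"
```

Only index.html and page.php are handed to grep; the directory archive.html is filtered out by -type f, and readme.txt by the name tests.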
You can use the -regex and -regextype options too.
It serves the same purpose, but without the --include option. It works on grep 2.5.1 as well.
grep -v -E ".*\.(html|htm|php)"