Find a string inside gzipped files in a folder

Posted 2020-02-08 03:56

My current problem is that I have around 10 folders, each containing gzipped files (about 5 per folder on average). That makes roughly 50 files to open and look at.

Is there a simpler method to find out if a gzipped file inside a folder has a particular pattern or not?

zcat ABC/myzippedfile1.txt.gz | grep "pattern match"
zcat ABC/myzippedfile2.txt.gz | grep "pattern match"

Instead of writing a script, can I do the same in a single line, for all the folders and subfolders?

for f in `ls *.gz`; do echo $f; zcat $f | grep <pattern>; done;

7 answers
我欲成王,谁敢阻挡
Reply #2 · 2020-02-08 04:18

Coming in a bit late on this: I had a similar problem and was able to resolve it using:

zcat -r /some/dir/here | grep "blah"

As detailed here:

http://manpages.ubuntu.com/manpages/quantal/man1/gzip.1.html

However, this does not show the original file that the result matched from, instead showing "(standard input)" as it's coming in from a pipe. zcat does not seem to support outputting a name either.

In terms of performance, this is what we got:

$ alias dropcache="sync && echo 3 > /proc/sys/vm/drop_caches"

$ find 09/01 | wc -l
4208

$ du -chs 09/01
24M

$ dropcache; time zcat -r 09/01 > /dev/null
real    0m3.561s

$ dropcache; time find 09/01 -iname '*.txt.gz' -exec zcat '{}' \; > /dev/null
real    0m38.041s

As you can see, using the find|zcat method is significantly slower than using zcat -r when dealing with even a small volume of files. I was also unable to make zcat output the file name (using -v will apparently output the filename, but not on every single line). It would appear that there isn't currently a tool that will provide both speed and name consistency with grep (i.e. the -H option).

If you need to identify the name of the file that the result belongs to, then you'll need to either write your own tool (could be done in 50 lines of Python code) or use the slower method. If you do not need to identify the name, then use zcat -r.
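If a separate tool feels like too much, a small shell function can also produce the file name on every matching line. The following is only a sketch: the name gzsearch and its two arguments are made up for illustration. It decompresses each file in turn, so it starts one gzip process per file and will not match the speed of zcat -r.

```shell
# Hypothetical helper (not a standard command).
# Usage: gzsearch PATTERN [DIR]
# Prints every matching line as "path/to/file.gz:line".
gzsearch() {
    find "${2:-.}" -type f -name '*.gz' -exec sh -c '
        pattern=$1
        shift
        for f in "$@"; do
            # Decompress to stdout, keep matching lines, prefix the file name.
            gzip -cd -- "$f" | grep -e "$pattern" |
                while IFS= read -r line; do
                    printf "%s:%s\n" "$f" "$line"
                done
        done
    ' sh "$1" {} +
}
```

For example, gzsearch "blah" /some/dir/here would search every .gz file below that directory.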

Hope this helps

Rolldiameter
Reply #3 · 2020-02-08 04:23

Use the find command:

find . -name "*.gz" -exec zcat "{}" + |grep "test"

Or try the recursive option (-r) of zcat.

霸刀☆藐视天下
Reply #4 · 2020-02-08 04:23

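zgrep searches gzipped files. Where the installed zgrep passes these options through to grep, -R recurses and -H prints the file name (some implementations, such as the zgrep script shipped with GNU gzip, reject -R and --include):

zgrep -R --include='*.gz' -H "pattern match" .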
太酷不给撩
Reply #5 · 2020-02-08 04:31

zgrep "string" ./*/*

Run from inside dir, the command above searches for the string in the .gz files one level down, where dir has the following subdirectory structure:

/dir
    /childDir1
              /file1.gz
              /file2.gz
    /childDir2
              /file3.gz
              /file4.gz
    /childDir3
              /file5.gz
              /file6.gz
Lonely孤独者°
Reply #6 · 2020-02-08 04:35

find . -name "*.gz"|xargs zcat | grep "pattern" should do.
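If any file or folder name contains spaces, plain xargs will split it into several arguments. A NUL-delimited variant of the same pipeline avoids that. This is a sketch: the wrapper name gz_grep_all is made up for illustration, and -print0 and -0 are widespread extensions (GNU, BSD) rather than strict POSIX.

```shell
# Hypothetical wrapper (not a standard command).
# Usage: gz_grep_all PATTERN [DIR]
gz_grep_all() {
    # File names are passed NUL-terminated, so spaces and newlines are safe.
    find "${2:-.}" -type f -name '*.gz' -print0 | xargs -0 zcat | grep -e "$1"
}
```

As with the original pipeline, the output does not say which file a line came from.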

等我变得足够好
Reply #7 · 2020-02-08 04:41

You don't need zcat here because there is zgrep and zegrep.

If you want to run a command over a directory hierarchy, you use find:

find . -name "*.gz" -exec zgrep ⟨pattern⟩ \{\} \;

Also, "ls *.gz" is unnecessary in the for loop; just use "*.gz" directly in the future.
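A minimal sketch of the loop from the question with the glob used directly. The function name grep_gz_here is made up for illustration. Quoting "$f" keeps names with spaces intact, and the existence check covers the case where no .gz file is present, in which case the shell leaves the pattern unexpanded.

```shell
# Hypothetical helper (not a standard command).
# Usage: grep_gz_here PATTERN   (searches *.gz in the current directory)
grep_gz_here() {
    for f in *.gz; do
        # Skip the literal "*.gz" that is left when nothing matches.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
        gzip -cd -- "$f" | grep -e "$1"
    done
}
```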
