Search files and run a script on every result

Posted 2019-09-04 10:12

Question:

I would like to know how to search for files matching a certain pattern in all subdirectories (the subdirectories are created per month or per date), and then run a script on the files that are found.

Step 1: For example, I am currently searching for files that match the pattern TT_DETAIL*.gz.

find /cygdrive/c/Test/  -name TT_DETAIL*.gz

Output #1:

/cygdrive/c/Test/Feb2014/TT_DETAIL_20141115.csv.gz
/cygdrive/c/Test/Jan2014/TT_DETAIL_20141110.csv.gz
/cygdrive/c/Test/Mar2014/TT_DETAIL_20141120.csv.gz

Step 2:

zcat TT_DETAIL*.gz | awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >Op_TT_Detail.txt

cat Op_TT_Detail.txt

01-NOV-2014 00:30:53,AAA,ECH,1,10,XXX
01-NOV-2014 00:39:36,BBB,ECH,1,10,XXX
01-NOV-2014 00:39:44,CCC,ECH,1,10,XXX
01-NOV-2014 00:25:20,DDD,ECH,1,10,XXX

I have tried to combine the commands as shown below, but the steps are incomplete:

find /cygdrive/c/Test/  -name TT_DETAIL*.gz | xargs | zcat | awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >>Op_TT_DETAIL.txt

find /cygdrive/c/Test/  -name TT_DETAIL*.gz -exec ./MonthwiseData.sh {} \;

where MonthwiseData.sh contains:

zcat *.gz | awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >>Op_TT_Detail.txt

How can I do this?

Answer 1:

I think a while loop is the best way to handle these situations:

while IFS= read -r file
do
   awk '...' <(zcat "$file")
done < <(find . -type f -name "*gz")

The output of the find command is fed to a while loop. This way, you can process each file separately.

Then it is a matter of running either awk '...' <(zcat "$file") or zcat "$file" | awk '...'.

In your case:

while IFS= read -r file
do
   awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' <(zcat "$file") >>Op_TT_Detail.txt
done < <(find /cygdrive/c/Test/ -name "TT_DETAIL*.gz")
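
The -exec attempt from the question can be made to work as well: find passes each matching path to the script as its first argument, so the script should decompress "$1" rather than *.gz. A minimal sketch, assuming a scratch directory and two made-up sample files (the file names, the data and the shortened 3-column filter are illustrative only; gzip -dc is used as an equivalent spelling of zcat):

```shell
#!/bin/sh
# Scratch area with hypothetical sample data (not from the question).
work=$(mktemp -d)
mkdir -p "$work/Jan2014" "$work/Feb2014"
printf 'a,10,x\nb,20,y\n' | gzip > "$work/Jan2014/TT_DETAIL_1.csv.gz"
printf 'c,10,z\n'         | gzip > "$work/Feb2014/TT_DETAIL_2.csv.gz"

# Per-file script: find hands it one path, available as "$1".
cat > "$work/MonthwiseData.sh" <<'EOF'
#!/bin/sh
gzip -dc "$1" | awk 'BEGIN { FS = OFS = "," } $2 == "10" { print $1, $3 }'
EOF

# Run the script once per match; redirect the combined output once, outside find.
find "$work" -name 'TT_DETAIL*.gz' -exec sh "$work/MonthwiseData.sh" {} \; \
    > "$work/Op_TT_Detail.txt"

cat "$work/Op_TT_Detail.txt"
```

Keeping the redirection outside the script means the output file is opened once, and the script itself stays reusable for a single file.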

Test

We have some gz files in the current directory:

$ for f in *gz; do echo "-- $f --"; zcat "$f"; done
-- a.gz --
hello
bye
-- b.gz --
thisisB
bye

Let's find them and print just the first field on the first line:

$ while IFS= read -r file; do awk 'NR==1{print $1}' <(zcat "$file") >> output; done < <(find . -type f -name "*gz")

And the output is:

$ cat output 
thisisB
hello
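
Note that thisisB (from b.gz) was written before hello (from a.gz): find lists files in whatever order the directory returns them, not alphabetically. If the order of the combined output matters, one option is to sort the list before looping. A sketch that recreates the two sample files in a scratch directory, assuming no file name contains a newline:

```shell
#!/bin/sh
# Recreate the two sample files from the test above in a scratch directory.
demo=$(mktemp -d)
printf 'hello\nbye\n'   | gzip > "$demo/a.gz"
printf 'thisisB\nbye\n' | gzip > "$demo/b.gz"

# Sort the paths so the files are processed in a predictable order.
find "$demo" -type f -name '*gz' | sort |
while IFS= read -r file
do
    gzip -dc "$file" | awk 'NR==1 { print $1 }'
done > "$demo/output"

cat "$demo/output"
```

Here a.gz is processed first, so hello comes before thisisB in the output file.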

I think you are looking for something like this:

find /cygdrive/c/Test/ -name "TT_DETAIL*.gz" -print0 | \
  xargs -0 -I file zcat file | \
  awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >>Op_TT_Detail.txt
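  • find finds the files, and -print0 prints each name followed by a NUL character instead of a newline.
  • xargs -0 reads those NUL-separated names from the pipe. With -I file, each name is substituted wherever file appears in the command, so zcat runs once per file and the combined output is piped to awk.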
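
The NUL separator matters when a path contains spaces or other whitespace. A small sketch with a hypothetical directory name containing a space (the data is made up, gzip -dc stands in for zcat, and -print0 and -0 are assumed to be available, as they are in GNU and BSD find and xargs):

```shell
#!/bin/sh
# Hypothetical directory name, chosen to contain a space.
dir=$(mktemp -d)
mkdir "$dir/Jan 2014"
printf 'x,10\n' | gzip > "$dir/Jan 2014/TT_DETAIL_demo.csv.gz"

# NUL-separated names survive the space; each name reaches gzip as one argument.
find "$dir" -name 'TT_DETAIL*.gz' -print0 |
    xargs -0 -I file gzip -dc file > "$dir/out.txt"

cat "$dir/out.txt"
```

Without -0, xargs treats blanks, quotes and backslashes in its input specially, so names like this one may be split or mangled.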

Interesting reading: xargs: How To Control and Use Command Line Arguments.



Answer 2:

You can enclose the find command in backticks (command substitution) to create an argument list, like:

awk '{print $0}' `find . -type f -name 'file*'` > concat_files.txt

This simple example just concatenates all files whose names start with "file". It is the backticks I want to emphasize here.
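
One caveat with backticks: the shell splits the substituted output on whitespace, so a file name containing a space becomes two arguments. A sketch of an alternative that lets find build the argument list itself with -exec ... {} + (the file names and contents here are made up):

```shell
#!/bin/sh
# Hypothetical files, one with a space in its name.
tmp=$(mktemp -d)
printf 'one\n' > "$tmp/file a.txt"
printf 'two\n' > "$tmp/fileb.txt"

# find passes the names straight to awk, so no word splitting takes place.
find "$tmp" -type f -name 'file*' -exec awk '{ print $0 }' {} + \
    > "$tmp/concat_files.txt"

cat "$tmp/concat_files.txt"
```

The output file is named concat_files.txt, which does not match file*, so find does not feed it back into awk.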