I would like to know how to search for files matching a certain pattern in all subdirectories (month-wise / date-wise subdirectories have been created), and then execute a script on the files found.
Step1: For example, I am currently searching for files matching the pattern TT_DETAIL*.gz.
find /cygdrive/c/Test/ -name TT_DETAIL*.gz
output#1:
/cygdrive/c/Test/Feb2014/TT_DETAIL_20141115.csv.gz
/cygdrive/c/Test/Jan2014/TT_DETAIL_20141110.csv.gz
/cygdrive/c/Test/Mar2014/TT_DETAIL_20141120.csv.gz
Step2:
zcat TT_DETAIL*.gz | awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >Op_TT_Detail.txt
cat Op_TT_Detail.txt
01-NOV-2014 00:30:53,AAA,ECH,1,10,XXX
01-NOV-2014 00:39:36,BBB,ECH,1,10,XXX
01-NOV-2014 00:39:44,CCC,ECH,1,10,XXX
01-NOV-2014 00:25:20,DDD,ECH,1,10,XXX
I have tried to combine the commands as shown below, but the steps are incomplete:
find /cygdrive/c/Test/ -name TT_DETAIL*.gz | xargs | zcat | awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >>Op_TT_DETAIL.txt
find /cygdrive/c/Test/ -name TT_DETAIL*.gz -exec ./MonthwiseData.sh {} \;
where MonthwiseData.sh contains:
zcat *.gz | awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >>Op_TT_Detail.txt
How can I do this?
I think a while loop is the best way to handle these situations:
while IFS= read -r file
do
awk '...' <(zcat "$file")
done < <(find . -type f -name "*gz")
Here the output of the find command is fed to a while loop, so each file is processed separately.
Then it is just a matter of running awk '...' <(zcat "$file") or, equivalently, zcat "$file" | awk '...'.
In your case:
while IFS= read -r file
do
awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' <(zcat "$file") >>Op_TT_Detail.txt
done < <(find /cygdrive/c/Test/ -name "TT_DETAIL*.gz")
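The loop above can be tried without touching the real data. What follows is only a minimal sketch: the temporary directory, the sample rows and the `sort` step (added purely to make the order reproducible) are assumptions for illustration. It uses `gzip -dc`, which should behave like `zcat` here, and feeds the loop through a pipe instead of process substitution so that it should also run under a plain `sh`.

```shell
# Build a throwaway tree that mimics the month-wise layout (sample data is made up).
tmp=$(mktemp -d)
mkdir -p "$tmp/Jan2014" "$tmp/Feb2014"

# 17 comma-separated fields per row; field 11 is the one being filtered on.
printf '%s\n' \
  'r1,01-NOV-2014 00:30:53,AAA,x,x,ECH,x,x,x,1,10,x,x,x,x,x,XXX' \
  'r2,01-NOV-2014 00:31:00,ZZZ,x,x,ECH,x,x,x,1,20,x,x,x,x,x,YYY' \
  | gzip > "$tmp/Jan2014/TT_DETAIL_20141110.csv.gz"
printf '%s\n' \
  'r3,01-NOV-2014 00:39:36,BBB,x,x,ECH,x,x,x,1,10,x,x,x,x,x,XXX' \
  | gzip > "$tmp/Feb2014/TT_DETAIL_20141115.csv.gz"

out="$tmp/Op_TT_Detail.txt"
: > "$out"   # start with an empty output file

# One decompress-and-filter pass per file found; the pattern is quoted so
# that find, not the shell, expands it.
find "$tmp" -type f -name 'TT_DETAIL*.gz' | sort |
while IFS= read -r file
do
  gzip -dc "$file" |
    awk 'BEGIN { FS=OFS="," } $11 == "10" { print $2, $3, $6, $10, $11, $17 }'
done >> "$out"

cat "$out"
```

Only the rows whose 11th field is 10 should end up in the output file, one group per archive.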
Test
We have some gz files in the current directory:
$ for f in *gz; do echo "-- $f --"; zcat "$f"; done
-- a.gz --
hello
bye
-- b.gz --
thisisB
bye
Let's find them and print just the first field of the first line of each file:
$ while IFS= read -r file; do awk 'NR==1{print $1}' <(zcat "$file") >> output; done < <(find . -type f -name "*gz")
And the output is:
$ cat output
thisisB
hello
I think you are looking for something like this:
find /cygdrive/c/Test/ -name "TT_DETAIL*.gz" -print0 | \
xargs -0 -I file zcat file | \
awk 'BEGIN { FS=OFS=","} { if ($11=="10") print $2,$3,$6,$10,$11,$17}' >>Op_TT_Detail.txt
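find locates the files, and -print0 prints each name terminated by a NUL character instead of a newline.
xargs -0 reads that NUL-separated list from the pipe, so names containing spaces are handled safely. With -I file we give each name the placeholder file, so that xargs runs zcat file once per name and the decompressed output is piped to awk.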
Interesting reading: xargs: How To Control and Use Command Line Arguments.
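Since the decompressor accepts several file names at once, the -I placeholder is not strictly needed: xargs -0 can simply append the names in batches. Below is a small sketch of that variant, not a definitive implementation. The sample files and their two-column layout are made up for illustration, `gzip -dc` stands in for `zcat`, and it assumes a find/xargs pair that supports `-print0` and `-0` (GNU and BSD versions do).

```shell
# Throwaway sample data; one directory name contains a space on purpose.
tmp=$(mktemp -d)
mkdir -p "$tmp/Mar 2014"
printf 'k1,10\nk2,20\n' | gzip > "$tmp/Mar 2014/TT_DETAIL_a.csv.gz"
printf 'k3,10\n'        | gzip > "$tmp/TT_DETAIL_b.csv.gz"

# NUL-separated names go to xargs, which hands them to gzip in batches;
# awk then sees one concatenated stream of decompressed rows.
find "$tmp" -type f -name 'TT_DETAIL*.gz' -print0 |
  xargs -0 gzip -dc |
  awk 'BEGIN { FS=OFS="," } $2 == "10" { print $1 }' |
  sort > "$tmp/keys.txt"

cat "$tmp/keys.txt"
```

Because awk receives a single stream, per-file logic such as NR==1 would apply only to the very first row overall; for the plain row filter in the question that makes no difference.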
You can enclose the find command in backticks (command substitution) to create an argument list, like:
awk '{print $0}' `find . -type f -name 'file*'` > concat_files.txt
This simple example just concatenates all files whose names start with "file". It is the backticks I want to emphasize here.
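The same idea can be written with `$(...)`, the newer spelling of backticks. The sketch below uses made-up sample files in a temporary directory, with `sort` added only for a stable order. One caveat: the substituted text is split on whitespace by the shell, so this approach is only safe when none of the paths contain spaces or glob characters.

```shell
# Two small sample files (hypothetical names and contents).
tmp=$(mktemp -d)
printf 'one\n' > "$tmp/file1.txt"
printf 'two\n' > "$tmp/file2.txt"

# The substitution is deliberately left unquoted so that it splits into
# one argument per file; quoting it would pass a single, invalid name.
awk '{ print $0 }' $(find "$tmp" -type f -name 'file*' | sort) > "$tmp/concat_files.txt"

cat "$tmp/concat_files.txt"
```

For file names that may contain whitespace, the while loop or the -print0 / xargs -0 pipeline from the other answers is the more robust choice.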