I am facing a few problems while extracting blocks of lines from a file. Consider the following two files.
File-1
1.20/abc/this_is_test_1
perl/RRP/RRP-1.30/JEDI/JEDIExportSuccess2
exec perl/RRP/RRP-1.30/JEDI/CommonReq/confAbvExp
perl/LRP/BaseLibs/close-MMM
exec perl/LRP/BaseLibs/launchLRPCHURRTA("TYRE")
this/or/that
File-2
exec 1.20/setup/testird
exec 1.20/sql/temp/Test3
exec 1.20/setup/testxyz
exec 1.20/sql/fondle_opr_sql_labels
exec 1.20/setup/testird
exec 1.20/sql/temp/NEWTest
exec 1.20/setup/testxyz
exec 1.20/sql/fondle_opr_sql_xfer
exec 1.20/setup/testird
exec 1.20/sql/set_sec_not_0
exec 1.20/setup/testpqr
exec 1.20/sql/sql_ba_statuses_on_mult
exec perl/RRP/SetupReq/testdef_ijk
exec perl/RRP/RRP-1.30/JEDI/SetupReq/confAbvExp
exec perl/RRP/RRP-1.30/JEDI/JEDIExportSuccess1
exec perl/RRP/SetupReq/testdef_ijk
exec perl/RRP/RRP-1.30/JEDI/SetupReq/confAbvExp
exec perl/RRP/RRP-1.30/JEDI/JEDIExportSuccess2
exec perl/RRP/SetupReq/testdef_ijk
exec perl/RRP/RRP-1.30/JEDI/SetupReq/confAbvExp
exec perl/RRP/RRP-1.30/JEDI/JEDIExportSuccess3
exec 1.20/setup/testird
exec 1.20/sql/sqlmenu_purr_labl
exec 1.20/sql/est_time_at_non_drp_plc
exec 1.20/sql/half_Brd_Supply_mix_single
exec 1.20/setup/testird
exec 1.20/sql/temp/Test
exec 1.20/setup/testird
exec 1.20/sql/temp/Test2
exec perl/LRP/SetupReq/testird_LRP("LRP")
exec perl/BaseLibs/launch_client("LRP")
exec perl/LRP/LRP-classic-4.14/churrip/chorSingle
exec perl/LRP/BaseLibs/setupLRPMMMTab
exec perl/LRP/BaseLibs/launchMMM
exec perl/LRP/BaseLibs/launchLRPCHURRTA("TYRE")
#PAUSE Expand Churrip tree view & open all nodes
exec perl/LRP/LRP-classic-4.14/Corrugator/multipleSeriesWeb
exec perl/BaseLibs/ShutApp("Self Destruction System")
exec perl/LRP/BaseLibs/close-MMM
exec 1.20/setup/testmiddle
exec 1.20/sql/collective_reads
exec 1.20/setup/testinit
exec 1.20/abc/this_is_test_1
exec 1.20/abc/this_is_test_1
exec perl/LRP/SetupReq/abcDEF
exec perl/BaseLibs/launch_client("sqlC","LRP")
exec perl/LRP/LRP-perl-4.20/fireTrigger
Now, for every line in File-1, I want to extract the relevant block of lines from File-2. A block in File-2 is defined as below:
exec 1.20/setup/xxxxx
blah blah blah
blah blah blah
.
.
.
all lines till next setup line is found
for example
exec 1.20/setup/testinit
exec 1.20/abc/this_is_test_1
exec 1.20/abc/this_is_test_1
or
exec perl/LRP/SetupReq/xxxxx
blah blah blah
blah blah blah
.
.
.
all lines till next setup line is found
for example
exec perl/LRP/SetupReq/testird_LRP("LRP")
exec perl/BaseLibs/launch_client("LRP")
exec perl/LRP/LRP-classic-4.14/churrip/chorSingle
exec perl/LRP/BaseLibs/setupLRPMMMTab
exec perl/LRP/BaseLibs/launchMMM
exec perl/LRP/BaseLibs/launchLRPCHURRTA("TYRE")
#PAUSE Expand Churrip tree view & open all nodes
exec perl/LRP/LRP-classic-4.14/Corrugator/multipleSeriesWeb
exec perl/BaseLibs/ShutApp("Self Destruction System")
exec perl/LRP/BaseLibs/close-MMM
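To make the block definition concrete, here is a minimal sketch (an illustration only, not part of my script) that marks where each block starts in a single pass over the file. The function name split_blocks and the header text are made up for this example. The pattern is the one my script further down uses to recognise setup lines, with [A-Za-z] written in place of [[:alpha:]] on the assumption that not every awk supports POSIX character classes.

```shell
#!/bin/sh
# split_blocks FILE
# Hypothetical helper: prints FILE with a numbered header before each block.
# A block starts at a setup line and runs until the next setup line.
split_blocks() {
    awk '
        # setup line: "exec 1.20/setup/testxxx" or "exec perl/XXX/SetupReq/xxx"
        /[Ss]etup.*[Tt]est/ || /perl\/[A-Za-z]*\/[Ss]etup[rR]eq/ {
            n++
            print "--- block " n " ---"
        }
        { print }
    ' "$1"
}
```

Calling split_blocks File-2 should print every line of File-2 unchanged, with one header line inserted in front of each setup line.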
So far I have managed to extract the relevant blocks from File-2 with the help of the following script:
Shell Script
#set -x
FLBATCHLIST=$1
BATCHFILE=$2
TEMPDIR="/usr/tmp/tempBatchDir"
rm -rf $TEMPDIR/*
WORKFILE="$TEMPDIR/failedTestList.txt"
CPBATCHFILE="$TEMPDIR/orig.test"
TESTSETFILE="$TEMPDIR/testset.txt"
TEMPFILE="$TEMPDIR/temp.txt"
DIFFFILE="$TEMPDIR/diff.txt"
#Output
FAILEDBATCH="$TEMPDIR/FailedBatch.test"
LOGFILE="$TEMPDIR/log.txt"
createBatch ()
{
TESTNAME=$1
#First process the $CPBATCHFILE to not have any blank lines, leading and trailing whitespaces
# delete BOTH leading and trailing whitespace from each line and blank lines from file
sed -i 's/^[[:space:]]*//;s/[[:space:]]*$//g;/^$/d' $CPBATCHFILE
FOUND=0
STATUS=1
while [ $STATUS -ne "0" ]
do
if [ ! -s $CPBATCHFILE ]; then
echo "$CPBATCHFILE is empty" >> $LOGFILE
STATUS=0
fi
awk '/[Ss]etup.*[Tt]est/ || /perl\/[[:alpha:]]*\/[Ss]etup[rR]eq/{if(b) exit; else b=1}1' $CPBATCHFILE > $TESTSETFILE
grep -i "$TESTNAME$" $TESTSETFILE >> $LOGFILE 2>&1
if [ $? -eq "0" ]; then
echo "test found" >> $LOGFILE
cat $TESTSETFILE >> $FAILEDBATCH
FOUND=1
fi
TSTFLLINES=`wc -l < $TESTSETFILE`
CPBTCHLINES=`wc -l < $CPBATCHFILE`
DIFF=`expr $CPBTCHLINES - $TSTFLLINES`
tail -n $DIFF $CPBATCHFILE > $DIFFFILE
mv $DIFFFILE $CPBATCHFILE
done
if [ $FOUND -eq 0 ]; then
echo $TESTNAME > $TEMPDIR/test.txt
ABSTEST=$(echo $TESTNAME | sed 's/\\//g')
echo "FATAL ERROR: Test \"$ABSTEST\" not found in batch" | tee -a $LOGFILE
fi
}
####STARTS HERE####
mkdir -p $TEMPDIR
#cat $TEMPDIR/test.txt
#FLBATCHLIST="$TEMPDIR/test.txt"
# delete the leading exec keyword, BOTH leading and trailing whitespace, and blank lines from file
sed 's/^[eE][xX][eE][cC]//g;s/^[[:space:]]*//;s/[[:space:]]*$//g;/^$/d' $FLBATCHLIST > $WORKFILE
# escaping special characters like '/', '.' and '"' in the path names for better grepping
sed -i 's/\([\/\.\"]\)/\\\1/g' $WORKFILE
for fltest in $(cat $WORKFILE)
do
echo $fltest >> $LOGFILE
cp $BATCHFILE $CPBATCHFILE
createBatch $fltest
done
sed -i 's/\//\\/g' $FAILEDBATCH
## Clean up
cp $FAILEDBATCH .
The problems with this script are:
It takes some time, because it traverses File-2 once for each line of File-1. I wanted to know if there is a better solution where I only have to traverse File-2 once.
The script does solve my problem, but I am left with a file that has duplicate blocks of lines in it. I wanted to know if there is a way to remove the duplicate blocks of lines.
This is the output I get when I execute the script:
exec 1.20\setup\testinit
exec 1.20\abc\this_is_test_1
exec 1.20\abc\this_is_test_1
exec perl\RRP\SetupReq\testdef_ijk
exec perl\RRP\RRP-1.30\JEDI\SetupReq\confAbvExp
exec perl\RRP\RRP-1.30\JEDI\JEDIExportSuccess2
exec perl\RRP\SetupReq\testdef_ijk
exec perl\RRP\RRP-1.30\JEDI\SetupReq\confAbvExp
exec perl\RRP\RRP-1.30\JEDI\JEDIExportSuccess1
exec perl\RRP\SetupReq\testdef_ijk
exec perl\RRP\RRP-1.30\JEDI\SetupReq\confAbvExp
exec perl\RRP\RRP-1.30\JEDI\JEDIExportSuccess2
exec perl\RRP\SetupReq\testdef_ijk
exec perl\RRP\RRP-1.30\JEDI\SetupReq\confAbvExp
exec perl\RRP\RRP-1.30\JEDI\JEDIExportSuccess3
exec perl\LRP\SetupReq\testird_LRP("LRP")
exec perl\BaseLibs\launch_client("LRP")
exec perl\LRP\LRP-classic-4.14\churrip\chorSingle
exec perl\LRP\BaseLibs\setupLRPMMMTab
exec perl\LRP\BaseLibs\launchMMM
exec perl\LRP\BaseLibs\launchLRPCHURRTA("TYRE")
#PAUSE Expand Churrip tree view & open all nodes
exec perl\LRP\LRP-classic-4.14\Corrugator\multipleSeriesWeb
exec perl\BaseLibs\ShutApp("Self Destruction System")
exec perl\LRP\BaseLibs\close-MMM
exec perl\LRP\SetupReq\testird_LRP("LRP")
exec perl\BaseLibs\launch_client("LRP")
exec perl\LRP\LRP-classic-4.14\churrip\chorSingle
exec perl\LRP\BaseLibs\setupLRPMMMTab
exec perl\LRP\BaseLibs\launchMMM
exec perl\LRP\BaseLibs\launchLRPCHURRTA("TYRE")
#PAUSE Expand Churrip tree view & open all nodes
exec perl\LRP\LRP-classic-4.14\Corrugator\multipleSeriesWeb
exec perl\BaseLibs\ShutApp("Self Destruction System")
exec perl\LRP\BaseLibs\close-MMM
I tried searching the net for answers but wasn't able to find one specific to my needs.
Given File-1 and File-2, here is what I expect my script to output (I have listed the output I expect for each line in FILE-1).
For line "1.20/abc/this_is_test_1" in FILE-1
Output
exec 1.20/setup/testinit
exec 1.20/abc/this_is_test_1
exec 1.20/abc/this_is_test_1
For line "perl/RRP/RRP-1.30/JEDI/JEDIExportSuccess2" in FILE-1
Output
exec perl/RRP/SetupReq/testdef_ijk
exec perl/RRP/RRP-1.30/JEDI/SetupReq/confAbvExp
exec perl/RRP/RRP-1.30/JEDI/JEDIExportSuccess2
For line "exec perl/RRP/RRP-1.30/JEDI/CommonReq/confAbvExp" in FILE-1
Output
Do nothing, as there is no line matching this in FILE-2
For line "perl/LRP/BaseLibs/close-MMM" in FILE-1
Output
exec perl/LRP/SetupReq/testird_LRP("LRP")
exec perl/BaseLibs/launch_client("LRP")
exec perl/LRP/LRP-classic-4.14/churrip/chorSingle
exec perl/LRP/BaseLibs/setupLRPMMMTab
exec perl/LRP/BaseLibs/launchMMM
exec perl/LRP/BaseLibs/launchLRPCHURRTA("TYRE")
#PAUSE Expand Churrip tree view & open all nodes
exec perl/LRP/LRP-classic-4.14/Corrugator/multipleSeriesWeb
exec perl/BaseLibs/ShutApp("Self Destruction System")
exec perl/LRP/BaseLibs/close-MMM
For line "exec perl/LRP/BaseLibs/launchLRPCHURRTA("TYRE")" in FILE-1
Output
Do nothing, as it would generate the same block as line "perl/LRP/BaseLibs/close-MMM" in FILE-1 did
For Line "this/or/that" in FILE-1
Output
Do nothing, as there is no line matching this in FILE-2
So my final output should be similar to the following (the order of the blocks doesn't matter):
exec 1.20/setup/testinit
exec 1.20/abc/this_is_test_1
exec 1.20/abc/this_is_test_1
exec perl/RRP/SetupReq/testdef_ijk
exec perl/RRP/RRP-1.30/JEDI/SetupReq/confAbvExp
exec perl/RRP/RRP-1.30/JEDI/JEDIExportSuccess2
exec perl/LRP/SetupReq/testird_LRP("LRP")
exec perl/BaseLibs/launch_client("LRP")
exec perl/LRP/LRP-classic-4.14/churrip/chorSingle
exec perl/LRP/BaseLibs/setupLRPMMMTab
exec perl/LRP/BaseLibs/launchMMM
exec perl/LRP/BaseLibs/launchLRPCHURRTA("TYRE")
#PAUSE Expand Churrip tree view & open all nodes
exec perl/LRP/LRP-classic-4.14/Corrugator/multipleSeriesWeb
exec perl/BaseLibs/ShutApp("Self Destruction System")
exec perl/LRP/BaseLibs/close-MMM
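For reference, the behaviour I am after could be sketched roughly as below: read File-1 into a lookup table, then walk File-2 once, collecting lines block by block and printing a block only if one of its lines is wanted and the same block text has not been printed before. This is only a sketch under assumptions, not a tested replacement for my script. The function name extract_blocks is made up, a line counts as a match when it equals a File-1 entry after the leading exec is stripped (my script uses a case-insensitive grep on the end of the line instead), and blocks come out in File-2 order.

```shell
#!/bin/sh
# extract_blocks LIST BATCH
# Hypothetical helper: LIST is File-1 (wanted tests), BATCH is File-2.
# Reads each file once and prints every matching block exactly once.
extract_blocks() {
    awk '
        # strip leading and trailing whitespace
        function trim(s) { sub(/^[ \t\r]+/, "", s); sub(/[ \t\r]+$/, "", s); return s }
        # comparison key: trimmed, without the leading exec, lower-cased
        function key(s) { s = trim(s); sub(/^[eE][xX][eE][cC][ \t]+/, "", s); return tolower(s) }
        # print the collected block if it matched and was not printed before
        function emit() {
            if (hit && !(block in seen)) { seen[block] = 1; printf "%s", block }
            block = ""; hit = 0
        }
        # first file: remember every wanted test
        FILENAME == ARGV[1] { k = key($0); if (k != "") wanted[k] = 1; next }
        # second file: skip blank lines, start a new block at each setup line
        { line = trim($0) }
        line == "" { next }
        line ~ /[Ss]etup.*[Tt]est/ || line ~ /perl\/[A-Za-z]*\/[Ss]etup[rR]eq/ { emit() }
        { block = block line "\n"; if (key(line) in wanted) hit = 1 }
        END { emit() }
    ' "$1" "$2"
}
```

It would be called as extract_blocks File-1 File-2 > FailedBatch.test. Keying the seen table on the whole block text means a block is printed once even when several File-1 lines point into it, and also when an identical block occurs more than once in File-2.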
It would be really great if anyone could give me some pointers on how to proceed. And yes, I forgot to mention: this is not a homework question :-) .
Many thanks