This is linked to another question/code-golf I asked on Code Golf: "Color highlighting" of repeated text.
I've got a file 'sample1.txt' with the following content:
LoremIpsumissimplydummytextoftheprintingandtypesettingindustry.LoremIpsumhasbeentheindustry'sstandarddummytexteversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook.
I've got a script generating the following array of strings which occur in the file (only a few shown for illustration):
LoremIpsum
LoremIpsu
dummytext
oremIpsum
LoremIps
dummytex
industry
oremIpsu
remIpsum
ummytext
LoremIp
dummyte
emIpsum
industr
mmytext
Starting from the top of this list, I need to see if 'LoremIpsum' occurs in the file sample1.txt. If so, I want to replace all occurrences of LoremIpsum with <T1>LoremIpsum</T1>. Now, when the program moves to the next word, 'LoremIpsu', it should NOT match against the <T1>LoremIpsum</T1> text inside sample1.txt. It should repeat the above for all elements of this 'array'. The next 'valid' one would be 'dummytext', and that should be tagged as <T2>dummytext</T2>.
I do think it should be possible to create a bash shell script solution for this rather than relying on perl/python/ruby programs.
Straightforward with Perl:
Sample run (wrapped to prevent horizontal scrolling):
You may object that all the qr// and @{[ ... ]} business is on the baroque side. One could get the same effect with the /o regular-expression switch, as in:
Pure Bash (no externals)
At the Bash command line:
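One way to approach this with Bash builtins only, offered as a minimal sketch and not necessarily the commands originally shown here, is to keep the text as a list of segments, each marked as tagged or untagged. Every candidate string is matched against untagged segments only, so a shorter string such as LoremIpsu can never match inside an existing <T1>LoremIpsum</T1>. The function name tag_repeats, the inline sample text, and the file name words.txt mentioned in the comments are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# tag_repeats TEXT WORD...
# Print TEXT with each WORD that still occurs in untagged text wrapped in
# <Tn>...</Tn>. n only advances for words that actually match.
# (Hypothetical helper name; uses Bash arrays, so it needs bash, not plain sh.)
tag_repeats() {
  local text=$1; shift
  local -a segs=("$text") frozen=(0)   # frozen[i]=1 marks an already-tagged segment
  local -a nsegs nfrozen
  local word rest i hit n=0

  for word in "$@"; do
    [[ -n $word ]] || continue
    # Does the word occur in any untagged segment?
    hit=0
    for i in "${!segs[@]}"; do
      if [[ ${frozen[i]} -eq 0 && ${segs[i]} == *"$word"* ]]; then
        hit=1
        break
      fi
    done
    (( hit )) || continue
    n=$(( n + 1 ))

    # Rebuild the segment list, splitting untagged segments around the word.
    nsegs=()
    nfrozen=()
    for i in "${!segs[@]}"; do
      if (( frozen[i] )); then
        nsegs+=("${segs[i]}"); nfrozen+=(1)
        continue
      fi
      rest=${segs[i]}
      while [[ $rest == *"$word"* ]]; do
        nsegs+=("${rest%%"$word"*}"); nfrozen+=(0)   # text before the first hit
        nsegs+=("<T$n>$word</T$n>");  nfrozen+=(1)   # the hit itself, now frozen
        rest=${rest#*"$word"}                        # continue after the hit
      done
      nsegs+=("$rest"); nfrozen+=(0)
    done
    segs=("${nsegs[@]}")
    frozen=("${nfrozen[@]}")
  done

  local IFS=                      # empty IFS: join segments with no separator
  printf '%s\n' "${segs[*]}"
}

# Demo on a shortened, made-up sample:
tag_repeats 'LoremIpsumisdummytext.LoremIpsumhasdummytext' \
  LoremIpsum LoremIpsu dummytext oremIpsum
# prints <T1>LoremIpsum</T1>is<T2>dummytext</T2>.<T1>LoremIpsum</T1>has<T2>dummytext</T2>

# For the real file, assuming the candidate strings are stored one per line
# in a hypothetical words.txt (mapfile needs Bash 4 or later):
#   mapfile -t words < words.txt
#   tag_repeats "$(<sample1.txt)" "${words[@]}"
```

Keeping tagged pieces as separate, frozen segments avoids having to write a pattern that excludes text already inside a tag. The cost is that the segment list is rebuilt once per matching word, which is fine for small inputs but likely slow for very long candidate lists.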