I've been looking at a lot of posts and haven't quite found what I'm looking for. I'm not sure how to go about taking the following sample data:
host1 input nic1 ip1 ip2 PROT 30000 10
host1 input nic1 ip1 ip2 PROT 40000 10
host1 input nic1 ip1 ip2 PROT 50000 10
host1 input nic1 ip1 ip2 PROT 60000 10
host1 input nic1 ip3 ip2 PROT 10 30000
host1 input nic1 ip3 ip2 PROT 10 40000
host1 input nic1 ip3 ip2 PROT 10 50000
host1 input nic1 ip3 ip2 PROT 10 60000
host1 output nic1 ip2 ip1 PROT 10 30000
host1 output nic1 ip2 ip1 PROT 10 40000
host1 output nic1 ip2 ip1 PROT 10 50000
host1 output nic1 ip2 ip1 PROT 10 60000
host1 output nic1 ip2 ip3 PROT 30000 10
host1 output nic1 ip2 ip3 PROT 40000 10
host1 output nic1 ip2 ip3 PROT 50000 10
host1 output nic1 ip2 ip3 PROT 60000 10
host1 output loc ip2 ip2 PROT 10 30000
host1 output loc ip2 ip2 PROT 10 50000
And merge it into:
host1 input nic1 ip1 ip2 PROT 30000:60000 10
host1 input nic1 ip3 ip2 PROT 10 30000:60000
host1 output nic1 ip2 ip1 PROT 10 30000:60000
host1 output nic1 ip2 ip3 PROT 30000:60000 10
host1 output loc ip2 ip2 PROT 10 30000:50000
I have a large amount of data like this with the need to make ranges for multiple fields of a given line but I think if somebody can show me how to do it for one field as I have above, I should be able to figure the rest out. And if not I'll follow up :). Thanks in advance for any help.
Update
I have refactored the code in the answer below so as to make it more readable. The main body should read almost English prose.
Original answer
The following
awk
script generates almost what you are looking for.Essentially the script does the following:
veryold
the first line of each set of lines that differ only for the 7th and/or 8th filedold
the last read lineb
is used to check when that last line is surpassedveryold
are joined with those ofold
with a:
in between if they are different, andold
is printed\t
is used between the last two fields to improve readabilityOther two points:
NR == 1
is a special case that has to initializeveryold
onlyEND
handles the special case of the last line stored inold