I need help extracting and counting matching values from a file.
I capture network traffic with the tcpdump command:
tcpdump -Xvv -i eth0 > captureFile.txt
Given any field of the IP, TCP, or Ethernet headers, I want to list every value that field takes in the captured traffic and count how many packets carry each value. For example, if TTL takes the values 128 and 64, the output should show how many packets have each of those TTL values.
The content of the file:
09:26:13.245546 IP (tos 0x0, ttl 1, id 3439, offset 0, flags [none], proto UDP (17), length 1018)
10.0.0.226.58935 > 239.255.255.250.3702: UDP, length 990
0x0000: 4500 03fa 0d6f 0000 0111 ada8 0a00 00e2 E....o..........
0x0010: efff fffa e637 0e76 03e6 7ec0 3c3f 786d .....7.v..~.<?xm
0x0020: 6c20 7665 7273 696f 6e3d 2231 2e30 2220 l.version="1.0".
0x0030: 656e 636f 6469 6e67 3d22 7574 662d 3822 encoding="utf-8"
0x0040: 3f3e 3c73 6f61 703a 456e 7665 ?><soap:Enve
09:26:13.339173 IP6 (hlim 1, next-header UDP (17) payload length: 998) fe80::21e9:f54b:9ae7:6383.58936 > ff02::c.3702: UDP, length 990
0x0000: 6000 0000 03e6 1101 fe80 0000 0000 0000 `...............
0x0010: 21e9 f54b 9ae7 6383 ff02 0000 0000 0000 !..K..c.........
0x0020: 0000 0000 0000 000c e638 0e76 03e6 666c .........8.v..fl
0x0030: 3c3f 786d 6c20 7665 7273 696f 6e3d 2231 <?xml.version="1
0x0040: 2e30 2220 656e 636f 6469 6e67 .0".encoding
09:26:13.407313 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.3.118 tell 10.0.1.215, length 46
0x0000: 0001 0800 0604 0001 0009 0fcb 0a0c 0a00 ................
0x0010: 01d7 0000 0000 0000 0a00 0376 0000 0000 ...........v....
0x0020: 0000 0000 0000 0000 0000 d9c4 62a8 ............b.
09:26:13.525954 IP (tos 0x0, ttl 128, id 3441, offset 0, flags [none], proto UDP (17), length 161)
10.0.0.226.59131 > 239.255.255.250.1900: UDP, length 133
0x0000: 4500 00a1 0d71 0000 0111 b0ff 0a00 00e2 E....q..........
0x0010: efff fffa e6fb 076c 008d 6fa6 4d2d 5345 .......l..o.M-SE
0x0020: 4152 4348 202a 2048 5454 502f 312e 310d ARCH.*.HTTP/1.1.
0x0030: 0a48 6f73 743a 3233 392e 3235 352e 3235 .Host:239.255.25
0x0040: 352e 3235 303a 3139 3030 0d0a 5.250:1900..
09:26:13.557002 IP (tos 0x0, ttl 1, id 3442, offset 0, flags [none], proto UDP (17), length 161)
10.0.0.226.59131 > 239.255.255.250.1900: UDP, length 133
0x0000: 4500 00a1 0d72 0000 0111 b0fe 0a00 00e2 E....r..........
0x0010: efff fffa e6fb 076c 008d 6fa6 4d2d 5345 .......l..o.M-SE
0x0020: 4152 4348 202a 2048 5454 502f 312e 310d ARCH.*.HTTP/1.1.
0x0030: 0a48 6f73 743a 3233 392e 3235 352e 3235 .Host:239.255.25
0x0040: 352e 3235 303a 3139 3030 0d0a 5.250:1900..
09:26:13.642734 IP (tos 0x0, ttl 1, id 21767, offset 0, flags [none], proto UDP (17), length 684)
10.0.0.237.58882 > 239.255.255.250.3702: UDP, length 656
0x0000: 4500 02ac 5507 0000 0111 6753 0a00 00ed E...U.....gS....
0x0010: efff fffa e602 0e76 0298 5568 3c3f 786d .......v..Uh<?xm
0x0020: 6c20 7665 7273 696f 6e3d 2231 2e30 2220 l.version="1.0".
0x0030: 656e 636f 6469 6e67 3d22 7574 662d 3822 encoding="utf-8"
0x0040: 3f3e 3c73 6f61 703a 456e 7665 ?><soap:Enve
09:26:13.642960 IP6 (hlim 1, next-header UDP (17) payload length: 664) fe80::b8a2:bd0:4e0b:1bb5.58883 > ff02::c.3702: UDP, length 656
0x0000: 6000 0000 0298 1101 fe80 0000 0000 0000 `...............
0x0010: b8a2 0bd0 4e0b 1bb5 ff02 0000 0000 0000 ....N...........
0x0020: 0000 0000 0000 000c e603 0e76 0298 248c ...........v..$.
0x0030: 3c3f 786d 6c20 7665 7273 696f 6e3d 2231 <?xml.version="
09:26:13.642999 IP (tos 0x0, ttl 64, id 21767, offset 0, flags [none], proto UDP (17), length 684)
10.0.0.237.58882 > 239.255.255.250.3702: UDP, length 656
0x0000: 4500 02ac 5507 0000 0111 6753 0a00 00ed E...U.....gS....
0x0010: efff fffa e602 0e76 0298 5568 3c3f 786d .......v..Uh<?xm
0x0020: 6c20 7665 7273 696f 6e3d 2231 2e30 2220 l.version="1.0".
0x0030: 656e 636f 6469 6e67 3d22 7574 662d 3822 encoding="utf-8"
0x0040: 3f3e 3c73 6f61 703a 456e 7665 ?><soap:Enve
The result must be:
ttl 64 - 1 time
ttl 128 - 1 time
ttl 1 - 3 times
A simple awk script does this in a single process, with no need to waste sub-processes.
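A minimal sketch of such a single-process awk script, assuming the `tcpdump -v` header format shown in the question. The helper name `count_ttl` and the temporary sample file are illustrative, not part of the original post; the sample lines are copied from the capture above. The trailing `sort` is only there to make the order predictable and can be dropped.

```shell
# Sample input: the IPv4 header lines from the question plus one hex-dump line.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
09:26:13.245546 IP (tos 0x0, ttl 1, id 3439, offset 0, flags [none], proto UDP (17), length 1018)
0x0000: 4500 03fa 0d6f 0000 0111 ada8 0a00 00e2 E....o..........
09:26:13.525954 IP (tos 0x0, ttl 128, id 3441, offset 0, flags [none], proto UDP (17), length 161)
09:26:13.557002 IP (tos 0x0, ttl 1, id 3442, offset 0, flags [none], proto UDP (17), length 161)
09:26:13.642734 IP (tos 0x0, ttl 1, id 21767, offset 0, flags [none], proto UDP (17), length 684)
09:26:13.642999 IP (tos 0x0, ttl 64, id 21767, offset 0, flags [none], proto UDP (17), length 684)
EOF

# count_ttl FILE: hypothetical helper that tallies every "ttl N" pair.
count_ttl() {
  awk '
    $1 ~ /^0x/ { next }              # skip hex-dump lines
    {
      for (i = 1; i < NF; i++) {
        if ($i == "ttl") {
          v = $(i + 1)
          sub(/,$/, "", v)           # drop the trailing comma
          count[v]++
        }
      }
    }
    END {
      for (v in count) print "ttl", v, "-", count[v]
    }
  ' "$1" | sort -k2,2n               # awk array order is unspecified
}

count_ttl "$sample"
```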
I think the output would match what you expect. Well, not exactly, since I didn't distinguish "time" from "times". Do you really need that? It could be done easily.
EDIT
As the OP asks, the output now prints "time" or "times" depending on the count.
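A sketch of how the "time"/"times" distinction could be handled in awk, again assuming the header format from the question. Taking the field name as a parameter is an assumption added here to cover the "any field" part of the question; `count_field` is a hypothetical name and the sample lines are copied from the capture above.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
09:26:13.245546 IP (tos 0x0, ttl 1, id 3439, offset 0, flags [none], proto UDP (17), length 1018)
09:26:13.525954 IP (tos 0x0, ttl 128, id 3441, offset 0, flags [none], proto UDP (17), length 161)
09:26:13.557002 IP (tos 0x0, ttl 1, id 3442, offset 0, flags [none], proto UDP (17), length 161)
09:26:13.642734 IP (tos 0x0, ttl 1, id 21767, offset 0, flags [none], proto UDP (17), length 684)
09:26:13.642999 IP (tos 0x0, ttl 64, id 21767, offset 0, flags [none], proto UDP (17), length 684)
EOF

# count_field NAME FILE: hypothetical helper; counts the token that follows NAME.
count_field() {
  awk -v field="$1" '
    $1 ~ /^0x/ { next }                    # skip hex-dump lines
    {
      for (i = 1; i < NF; i++) {
        name = $i
        sub(/^\(/, "", name)               # "(tos" -> "tos"
        if (name == field) {
          v = $(i + 1)
          sub(/[,)]+$/, "", v)             # drop trailing "," or ")"
          count[v]++
        }
      }
    }
    END {
      for (v in count)
        print field, v, "-", count[v], (count[v] == 1 ? "time" : "times")
    }
  ' "$2" | sort -k2,2n
}

count_field ttl "$sample"
```

Calling `count_field tos "$sample"` tallies the `tos` values the same way.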
It's a bit long, and I'm sure it can be refactored quite a lot, but it works if you don't or can't have perl installed:
grep ttl captureFile.txt | awk '{print $5,$6}' | sed 's/,//' | sort | uniq -c | awk '{print $2,$3,"-",$1,"times"}'
The first part of the pipeline gets only the relevant parts of the text file.
The second part produces the formatting you wanted.
Two approaches:
If you have perl, this should do it. But I think uniq -c may also work with grep...
And to get the exact output format you asked for, just add a final formatting step after uniq -c.
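A sketch of the grep plus `uniq -c` approach with a formatting step appended after `uniq -c`. It assumes a grep that supports `-o` (GNU and BSD grep do); the helper name `ttl_report` is illustrative, and the sample lines are copied from the capture in the question.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
09:26:13.245546 IP (tos 0x0, ttl 1, id 3439, offset 0, flags [none], proto UDP (17), length 1018)
09:26:13.525954 IP (tos 0x0, ttl 128, id 3441, offset 0, flags [none], proto UDP (17), length 161)
09:26:13.557002 IP (tos 0x0, ttl 1, id 3442, offset 0, flags [none], proto UDP (17), length 161)
09:26:13.642734 IP (tos 0x0, ttl 1, id 21767, offset 0, flags [none], proto UDP (17), length 684)
09:26:13.642999 IP (tos 0x0, ttl 64, id 21767, offset 0, flags [none], proto UDP (17), length 684)
EOF

# ttl_report FILE: hypothetical helper.
ttl_report() {
  grep -o 'ttl [0-9][0-9]*' "$1" |   # keep only the "ttl N" pairs
    sort |                           # uniq -c needs adjacent duplicates
    uniq -c |                        # prefix each distinct pair with its count
    awk '{ print $2, $3, "-", $1, ($1 == 1 ? "time" : "times") }'
}

ttl_report "$sample"
```

Because `sort` here is lexical, the values come out in string order rather than numeric order; adding `-k2,2n` to `sort` would order them numerically.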