I have two files: one is named NATLog and has 3 columns, the other is named Sourceports and has 2 columns. Below is a sample of the NATLog file.
NATLog
14 172.18.2.12 445
50 172.18.24.4 123
80 10.2.123.37 22
68 172.18.1.37 25
I want to match the last column of the NATLog file with the first column of the Sourceports file and append the associated service to the NATLog file as a 4th column.
Sourceports
445 SMB
123 Network Time Protocol (NTP)
22 SSH
25 SMTP(Insecure)
Desired Output
14 172.18.2.12 445 SMB
50 172.18.24.4 123 Network Time Protocol (NTP)
80 10.2.123.37 22 SSH
68 172.18.1.37 25 SMTP(Insecure)
I am trying to learn AWK to accomplish this, but I need some help. Could you please assist me? Thanks.
This is why Linux has a bunch of tiny tools such as `cat`, `cut`, `paste` and, in this case, `join`.
`join` works on files where the column you try to join on is sorted. Sorted is actually a somewhat wrong wording here; it should be more like equivalently ordered. As you notice, both your files list the same ports in the same order, so the columns you try to join on are equivalent and `join` will work without a problem.
If both files are not equivalently ordered, you could use `sort` on them beforehand:
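A minimal sketch of the sort-then-join approach, assuming the sample files from the question. The scratch directory and the `.sorted` file names are made up for illustration. Note that `join` prints the join field first, so the port moves to the front of each line.

```shell
# Recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/NATLog" <<'EOF'
14 172.18.2.12 445
50 172.18.24.4 123
80 10.2.123.37 22
68 172.18.1.37 25
EOF
cat > "$dir/Sourceports" <<'EOF'
445 SMB
123 Network Time Protocol (NTP)
22 SSH
25 SMTP(Insecure)
EOF

# Sort each file on its join column (field 3 of NATLog, field 1 of Sourceports).
# LC_ALL=C keeps sort and join in agreement about the ordering.
LC_ALL=C sort -k3,3 "$dir/NATLog" > "$dir/NATLog.sorted"
LC_ALL=C sort -k1,1 "$dir/Sourceports" > "$dir/Sourceports.sorted"

# Join field 3 of the first file with field 1 of the second.
joined=$(LC_ALL=C join -1 3 -2 1 "$dir/NATLog.sorted" "$dir/Sourceports.sorted")
printf '%s\n' "$joined"
```

The result is ordered by port as text (123, 22, 25, 445), not in the original NATLog order.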
Or, if you just want to stick to a single program, then `awk` is the way forward:
But if `natlog` and `source` don't have the same number of lines and/or keys, then you get the common part as:
Try awk,
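As a hedged sketch of an awk lookup that prints only the common part: the program below reads Sourceports first, remembers the service text per port, and then prints only those NATLog lines whose port is known. The array name `service` and the extra NATLog line with port 8080 are assumptions added for illustration.

```shell
# Sample files from the question, plus one hypothetical NATLog line (port 8080)
# that has no entry in Sourceports.
dir=$(mktemp -d)
cat > "$dir/NATLog" <<'EOF'
14 172.18.2.12 445
50 172.18.24.4 123
80 10.2.123.37 22
68 172.18.1.37 25
77 172.18.3.3 8080
EOF
cat > "$dir/Sourceports" <<'EOF'
445 SMB
123 Network Time Protocol (NTP)
22 SSH
25 SMTP(Insecure)
EOF

matched=$(awk '
    # First file (Sourceports): remember the service text for each port.
    NR == FNR {
        port = $1
        sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # drop the port, keep the rest of the line
        service[port] = $0
        next
    }
    # Second file (NATLog): print only lines whose port has a known service.
    $3 in service { print $0, service[$3] }
' "$dir/Sourceports" "$dir/NATLog")
printf '%s\n' "$matched"
```

Because the lookup is keyed on the port, the two files do not need to be in the same order or have the same number of lines.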
Yet another in awk (well, two actually). This is for the perfect world:
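A compact sketch for the perfect-world case, where every port in NATLog has exactly one entry in Sourceports. This is one possible form, not necessarily the one the answer used; the array name `map` is an assumption. It stores each whole Sourceports line under its port and then replaces the port field of NATLog with that line.

```shell
# Sample files from the question.
dir=$(mktemp -d)
cat > "$dir/NATLog" <<'EOF'
14 172.18.2.12 445
50 172.18.24.4 123
80 10.2.123.37 22
68 172.18.1.37 25
EOF
cat > "$dir/Sourceports" <<'EOF'
445 SMB
123 Network Time Protocol (NTP)
22 SSH
25 SMTP(Insecure)
EOF

# map[port] holds "port service"; assigning it to $3 rebuilds the record,
# and the trailing 1 prints every NATLog line.
result=$(awk 'NR == FNR { map[$1] = $0; next } { $3 = map[$3] } 1' \
    "$dir/Sourceports" "$dir/NATLog")
printf '%s\n' "$result"
```

If a port were missing from Sourceports, the third field would simply be emptied, which is why this form only suits clean input.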
Explained (and a bit expanded for an imperfect world):
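A commented sketch of how such an expanded version might look, under stated assumptions: the input below is hypothetical (a shorter IP, an `&` in one service name, and a port 222 with no entry), and leaving unmatched lines unchanged is a choice made here, not something the answer confirms. Plain field assignment is used rather than `sub()`, so an `&` in a service name is copied literally (in a `sub()` replacement string, `&` stands for the matched text).

```shell
# Hypothetical "imperfect" input for illustration.
dir=$(mktemp -d)
cat > "$dir/NATLog" <<'EOF'
14 172.18.2.12 445
50 1.2.3.4 123
80 10.2.123.37 22
68 172.18.1.37 222
EOF
cat > "$dir/Sourceports" <<'EOF'
445 SMB
123 Network Time Protocol (NTP)
22 SSH & SFTP
25 SMTP(Insecure)
EOF

expanded=$(awk '
    # While reading the first file, NR (total lines) equals FNR (lines in this file).
    NR == FNR {
        entry[$1] = $0      # port -> whole "port service" line
        next                # skip the rules meant for the second file
    }
    # Only touch the record when the port is known; otherwise leave it alone.
    ($3 in entry) { $3 = entry[$3] }
    # Print every NATLog line, changed or not.
    { print }
' "$dir/Sourceports" "$dir/NATLog")
printf '%s\n' "$expanded"
```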
One possible output (shorter ip, & char in source and unmatched port 222):
If your goal is the output formatting shown, with the protocol column appended in an aligned fashion, then `printf` instead of `print` provides the same fine-grained formatting control described in `man 3 printf` (for the most part). In your case you simply need to get the `length()` of the port number field and subtract that from the desired total field-width, then add that many spaces after the record from `NATLog` before appending the saved protocol from `Sourceports`.
You could do that similar to the following, where a total field-width of 4 is used as an example:
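A sketch of that padding calculation, assuming the sample files from the question. The variable names `width`, `pad`, `fmt` and `proto` are illustrative, and the rule of always leaving at least one space is an added assumption for ports wider than the field-width.

```shell
# Sample files from the question.
dir=$(mktemp -d)
cat > "$dir/NATLog" <<'EOF'
14 172.18.2.12 445
50 172.18.24.4 123
80 10.2.123.37 22
68 172.18.1.37 25
EOF
cat > "$dir/Sourceports" <<'EOF'
445 SMB
123 Network Time Protocol (NTP)
22 SSH
25 SMTP(Insecure)
EOF

aligned=$(awk -v width=4 '
    # Sourceports: save the protocol text under its port number.
    NR == FNR {
        port = $1
        sub(/^[ \t]*[^ \t]+[ \t]+/, "")
        proto[port] = $0
        next
    }
    # NATLog: pad after the record so the port field occupies "width" columns.
    {
        pad = width - length($3)
        if (pad < 1) pad = 1            # assumption: keep at least one space
        fmt = "%s%" pad "s%s\n"         # e.g. "%s%2s%s\n" for a two-digit port
        printf fmt, $0, "", proto[$3]
    }
' "$dir/Sourceports" "$dir/NATLog")
printf '%s\n' "$aligned"
```

With the sample data, port 445 is followed by one space and port 22 by two, so the protocol names start in the same column.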
Output
(Note: your `Sourceports` cannot contain additional whitespace at the end of the records. If it does, then you will have to replace `$0` with individual `$1,$2,$3` and adjust the format-string accordingly.)
There are usually many ways to accomplish the same thing in `awk`, so you can tailor it to meet whatever need you have.
Using `paste` and `awk`
A shorter, but less efficient way would be to use both `paste` and `awk` to achieve the same thing, basically just outputting the first two fields of `NATLog` and appending the contents of `Sourceports` with `paste` (but that would really defeat the purpose of learning `awk`).
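A minimal sketch of the `paste` variant, assuming the sample files from the question and that both files list the ports in the same line order (nothing is looked up here, lines are simply glued together by position):

```shell
# Sample files from the question.
dir=$(mktemp -d)
cat > "$dir/NATLog" <<'EOF'
14 172.18.2.12 445
50 172.18.24.4 123
80 10.2.123.37 22
68 172.18.1.37 25
EOF
cat > "$dir/Sourceports" <<'EOF'
445 SMB
123 Network Time Protocol (NTP)
22 SSH
25 SMTP(Insecure)
EOF

# awk emits the first two NATLog fields; paste reads them on stdin ("-")
# and appends the matching Sourceports line, separated by a single space.
combined=$(awk '{ print $1, $2 }' "$dir/NATLog" | paste -d ' ' - "$dir/Sourceports")
printf '%s\n' "$combined"
```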