How to parse a text file for string pattern and co

Posted 2019-08-01 01:04

I have a log file that contains login data, and I need to generate a report that summarizes all of the failed login attempts, organized by user. A line from the file looks like:

Jan 21 19:22:23 localhost sshd[1234]: Failed password for USER from 127.0.0.1 port 12345 ssh2  #IPs and such obscured, obviously

It's the USER from that line that I need to count and summarize. The pattern is always Failed password for USER, which helps, but I can't use awk -F or other simple string splitting because of the amount of other junk on the line.

How can I count each failed login and total them up per user?

3 answers
Root(大扎)
#2 · 2019-08-01 01:23

The following awk command should do what you need.

awk '/Failed password for/{gsub(/.*for | from.*/,"");a[$0]++} END{for(i in a){print i,a[i]}}'  Input_file

Here is the same solution in multi-line form:

awk '
/Failed password for/{
  gsub(/.*for | from.*/,"");
  a[$0]++
}
END{
  for(i in a){
    print i,a[i]
  }
}
' Input_file
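
As a quick way to try the command, here is a minimal sketch that wraps it in a shell function and feeds it a few made-up log lines. The names count_failed_awk and sample_log, and the users alice and bob, are assumptions for illustration only. The trailing sort is an addition: awk does not guarantee the order of a for (i in a) loop, so sorting makes the report stable.

```shell
#!/bin/sh
# Made-up sample lines in the format shown in the question (not real log data).
sample_log='Jan 21 19:22:23 localhost sshd[1234]: Failed password for alice from 127.0.0.1 port 12345 ssh2
Jan 21 19:22:25 localhost sshd[1234]: Accepted password for bob from 127.0.0.1 port 12346 ssh2
Jan 21 19:22:27 localhost sshd[1235]: Failed password for bob from 127.0.0.1 port 12347 ssh2
Jan 21 19:22:30 localhost sshd[1236]: Failed password for alice from 127.0.0.1 port 12348 ssh2'

# Hypothetical helper: reads log lines on stdin, prints "user count" per line.
count_failed_awk() {
  awk '
    /Failed password for/ {
      gsub(/.*for | from.*/, "")   # keep only the text between "for " and " from"
      a[$0]++                      # one counter per user name
    }
    END {
      for (i in a) print i, a[i]
    }
  ' | sort   # for-in order is unspecified in awk, so sort for stable output
}

printf '%s\n' "$sample_log" | count_failed_awk
```

With the sample lines above, this prints alice 2 and bob 1 on separate lines; the Accepted line is ignored because it does not match the pattern.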
我命由我不由天
#3 · 2019-08-01 01:23

Here's a Perl solution (it reads the log from standard input, or from a file name given as an argument):

perl -nle '$seen{$1}++ if /Failed password for (\S+) from /; END { print "$_: $seen{$_}" for sort keys %seen }'

The idea is to use a regex to extract the username from matching lines, use that to build a histogram in a hash (mapping usernames to counts), and print it all out at the end.

SAY GOODBYE
#4 · 2019-08-01 01:43

With GNU grep, try this:

grep -Po "Failed password for \K.*?(?= from)" logfile.log | sort | uniq -c

-P enables Perl-compatible regular expressions (PCRE), which support constructs such as \K and lookaheads.
-o prints only the matched part instead of the whole line that contains the match.
\K discards everything matched so far from the reported match, so that prefix won't appear in the output.
.*? matches USER non-greedily. Only this part will be printed.
(?= from) is a lookahead that marks where USER ends without becoming part of the match.

The grep part prints USER once for every failed login attempt by that user. All that remains is to count the occurrences for each user, which is what the idiom sort | uniq -c does.

The final output looks like this:

      7 adam
      2 bob
     14 claire

The output is sorted by user names. To sort by the number of failed attempts, append | sort -nr to the command.
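
Putting the pieces together, the following sketch runs the full pipeline, including the optional sort -nr, on a few invented log lines. It assumes GNU grep built with -P support; the names failed_by_count and sample_log, and the users alice and bob, are hypothetical.

```shell
#!/bin/sh
# Made-up sample lines in the format shown in the question (not real log data).
sample_log='Jan 21 19:22:23 localhost sshd[1234]: Failed password for bob from 127.0.0.1 port 12345 ssh2
Jan 21 19:22:25 localhost sshd[1234]: Accepted password for alice from 127.0.0.1 port 12346 ssh2
Jan 21 19:22:27 localhost sshd[1235]: Failed password for alice from 127.0.0.1 port 12347 ssh2
Jan 21 19:22:30 localhost sshd[1236]: Failed password for bob from 127.0.0.1 port 12348 ssh2'

# Hypothetical helper: reads log lines on stdin and prints "count user",
# with the most frequently failing user first.
failed_by_count() {
  grep -Po "Failed password for \K.*?(?= from)" |  # extract USER only
    sort |                                         # group identical names
    uniq -c |                                      # count each group
    sort -nr                                       # highest count first
}

printf '%s\n' "$sample_log" | failed_by_count
```

With these sample lines, bob (2 failed attempts) is listed before alice (1), whereas without sort -nr alice would come first alphabetically.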
