Building a connection log system

Posted 2019-05-19 05:10

I'm building a "smart" log system that monitors customer connections, specifically the times at which each connection to the server starts and stops.

RAW LOG:

Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: terminating... - peer is not responding
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info,account customer1 logged out, 4486 1009521 23444247 12573 18159
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: disconnected
Dec 19 00:00:07 172.16.20.24 pppoe,info PPPoE connection established from 60:E3:27:A2:60:09
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info,account customer2 logged in, 10.171.3.185
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 00:00:13 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 00:00:14 172.16.20.24 pppoe,ppp,error <ccfa>: user customer3 authentication failed
Dec 19 00:00:32 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 00:00:36 172.16.20.24 pppoe,ppp,error <ccfb>: user customer3 authentication failed
Dec 19 00:01:06 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41

What matters to me: capturing the lines that contain the connected and disconnected strings.

I got this:

import os
import re
import sys

f = open('log.log','r')
log = []
for line in f:
 if re.search(r': connected|: disconnected',line):
  ob = dict()
  ob['USER'] = re.search(r'<pppoe(.*?)>',line).group(0).replace("<pppoe-","").replace(">","")
  ob['DATA'] = re.search(r'^\w{3} \d{2} \d{2}:\d{2}:\d{2}',line).group(0)
  ob['CONNECTION'] = re.search(r': .*',line).group(0).replace(": ", "")
  log.append(ob)

I'm still learning, so these aren't the most brilliant regexes, but they work. Now I need to refine this log list into something like this sample:

{"connection" : [{
"start" : "Dec 19 10:12:58",
"username" : "customer2"}]}

{"connection" : [{
"start" : "Dec 20 10:12:58",
"username" : "customer1"}]}

{"connection" : [{
"start" : "Dec 19 10:12:58",
"stop" : "Dec 22 10:04:35",
"username" : "customer4"}]}

{"connection" : [{
"start" : "Dec 19 10:12:58",
"stop" : "Dec 24 10:04:35",
"username" : "customer3"}]}

My obstacles:

  • The raw log is constantly being generated, so I need to check whether a user already exists. If so, I update the connection (customer2 drops his connection, and I need to register that). But what happens if he keeps dropping connections?

For example:

Dec 19 10:20:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected    
Dec 19 01:00:36 172.16.20.24 pppoe,ppp,error <ccfb>: user customer3 authentication failed
Dec 19 01:01:06 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 10:21:38 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 10:21:48 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 10:22:38 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: authenticated
Dec 19 10:22:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected  

The first disconnection is simple to add:

{"connection" : [{
"start" : "Dec 19 10:12:58",
"stop" : "Dec 19 10:20:58",
"username" : "customer2"}]}

At the next connection, I need to find this specific user, insert the new "start" time, and erase "stop". And so on.

{"connection" : [{
"start" : "Dec 19 10:21:48",
"username" : "customer2"}]}

  • The next challenge for me is creating this new, refined list.

I tried this, but it does not work:

conn = []
for l in log:
 obcon = dict()
 if not obcon:
    obcon['USER'] = l['USER']
    if l['DATA'] == 'connected':
        obcon['START'] = l['DATA']      
        obcon['STOP'] = ""
    else:
        obcon['STOP'] = l['DATA']
 conn.append(obcon)

Before building the new list, I need to check whether the user already exists; if not, I create it. I use ['CONNECTION'] to identify connection starts and stops:

Disconnected -> STOP
Connected -> START

I don't know whether I need to be more specific. I need ideas, please!

1 answer
做个烂人
Reply #2 · 2019-05-19 05:36

In my opinion, the variable log should be a dict, as that makes it easier to find an existing user's data.
Next, you used re.search(...).group(0) everywhere, which returns the entire matched string. For example, when extracting the user name you wrote '<pppoe(.*?)>', but the captured text is in group(1) (in a regex, parentheses capture the part of the match you want to extract).
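
To make the difference between the two groups concrete, here is a minimal sketch run against one of the lines from the question's log. The variable names are only illustrative.

```python
import re

# One line taken from the raw log in the question.
line = "Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected"

match = re.search(r'<pppoe-(.*?)>', line)
whole = match.group(0)  # the entire matched text, delimiters included
user = match.group(1)   # only what the first pair of parentheses captured

print(whole)  # <pppoe-customer2>
print(user)   # customer2
```

With group(1) there is no need for the chained replace() calls to strip "<pppoe-" and ">".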

My suggestion is as follows (note: I removed the imports of sys and os, as they are not used):

import re

log = dict()
with open('log.log', 'r') as f:  # the with block closes the file when done
    for line in f:
        reg = re.search(r': ((?:dis)?connected)', line)  # finds connected or disconnected
        if reg is not None:
            user = re.search(r'<pppoe-(.*?)>', line).group(1)
            # if the user is already in the log, get its entry; else create one holding only 'USER'
            ob = log.setdefault(user, {'USER': user})
            ob['CONNECTION'] = reg.group(1)
            time = re.search(r'^\w{3} \d{2} \d{2}:\d{2}:\d{2}', line).group(0)
            if ob['CONNECTION'].startswith('dis'):
                ob['END'] = time
            else:
                ob['START'] = time
                # a new connection makes any previous END stale, so drop it
                ob.pop('END', None)

If the log file is:

Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: terminating... - peer is not responding
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info,account customer1 logged out, 4486 1009521 23444247 12573 18159
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: disconnected
Dec 19 00:00:07 172.16.20.24 pppoe,info PPPoE connection established from 00:00:00:00:00:00
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info,account customer2 logged in, 127.0.0.1
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 00:00:13 172.16.20.24 pppoe,info PPPoE connection established from 00:00:00:00:00:00
Dec 19 00:00:14 172.16.20.24 pppoe,ppp,error <ccfa>: user customer3 authentication failed
Dec 19 00:02:03 172.16.20.24 pppoe,ppp,info,account customer2 logged out, 4486 1009521 23444247 12573 18159
Dec 19 00:02:03 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected
Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info,account customer3 logged in, 127.0.0.1
Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: authenticated
Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: connected

the value of log will be:

{
    'customer1': {
        'CONNECTION': 'disconnected',
        'END': 'Dec 19 00:00:03',
        'USER': 'customer1'
    }, 
    'customer3': {
        'START': 'Dec 19 00:02:08',
        'CONNECTION': 'connected',
        'USER': 'customer3'
    }, 
    'customer2': {
        'START': 'Dec 19 00:00:08',
        'CONNECTION': 'disconnected',
        'END': 'Dec 19 00:02:03', 
        'USER': 'customer2'
    }
}
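
The dict above keeps only the latest state per user, so a customer who drops repeatedly overwrites his own history. If every drop should be kept, one possible variation is to store a list of sessions per user. The following is only a sketch under assumptions: the EVENT pattern, the collect_sessions function, and the lowercase 'start'/'stop' keys are names made up for this example, and the sample lines are the ones from the question's second log excerpt.

```python
import re
from collections import defaultdict

# Hypothetical pattern: timestamp, user name, and the connected/disconnected event.
EVENT = re.compile(
    r'^(?P<time>\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*'
    r'<pppoe-(?P<user>.*?)>: (?P<event>(?:dis)?connected)\s*$'
)

def collect_sessions(lines):
    """Return {user: [{'start': ..., 'stop': ...}, ...]} in log order."""
    sessions = defaultdict(list)
    for line in lines:
        m = EVENT.search(line)
        if m is None:
            continue  # not a connected/disconnected line
        history = sessions[m.group('user')]
        if m.group('event') == 'connected':
            # every connect opens a new session
            history.append({'start': m.group('time')})
        elif history and 'stop' not in history[-1]:
            # a disconnect closes the most recent open session
            history[-1]['stop'] = m.group('time')
        else:
            # a disconnect whose start is not in the log seen so far
            history.append({'stop': m.group('time')})
    return dict(sessions)

sample = [
    "Dec 19 10:20:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected    ",
    "Dec 19 01:00:36 172.16.20.24 pppoe,ppp,error <ccfb>: user customer3 authentication failed",
    "Dec 19 10:21:38 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated",
    "Dec 19 10:21:48 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected",
    "Dec 19 10:22:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected  ",
]
result = collect_sessions(sample)
print(result)
```

Because collect_sessions accepts any iterable of lines, an open file object can be passed to it directly. A log that keeps growing would additionally need the sessions dict to be kept between runs, which this sketch does not attempt.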