My application generates logs and sends them to syslog-ng. I want to write a custom template/parser/filter for use in syslog-ng to correctly store the fields in tables of an SQLite database (MyDatabase).
This is the legend of my log:
unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit
All these 12 fields are tab separated, and the parser must store them into 12 columns of table MyTable1 in MyDatabase. Some of the fields: the 6th, 9th, and 10th however also contain "sub-fields" as comma-separated values. The number of values within each of these sub-fields, is variable, and can change in each line of log.
I need these fields to be stored in respective separate tables MyItem_type, MyVendor_groups, MyReqs
These "secondary" tables have 3 columns, record the Unique-Record-ID, and Quantity against each of their occurence in the log So the schema in MyItem_type table looks like:
Unique-Record-ID | item_profile | Quantity
Similarly the schema of MyVendor_groups looks like:
Unique-Record-ID | vendor_category | Quantity
and the schema of MyReqs looks like:
Unique-Record-ID | req_type | Quantity
Consider these sample lines from the log:
unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit
234.44.tfhj Sam 22-03-2016 22 prod1 cat1,cat22,cat36,cat44 66 ven1 t1,t33,t43,t49 req1,req2,req3,req4 blue 64.22
234.45.tfhj Alex 23-03-2016 100 prod2 cat10,cat36,cat42 104 ven1 t22,t45 req1,req2,req33,req5 red 66
234.44.tfhj Vikas 24-03-2016 88 prod1 cat101,cat316,cat43 22 ven2 t22,t43 req1,req23,req3,req6 red 77.12
234.47.tfhj Jane 25-03-2016 22 prod7 cat10,cat36,cat44 43 ven3 t77 req1,req24,req3,req7 green 45.89
234.48.tfhj John 26-03-2016 97 serv3 cat101,cat36,cat45 69 ven5 t1 req11,req2,req3,req8 orange 33.04
234.49.tfhj Ruby 27-03-2016 85 prod58 cat10,cat38,cat46 88 ven9 t33,t55,t99 req1,req24,req3,req9 white 46.04
234.50.tfhj Ahmed 28-03-2016 44 serv7 cat110,cat36,cat47 34 ven11 t22,t43,t77 req1,req20,req3,req10 red 43
My parser should store the above log into MyDatabase.Mytable1 as:
unique-record-id | usename | date | Quantity | BOQ | item_profile | Count | Vendor | vendor_category | req_type | vendor_code | credit
234.44.tfhj | Sam | 22-03-2016 | 22 | prod1 | cat1,cat22,cat36,cat44 | 66 | ven1 | t1,t33,t43,t49 | req1,req2,req3,req4 | blue | 64.22
234.45.tfhj | Alex | 23-03-2016 | 100 | prod2 | cat10,cat36,cat42 | 104 | ven1 | t22,t45 | req1,req2,req33,req5 | red | 66
234.44.tfhj | Vikas | 24-03-2016 | 88 | prod1 | cat101,cat316,cat43 | 22 | ven2 | t22,t43 | req1,req23,req3,req6 | red | 77.12
234.47.tfhj | Jane | 25-03-2016 | 22 | prod7 | cat10,cat36,cat44 | 43 | ven3 | t77 | req1,req24,req3,req7 | green | 45.89
234.48.tfhj | John | 26-03-2016 | 97 | serv3 | cat101,cat36,cat45 | 69 | ven5 | t1 | req11,req2,req3,req8 | orange | 33.04
234.49.tfhj | Ruby | 27-03-2016 | 85 | prod58 | cat10,cat38,cat46 | 88 | ven9 | t33,t55,t99 | req1,req24,req3,req9 | white | 46.04
234.50.tfhj | Ahmed | 28-03-2016 | 44 | serv7 | cat110,cat36,cat47 | 34 | ven11 | t22,t43,t77 | req1,req20,req3,req10 | red | 43
And also parse the "possible,item,profiles" to record into MyDatabase.MyItem_type as:
Unique-Record-ID | item_profile | Quantity
234.44.tfhj | cat1 | 22
234.44.tfhj | cat22 | 22
234.44.tfhj | cat36 | 22
234.44.tfhj | cat44 | 22
234.45.tfhj | cat10 | 100
234.45.tfhj | cat36 | 100
234.45.tfhj | cat42 | 100
234.44.tfhj | cat101 | 88
234.44.tfhj | cat316 | 88
234.44.tfhj | cat43 | 88
234.47.tfhj | cat10 | 22
234.47.tfhj | cat36 | 22
234.47.tfhj | cat44 | 22
234.48.tfhj | cat101 | 97
234.48.tfhj | cat36 | 97
234.48.tfhj | cat45 | 97
234.48.tfhj | cat101 | 97
234.48.tfhj | cat36 | 97
234.48.tfhj | cat45 | 97
234.49.tfhj | cat10 | 85
234.49.tfhj | cat38 | 85
234.49.tfhj | cat46 | 85
234.50.tfhj | cat110 | 44
234.50.tfhj | cat36 | 44
234.50.tfhj | cat47 | 44
We also need to similarly parse "applicable,vendor,categories" and store them into MyDatabase.MyVendor_groups. And parse "known,request,types" for storage into MyDatabase.MyReqs The first column for MyDatabase.MyItem_type, MyDatabase.MyVendor_groups and MyDatabase.MyReqs will always be the Unique-Record-ID that was witnessed in the log.
Therefore yes, this column does not contain unique data, like other columns, in these three tables. The third column will always be the Quantity that was witnessed in the log.
I know a bit of PCRE, but it is the use of nested parsers in syslog-ng that's completely confusing me.
Documentation of Syslog-ng suggests this is possible, but simply failed to get a good example. If any kind hack around here has some reference or sample to share, it will be so useful.
Thanks in advance.
I think all of these can be done using the csv-parser a few times. First, use a csv-parser with the tab delimiter("\t") to split the initial fields into named columns. Use this parser on the entire message. Then you'll have to parse the fields that have subfields using other instances of the csv-parser on the columns that need further parsing. You can find some examples at https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/csv-parser.html and https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/reference-parsers-csv.html
(It is possible that you can get it done with a single parser, if you specify both the tab and the comma as delimiters, but it might not work for the fields with variable number of fields.).