writing a custom template/parser/filter for use in

My application generates logs and sends them to syslog-ng. I want to write a custom template/parser/filter for use in syslog-ng to correctly store the fields in tables of an SQLite database (MyDatabase).

This is the legend of my log:

unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit

All these 12 fields are tab separated, and the parser must store them into 12 columns of table MyTable1 in MyDatabase. Some of the fields: the 6th, 9th, and 10th however also contain "sub-fields" as comma-separated values. The number of values within each of these sub-fields, is variable, and can change in each line of log.

I need these fields to be stored in respective separate tables MyItem_type, MyVendor_groups, MyReqs

These "secondary" tables have 3 columns, record the Unique-Record-ID, and Quantity against each of their occurence in the log So the schema in MyItem_type table looks like:

Unique-Record-ID | item_profile | Quantity

Similarly the schema of MyVendor_groups looks like:

Unique-Record-ID | vendor_category | Quantity

and the schema of MyReqs looks like:

Unique-Record-ID | req_type | Quantity

Consider these sample lines from the log:

unique-record-id usename date Quantity BOQ possible,item,profiles Count Vendor applicable,vendor,categories known,request,types vendor_code credit

234.44.tfhj Sam 22-03-2016  22  prod1   cat1,cat22,cat36,cat44  66  ven1    t1,t33,t43,t49  req1,req2,req3,req4 blue    64.22

234.45.tfhj Alex    23-03-2016  100 prod2   cat10,cat36,cat42   104 ven1    t22,t45 req1,req2,req33,req5    red 66

234.44.tfhj Vikas   24-03-2016  88  prod1   cat101,cat316,cat43 22  ven2    t22,t43 req1,req23,req3,req6    red 77.12

234.47.tfhj Jane    25-03-2016  22  prod7   cat10,cat36,cat44   43  ven3    t77 req1,req24,req3,req7    green   45.89

234.48.tfhj John    26-03-2016  97  serv3   cat101,cat36,cat45  69  ven5    t1  req11,req2,req3,req8    orange  33.04

234.49.tfhj Ruby    27-03-2016  85  prod58  cat10,cat38,cat46   88  ven9    t33,t55,t99 req1,req24,req3,req9    white   46.04

234.50.tfhj Ahmed   28-03-2016  44  serv7   cat110,cat36,cat47  34  ven11   t22,t43,t77 req1,req20,req3,req10   red 43

My parser should store the above log into MyDatabase.Mytable1 as:

unique-record-id    |   usename |   date    |   Quantity    |   BOQ |   item_profile    |   Count   |   Vendor  |   vendor_category |   req_type    |   vendor_code |   credit
234.44.tfhj |   Sam |   22-03-2016  |   22  |   prod1   |   cat1,cat22,cat36,cat44  |   66  |   ven1    |   t1,t33,t43,t49  |   req1,req2,req3,req4 |   blue    |   64.22
234.45.tfhj |   Alex    |   23-03-2016  |   100 |   prod2   |   cat10,cat36,cat42   |   104 |   ven1    |   t22,t45 |   req1,req2,req33,req5    |   red |   66
234.44.tfhj |   Vikas   |   24-03-2016  |   88  |   prod1   |   cat101,cat316,cat43 |   22  |   ven2    |   t22,t43 |   req1,req23,req3,req6    |   red |   77.12
234.47.tfhj |   Jane    |   25-03-2016  |   22  |   prod7   |   cat10,cat36,cat44   |   43  |   ven3    |   t77 |   req1,req24,req3,req7    |   green   |   45.89
234.48.tfhj |   John    |   26-03-2016  |   97  |   serv3   |   cat101,cat36,cat45  |   69  |   ven5    |   t1  |   req11,req2,req3,req8    |   orange  |   33.04
234.49.tfhj |   Ruby    |   27-03-2016  |   85  |   prod58  |   cat10,cat38,cat46   |   88  |   ven9    |   t33,t55,t99 |   req1,req24,req3,req9    |   white   |   46.04
234.50.tfhj |   Ahmed   |   28-03-2016  |   44  |   serv7   |   cat110,cat36,cat47  |   34  |   ven11   |   t22,t43,t77 |   req1,req20,req3,req10   |   red |   43

And also parse the "possible,item,profiles" to record into MyDatabase.MyItem_type as:

Unique-Record-ID | item_profile | Quantity
234.44.tfhj |   cat1    |   22
234.44.tfhj |   cat22   |   22
234.44.tfhj |   cat36   |   22
234.44.tfhj |   cat44   |   22
234.45.tfhj |   cat10   |   100
234.45.tfhj |   cat36   |   100
234.45.tfhj |   cat42   |   100
234.44.tfhj |   cat101  |   88
234.44.tfhj |   cat316  |   88
234.44.tfhj |   cat43   |   88
234.47.tfhj |   cat10   |   22
234.47.tfhj |   cat36   |   22
234.47.tfhj |   cat44   |   22
234.48.tfhj |   cat101  |   97
234.48.tfhj |   cat36   |   97
234.48.tfhj |   cat45   |   97
234.48.tfhj |   cat101  |   97
234.48.tfhj |   cat36   |   97
234.48.tfhj |   cat45   |   97
234.49.tfhj |   cat10   |   85
234.49.tfhj |   cat38   |   85
234.49.tfhj |   cat46   |   85
234.50.tfhj |   cat110  |   44
234.50.tfhj |   cat36   |   44
234.50.tfhj |   cat47   |   44

We also need to similarly parse "applicable,vendor,categories" and store them into MyDatabase.MyVendor_groups. And parse "known,request,types" for storage into MyDatabase.MyReqs The first column for MyDatabase.MyItem_type, MyDatabase.MyVendor_groups and MyDatabase.MyReqs will always be the Unique-Record-ID that was witnessed in the log.

Therefore yes, this column does not contain unique data, like other columns, in these three tables. The third column will always be the Quantity that was witnessed in the log.

I know a bit of PCRE, but it is the use of nested parsers in syslog-ng that's completely confusing me.

Documentation of Syslog-ng suggests this is possible, but simply failed to get a good example. If any kind hack around here has some reference or sample to share, it will be so useful.

Thanks in advance.

I think all of these can be done using the csv-parser a few times. First, use a csv-parser with the tab delimiter("\t") to split the initial fields into named columns. Use this parser on the entire message. Then you'll have to parse the fields that have subfields using other instances of the csv-parser on the columns that need further parsing. You can find some examples at https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/csv-parser.html and https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/reference-parsers-csv.html

(It is possible that you can get it done with a single parser, if you specify both the tab and the comma as delimiters, but it might not work for the fields with variable number of fields.).