How to extract a subset from a CSV file using NiFi

Posted 2019-03-04 02:30

I have a CSV file with 100+ columns, and I want to extract a specific 60 of them as a subset (both the column names and their values). I know we can use the ExtractText processor. Can anyone tell me what regular expression to write? For example, from the given snapshot I only want NiFi to extract the 'BMS_sw_micro', 'BMU_Dbc_Dbg_Micro', and 'BMU_Dbc_Fia_Micro' columns, i.e. only columns F, L, and O.

Any help is much appreciated!

[Image: SampleCSV]

2 answers
虎瘦雄心在
#2 · 2019-03-04 03:19

See my answer to this SO question, which covers your related question about selecting CSV columns.

混吃等死
#3 · 2019-03-04 03:31

As I said in the comment, you can count the number of commas before the text you want to match and use that count in the regex, like this:

/(?<=^([^,]+?,){5})[^,]+/

The regex starts at the beginning of the string and counts commas, then matches the text between the next two commas.

The number in the curly braces selects the column to match (how many commas to skip). With 5, five commas are skipped, so the match is the sixth field, i.e. column F.
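To make the comma counting concrete, here is a minimal Python sketch of the same idea. It is not the exact pattern above: Python's `re` module only accepts fixed-width lookbehind, so this variant matches the skipped prefix and returns the field through a capturing group instead. It also uses `[^,]*` rather than `[^,]+?` so that empty fields still count as columns. The function name `extract_column` and the sample row are made up for illustration, and the sketch assumes no field contains a quoted comma.

```python
import re
from typing import Optional


def extract_column(line: str, skip: int) -> Optional[str]:
    """Return the field that follows `skip` commas, or None if the line is too short."""
    # Capture-group variant of the lookbehind pattern: consume `skip` fields
    # (each followed by a comma) from the start of the line, then capture
    # the next field.
    pattern = re.compile(r"^(?:[^,]*,){%d}([^,]*)" % skip)
    match = pattern.match(line)
    return match.group(1) if match else None


# Hypothetical row: skipping 5 commas lands on the sixth field (column F).
row = "a,b,c,d,e,f,g"
print(extract_column(row, 5))
```

Whether the lookbehind form or a capturing group is the better fit inside NiFi depends on how the processor exposes matches, so treat this only as a way to check the counting logic outside NiFi.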

Run the regex once for each column you want, changing the column number each time.
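As a rough illustration of running the pattern once per column, the sketch below pulls columns F, L, and O from a line and joins them back into a smaller CSV line. Applying it to the header line and to each data line gives both the names and the values. The helper names `skip_count` and `subset` are hypothetical, the letter conversion only handles single letters A to Z, and the pattern is a capturing-group variant (Python does not allow the variable-width lookbehind used in the answer). Quoted fields containing commas are not handled.

```python
import re


def skip_count(letter: str) -> int:
    """Spreadsheet column letter (A-Z only) -> number of commas to skip."""
    return ord(letter.upper()) - ord("A")


def subset(line: str, letters: str) -> str:
    """Run the column pattern once per requested column and rejoin the results."""
    fields = []
    for letter in letters:
        # Skip N comma-terminated fields from the start, then capture the next one.
        match = re.match(r"^(?:[^,]*,){%d}([^,]*)" % skip_count(letter), line)
        # A line with too few columns yields an empty field instead of failing.
        fields.append(match.group(1) if match else "")
    return ",".join(fields)


# Hypothetical 15-column line: c1 .. c15.
line = ",".join("c%d" % i for i in range(1, 16))
print(subset(line, "FLO"))  # F, L, O -> skip 5, 11, 14 commas
```

For 60 columns this means 60 passes over each line, which is the cost of the one-regex-per-column approach.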
