how to pass values dynamically in Apache NiFi from

2019-09-15 06:37发布

I have two tables one in mysql test.employee and other in hive default.dept I want to pass empid of test.employee table as a parameter to query in hive table and store data into HDFS

ExecuteSQL -> select empid from test.employee (gives 10 records)

SelectHiveQL -> SELECT * FROM default.dept where empid = ${empid} (should retrieve 10 records)

image description here

2条回答
对你真心纯属浪费
2楼-- · 2019-09-15 06:42

You could do the following:

  1. ExecuteSQL - to retrieve the employee records
  2. ConvertAvroToJson - for later processing of empid
  3. SplitJson - to split into one flow file per row
  4. EvaluateJsonPath - to get the value of empid into a flow file attribute
  5. ReplaceText - to set the content to the HiveQL statement (using expression language as you have done above)
  6. SelectHiveQL - to get the department records

Note that this executes a Hive SELECT for each of the empid values, so each execution of SelectHiveQL will produce a single record. I'm not sure (given HiveQL semantics for the IN clause, for example) how to get a single HiveQL statement since it's kind of a join between a "table of constants" and the Hive table, not to mention the NiFi processing is more complex as you won't want the SplitJson and would likely have to process all the records at once (with ExecuteScript, e.g.)

查看更多
贼婆χ
3楼-- · 2019-09-15 06:56
ExecuteSQL ->SplitAvro ->ConvertAvroToJson -> EvaluateJsonPath -> ReplaceText ->SelectHiveQL -> PutHDFS

ExecuteSQL       ==> Query -> select empid from employees
EvaluateHsonPath ==> destination ->flowfile-attribute
                 ==> empid = $.empid
ReplaceText      ==> ReplacementValue  -> ${empid}
SelectHiveQL   -> select * from dept where empid = ${empid}
查看更多
登录 后发表回答