How to schedule a Sqoop action using Oozie

Posted 2019-06-23 22:21

Question:

I am new to Oozie and wondering how to schedule a Sqoop job with it. I know a Sqoop action can be added to an Oozie workflow, but how can I schedule that action so it runs automatically, for example every 2 minutes or at 8pm every day (just like a cron job)?

Answer 1:

You need to create a coordinator.xml file that defines the start time, end time, and frequency. Here is an example that runs a workflow every 7 days:

<coordinator-app name="example-coord" xmlns="uri:oozie:coordinator:0.2"
             frequency="${coord:days(7)}"
             start="${start}"
             end="${end}"
             timezone="America/New_York">

  <controls>
    <timeout>5</timeout>
  </controls>

  <action>
    <workflow>
        <app-path>${wf_application_path}</app-path>
    </workflow>
  </action>
</coordinator-app>
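
The example above fires once every 7 days. For the schedules mentioned in the question, a sketch along the same lines might look like the following. It assumes the same ${start}, ${end}, and ${wf_application_path} properties as above; the name example-daily-coord is made up for illustration.

```xml
<!-- Sketch: run the workflow once a day. With coord:days(1) the job fires at the
     time of day given by ${start}. Start and end are written in UTC, so for 8pm
     New York time during daylight saving the start would be something like
     2013-07-14T00:00Z. -->
<coordinator-app name="example-daily-coord" xmlns="uri:oozie:coordinator:0.2"
                 frequency="${coord:days(1)}"
                 start="${start}"
                 end="${end}"
                 timezone="America/New_York">
  <action>
    <workflow>
      <app-path>${wf_application_path}</app-path>
    </workflow>
  </action>
</coordinator-app>
<!-- For a run every 2 minutes, the frequency attribute would instead be
     frequency="${coord:minutes(2)}" -->
```

Depending on the Oozie version and server configuration, very short frequencies (under 5 minutes) may be rejected by default, so the 2-minute variant may need a server-side setting changed before it is accepted.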

Then create a coordinator.properties file like this one:

host=namenode01
nameNode=hdfs://${host}:8020

wf_application_path=${nameNode}/oozie/deployments/example
oozie.coord.application.path=${wf_application_path}

start=2013-07-13T07:00Z
end=2013-09-30T23:59Z
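
The coordinator only triggers whatever workflow lives at wf_application_path, so that directory also needs a workflow.xml containing the Sqoop action. A minimal sketch of such a workflow is shown below. The workflow name, the jobTracker property (which would have to be added to coordinator.properties), and the JDBC URL, table, and target directory are all placeholders, not values from the original answer.

```xml
<!-- workflow.xml: a sketch, placed in the directory that wf_application_path points to -->
<workflow-app name="sqoop-import-wf" xmlns="uri:oozie:workflow:0.2">
  <start to="sqoop-import"/>

  <action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <!-- jobTracker is an assumed extra property; nameNode is the one defined
           in coordinator.properties -->
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Hypothetical import command: connection string, table, and target
           directory are placeholders -->
      <command>import --connect jdbc:mysql://dbhost/exampledb --table example_table --target-dir /user/example/example_table -m 1</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Sqoop action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

On many installations the Sqoop action also needs the Oozie share lib enabled (oozie.use.system.libpath=true in the properties file) and the JDBC driver jar available to the workflow, for example in its lib/ directory. Note that a plain import into a fixed target directory will probably fail on the second scheduled run because the directory already exists, which is where an incremental import becomes relevant.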

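Upload your coordinator.xml file to HDFS, into the directory that oozie.coord.application.path points to, and then submit the coordinator job with something like the command below. The Oozie CLI needs to know the server URL, either through the -oozie option or through the OOZIE_URL environment variable.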

oozie job -config coordinator.properties -run

Check the documentation at http://oozie.apache.org/docs/3.3.2/CoordinatorFunctionalSpec.html, which contains some examples.



Answer 2:

I think the following blog post will be quite useful:

http://www.tanzirmusabbir.com/2013/05/chunk-data-import-incremental-import-in.html



Tags: oozie