How external clients notify Oozie workflow with HT

Posted 2019-05-12 01:23

Question:

Let us say we have an Oozie workflow that starts with 3 Java action nodes. Each Java action makes an async HTTP call to an external web service (such as one exposed by google.com, yahoo.com, etc.) outside the Oozie/Hadoop cluster. I assume this is doable since Oozie supports custom action nodes.

Now, I don't want Oozie to poll the external web service from time to time to check whether its work is done. I want the external web service (let us assume we can modify it freely) to call back Oozie to notify it that the work is done, pass some info back to Oozie, and let Oozie decide which follow-up actions to take.

There are articles, such as http://www.infoq.com/articles/ExtendingOozie, that talk about callbacks from actions for async nodes; however, I have never found an actual sample of how a callback for async action nodes works. Does anyone have any idea how this callback for async action nodes works?

Many thanks in advance!

Answer 1:

Have a look at the Oozie SSH action implementation. It is just one class, relatively simple (but a bit messy), and shows how to create a callback URL:

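// "_" when the output is ignored; otherwise the configured HTTP command
// options, with spaces encoded as "%%%".
String callbackPost = ignoreOutput ? "_"
        : getOozieConf().get(HTTP_COMMAND_OPTIONS).replace(" ", "%%%");

// Callback URL for this action, built from its id and EXT_STATUS_VAR.
String callBackUrl = Services.get().get(CallbackService.class)
        .createCallBackUrl(action.getId(), EXT_STATUS_VAR);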

The URL is then passed to a shell script as an argument, and the script later simply invokes curl on that URL. The external status id is, for example, the PID of the executed process, which is returned to Oozie via the callback. It must not be empty or null.
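If the external service is not a shell script, the same step can be done from Java. This is only a minimal sketch of the idea described above (fill the external status into the URL, then request it), not Oozie code. The class name CallbackNotifier, the example host, the action id and the "#status" placeholder are all made up for illustration, and a plain GET is an assumption; check what your Oozie version actually generates and expects.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for the external service; not part of Oozie.
class CallbackNotifier {

    // Replace the status placeholder in the callback URL with the real
    // external status. An empty or null status is rejected.
    static String resolveCallbackUrl(String template, String placeholder, String status) {
        if (status == null || status.isEmpty()) {
            throw new IllegalArgumentException("external status must not be empty");
        }
        if (!template.contains(placeholder)) {
            throw new IllegalArgumentException("placeholder not found in callback URL");
        }
        return template.replace(placeholder, URLEncoder.encode(status, StandardCharsets.UTF_8));
    }

    // Issue a GET against the resolved URL and return the HTTP status code.
    // (Assumption: GET is enough; this does the job curl does in the script.)
    static int notifyOozie(String resolvedUrl) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(resolvedUrl)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) {
        // Made-up URL; in practice this is the URL handed over by the action.
        String template = "http://oozie.example.com:11000/oozie/callback?id=action-1&status=#status";
        // Only prints the resolved URL; call notifyOozie(...) to actually send it.
        System.out.println(resolveCallbackUrl(template, "#status", "12345"));
    }
}
```

Keeping the URL substitution separate from the HTTP call makes it easy to check the resolved URL before anything is sent to the Oozie server.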

A hint if you decide to delve into the code: although it looks as if the code is executed synchronously, it is in fact executed asynchronously, because the shell script is run in the background.

The callback is processed by Oozie's CallbackServlet:

dagEngine.processCallback(actionId, callbackService.getExternalStatus(queryString), props);
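To make the receiving side more concrete, here is a rough sketch of pulling an action id and an external status out of a callback query string. It is not the CallbackService or CallbackServlet source; the class CallbackQuery and the parameter names "id" and "status" are assumptions for illustration only.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser; Oozie's own CallbackService does this internally.
class CallbackQuery {

    // Split "k1=v1&k2=v2" into a map, URL-decoding keys and values.
    static Map<String, String> parse(String queryString) {
        Map<String, String> params = new LinkedHashMap<>();
        if (queryString == null || queryString.isEmpty()) {
            return params;
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    // Return the named parameter, failing if it is missing or empty.
    static String required(Map<String, String> params, String name) {
        String value = params.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("missing callback parameter: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // Assumed parameter names "id" and "status".
        Map<String, String> params = parse("id=action-1&status=12345");
        System.out.println(required(params, "id") + " -> " + required(params, "status"));
    }
}
```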


Answer 2:

According to the IBM developerWorks article:

When Oozie starts a task, it provides a unique callback HTTP URL to the task, and notifies that URL when it is complete. If the task fails to invoke the callback URL, Oozie can poll the task for completion.

CallbackService.java has the methods for generating the callback URL that is sent to the action and for parsing it when the action calls back.
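The "callback first, poll as a fallback" behaviour from the quote can be sketched generically like this. This is not how Oozie implements it; the class CallbackOrPoll, the timeout and the poller are hypothetical and only illustrate the pattern.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical illustration of the pattern; not Oozie code.
class CallbackOrPoll {

    // Wait up to timeoutMillis for the callback to deliver a status.
    // If no callback arrives in time, ask the task directly via the poller.
    static String awaitStatus(CompletableFuture<String> callback,
                              long timeoutMillis,
                              Supplier<String> poller) {
        try {
            return callback.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return poller.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for callback", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("callback failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // Callback already arrived: its status is used.
        CompletableFuture<String> arrived = CompletableFuture.completedFuture("OK");
        System.out.println(awaitStatus(arrived, 100, () -> "POLLED"));

        // Callback never arrives: the poller is consulted after the timeout.
        CompletableFuture<String> silent = new CompletableFuture<>();
        System.out.println(awaitStatus(silent, 100, () -> "POLLED"));
    }
}
```

In a servlet-based setup, the callback handler would complete the future, for example with callback.complete(status), when the HTTP request comes in.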