How do I launch a Cloud Dataflow job from a Google Cloud Function? I'd like to use Google Cloud Functions as a mechanism to enable cross-service composition.
Answer 1:
I've included a very basic example below, based on the Dataflow WordCount sample. Please note that you'll need to include a copy of the Java runtime (the java binary) in your Cloud Function deployment, since it is not available in the default Node.js environment. Likewise, you'll need to package your Dataflow pipeline jar with your Cloud Function as well.
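For reference, here's a rough sketch of what the deployed function directory might contain, using the same placeholder names that appear in the code below:

    index.js          (the Cloud Function code shown below)
    package.json
    jre1.8.0_73/      (bundled copy of the Java runtime)
    MY_JAR.jar        (your Dataflow pipeline and its dependencies)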
module.exports = {
  wordcount: function (context, data) {
    const spawn = require('child_process').spawn;
    // Run the WordCount main class from the packaged jar using the bundled JRE.
    // Paths are relative to the function's deployment directory (__dirname).
    const child = spawn(
      'jre1.8.0_73/bin/java',
      ['-cp',
       'MY_JAR.jar',
       'com.google.cloud.dataflow.examples.WordCount',
       '--jobName=fromACloudFunction',
       '--project=MY_PROJECT',
       '--runner=BlockingDataflowPipelineRunner',
       '--stagingLocation=gs://STAGING_LOCATION',
       '--inputFile=gs://dataflow-samples/shakespeare/*',
       '--output=gs://OUTPUT_LOCATION'
      ],
      { cwd: __dirname });

    child.stdout.on('data', function (data) {
      console.log('stdout: ' + data);
    });
    child.stderr.on('data', function (data) {
      console.log('error: ' + data);
    });
    child.on('close', function (code) {
      console.log('closing code: ' + code);
      // Signal completion only once the java process has exited; calling
      // context.success() any earlier lets the function finish while the
      // (blocking) Dataflow launch is still running.
      context.success();
    });
  }
};
You could further enhance this example by using the non-blocking runner (DataflowPipelineRunner instead of BlockingDataflowPipelineRunner) and having the function return the job ID, so that you can poll for job completion separately; a rough sketch of that variation is shown below. This pattern should be valid for other SDKs as well, so long as their dependencies can be packaged into the Cloud Function.
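Here is a minimal, untested sketch of that non-blocking variant. It assumes the job ID can be scraped from the runner's console output (the exact output format is not a guaranteed interface, so a more robust approach would be to look the job up by name via the Dataflow API), and that context.failure() is available alongside context.success():

module.exports = {
  wordcount: function (context, data) {
    const spawn = require('child_process').spawn;
    var output = '';

    const child = spawn(
      'jre1.8.0_73/bin/java',
      ['-cp',
       'MY_JAR.jar',
       'com.google.cloud.dataflow.examples.WordCount',
       '--jobName=fromACloudFunction',
       '--project=MY_PROJECT',
       // DataflowPipelineRunner submits the job and returns instead of
       // waiting for it to finish.
       '--runner=DataflowPipelineRunner',
       '--stagingLocation=gs://STAGING_LOCATION',
       '--inputFile=gs://dataflow-samples/shakespeare/*',
       '--output=gs://OUTPUT_LOCATION'
      ],
      { cwd: __dirname });

    child.stdout.on('data', function (data) {
      output += data;  // collect the runner's console output
    });
    child.stderr.on('data', function (data) {
      console.log('error: ' + data);
    });
    child.on('close', function (code) {
      // Assumption: the submitted job's ID (e.g. 2016-01-01_00_00_00-123...)
      // appears somewhere in the runner's output. This is illustrative only.
      var match = output.match(/[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}_[0-9]{2}_[0-9]{2}-[0-9]+/);
      if (code === 0 && match) {
        context.success(match[0]);  // return the job ID so the caller can poll it
      } else {
        context.failure('Could not submit Dataflow job (exit code ' + code + ')');
      }
    });
  }
};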