可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I've created and run a DataPrep job, and am trying to use the template from python on app engine. I can successfully start a job using

gcloud dataflow jobs run 
    --parameters "inputLocations={\"location1\":\"gs://bucket/folder/*\"},
outputLocations={\"location1\":\"project:dataset.table\"},
customGcsTempLocation=gs://bucket/DataPrep-beta/temp"
--gcs-location gs://bucket/DataPrep-beta/temp/cloud-dataprep-templatename_template

however trying to use python on app engine;

service = build('dataflow', 'v1b3', credentials=credentials)
input1  = {"location1": "{i1}".format(i1=input)}
output1 = {"location1": "{o1}".format(o1=output)}

print('input location: {}'.format(input1))

GCSPATH="gs://{bucket}/{template}".format(bucket=BUCKET, template=template)
BODY = {
    "jobName": "{jobname}".format(jobname=JOBNAME),
    "parameters": {
        "inputLocations":  input1,
        "outputLocations": output1,
        "customGcsTempLocation": "gs://{}/DataPrep-beta/temp".format(BUCKET)
     }
}

print("dataflow request body: {}".format(BODY))
request = service.projects().templates().launch(projectId=PROJECT, gcsPath=GCSPATH, body=BODY)
response = request.execute()

I get back;

"Invalid JSON payload received. Unknown name "location1" at 
  'launch_parameters.parameters[1].value': Cannot find field.
Invalid JSON payload received. Unknown name "location1" at 
  'launch_parameters.parameters[2].value': Cannot find field."

Nothing I've tried seems to support passing a dict or a json.dumps() or a str() to "inputLocations" or "outputLocations".

回答1:

The issue is with the format that you are passing input1 and output1. They need to be between quotation marks like this:

input1 = '{"location1":"' + input + '" }'
output1 = '{"location1":"' + output + '" }'

I have tried sending the request with the same approach than you and it fails. It also fails if I later parse it back to string or json because it doesn't parse quotes correctly.

回答2:

Surely the format is something to do with your problem. I had the same use case to solve, but the output would be the files, instead of google bigquery dataset. and for me, the code with the following BODY parameter is initiating the google dataflow pipeline:

BODY = {
        "jobName": "{jobname}".format(jobname=JOBNAME),
        "parameters": {
            "inputLocations" : "{{\"location1\":\"gs://{bucket}/employee/input/patient.json\"}}".format(bucket=BUCKET),
            "outputLocations": "{{\"location1\":\"gs://{bucket}/employee/employees.json/file\",\"location2\":\"gs://{bucket}/jobrun/employees_314804/.profiler/profilerTypeCheckHistograms.json/file\",\"location3\":\"gs://{bucket}/jobrun/employees_314804/.profiler/profilerValidValueHistograms.json/file\"}}".format(bucket=BUCKET)
         },
         "environment": {
            "tempLocation": "gs://{bucket}/employee/temp".format(bucket=BUCKET),
            "zone": "us-central1-f"
         }
    }