Execute U-SQL script in ADL storage from Data Fact

Published 2019-07-29 17:44

Question:

I have a U-SQL script stored in my ADL store and I am trying to execute it. The script file is quite big, about 250 MB.

So far I have a Data Factory and a Linked Service, and I am trying to create a Data Lake Analytics U-SQL Activity.

The code for my U-SQL Activity looks like this:

{
    "name": "RunUSQLScript1",
    "properties": {
        "description": "Runs the USQL Script",
        "activities": [
            {
                "name": "DataLakeAnalyticsUSqlActivityTemplate",
                "type": "DataLakeAnalyticsU-SQL",
                "linkedServiceName": "AzureDataLakeStoreLinkedService",
                "typeProperties": {
                    "scriptPath": "/Output/dynamic.usql",
                    "scriptLinkedService": "AzureDataLakeStoreLinkedService",
                    "degreeOfParallelism": 3,
                    "priority": 1000
                },
                "policy": {
                    "concurrency": 1,
                    "executionPriorityOrder": "OldestFirst",
                    "retry": 3,
                    "timeout": "01:00:00"
                },
                "scheduler": {
                    "frequency": "Day",
                    "interval": 1
                }
            }
        ],
        "start": "2017-05-02T00:00:00Z",
        "end": "2017-05-02T00:00:00Z"
    }
}

However, I get the following error:

Error

Activity 'DataLakeAnalyticsUSqlActivityTemplate' from pipeline 'RunUSQLScript1' has no output(s) and no schedule. Please add an output dataset or define activity schedule.

What I would like is to have this Activity run on demand, i.e. I do not want it scheduled at all. I also do not understand what Inputs and Outputs are in my case. The U-SQL script I am trying to run operates on millions of files in my ADL storage and saves them after some modification of the content.

Answer 1:

Currently, ADF does not support running a U-SQL script stored in ADLS from a U-SQL activity, i.e. the "scriptLinkedService" under "typeProperties" has to be an Azure Blob Storage Linked Service. We will update the documentation for the U-SQL activity to make this clearer.

Support for running a U-SQL script stored in ADLS is on our product backlog, but we do not have a committed date for it yet.

Shirley Wang
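
To illustrate the point above, here is a minimal sketch of an Azure Blob Storage Linked Service that the "scriptLinkedService" property could reference. The name "AzureStorageLinkedService" and the placeholder account name and key are assumptions made for illustration; substitute your own values. The script itself would first have to be copied from ADLS into a container in that storage account.

```json
{
    "name": "AzureStorageLinkedService",
    "properties": {
        "type": "AzureStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
        }
    }
}
```

With a Linked Service like this in place, the activity's "scriptLinkedService" would be set to "AzureStorageLinkedService", and "scriptPath" would point to the script's location inside the blob container.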



Answer 2:

Currently, ADF does not support executing an activity on demand; the activity needs to be configured with a schedule. You will need at least one output to drive the scheduled execution of the activity. The output can be a dummy Azure Storage dataset to which no data is actually written; ADF only uses its availability properties to drive the scheduled execution. For example:

{
 "name": "OutputDataset",
 "properties": {
     "type": "AzureBlob",
     "linkedServiceName": "AzureStorageLinkedService",
     "typeProperties": {
         "fileName": "dummyoutput.txt",
         "folderPath": "adf/output",
         "format": {
             "type": "TextFormat",
             "columnDelimiter": "\t"
         }
     },
     "availability": {
         "frequency": "Day",
         "interval": 1
     }
 }
}
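
The activity from the question could then reference this dataset through an "outputs" array, roughly as sketched below. This is only a sketch under assumptions: "AzureDataLakeAnalyticsLinkedService" is a hypothetical name for a Data Lake Analytics Linked Service, the "scriptPath" value is a hypothetical location in Blob Storage (as Answer 1 requires), and the "scheduler" block is assumed to match the dataset's "availability" (daily, interval 1).

```json
{
    "name": "DataLakeAnalyticsUSqlActivityTemplate",
    "type": "DataLakeAnalyticsU-SQL",
    "linkedServiceName": "AzureDataLakeAnalyticsLinkedService",
    "typeProperties": {
        "scriptPath": "adf/scripts/dynamic.usql",
        "scriptLinkedService": "AzureStorageLinkedService",
        "degreeOfParallelism": 3,
        "priority": 1000
    },
    "outputs": [
        {
            "name": "OutputDataset"
        }
    ],
    "policy": {
        "concurrency": 1,
        "executionPriorityOrder": "OldestFirst",
        "retry": 3,
        "timeout": "01:00:00"
    },
    "scheduler": {
        "frequency": "Day",
        "interval": 1
    }
}
```

The U-SQL script does not need to write anything to "OutputDataset"; the dataset is there only so that ADF has a schedule to drive the activity.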