Fault tolerance and log the incompatible rows in A

Posted 2019-09-20 12:33

Question:

Customer's requirements:

  1. Use Azure Data Factory to import a CSV file from Blob storage into SQL Data Warehouse.

  2. Use the strategy "Fault tolerance and log the incompatible rows in Azure Blob storage" in ADF.

  3. Use an Azure Function to archive each processed file to another location in Blob storage: one location for files that were imported successfully and one for failed files (files that contain incompatible data, such as wrong format or wrong length).

=> So I need to get the value of skippedRowCount from the Activity Window to know whether this activity had any incompatible rows. Is there any way to get that value, or any other solution to my problem? Many thanks.

Answer 1:

In ADF V2, the number of skipped rows is returned as the "rowsSkipped" property of the copy activity output. See these two links: https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview#monitoring and https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance#monitor-skipped-rows

ADF V2 also allows you to use the output of a previous copy activity in a subsequent activity, using an expression like "@activity('MyCopyActivity').output.rowsSkipped". Here is an example of how to use the output from a Lookup activity, which you can adapt to your particular situation.
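If the copy activity output is passed on to your own code (for example as the body of a request to an Azure Function), the check could look roughly like the sketch below. The helper name `skipped_rows` and the sample values are made up for illustration. Treating a missing "rowsSkipped" property as 0 is an assumption, not documented behavior.

```python
import json


def skipped_rows(copy_output: dict) -> int:
    """Return the number of skipped rows reported by a copy activity output.

    Assumption: if "rowsSkipped" is absent, no rows were skipped.
    """
    return int(copy_output.get("rowsSkipped", 0))


# Made-up sample payload; only "rowsSkipped" is used by the check.
sample_output = json.loads('{"rowsRead": 100, "rowsCopied": 97, "rowsSkipped": 3}')

# True when the copy activity skipped at least one incompatible row.
has_incompatible_rows = skipped_rows(sample_output) > 0
```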

For your use case, you can chain the copy activity with two Web activities: one to invoke the file archive for successfully imported files, and another to record or reprocess the failed rows logged in Blob storage or ADLS.
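On the Azure Function side, the archive decision from the question could be sketched as a small pure function. The folder names `archive/success` and `archive/failed`, the function name `archive_path`, and the idea that the Web activity sends the file name together with the skipped row count are all assumptions for illustration. The actual blob move is left out.

```python
# Hypothetical archive locations; replace with your own container/folder names.
SUCCESS_FOLDER = "archive/success"
FAILED_FOLDER = "archive/failed"


def archive_path(file_name: str, rows_skipped: int) -> str:
    """Pick the archive destination for a processed file.

    A file with at least one skipped (incompatible) row goes to the
    failed folder; otherwise it goes to the success folder.
    """
    folder = FAILED_FOLDER if rows_skipped > 0 else SUCCESS_FOLDER
    return f"{folder}/{file_name}"
```

For instance, `archive_path("orders.csv", 0)` yields a path under the success folder, while any positive skipped row count routes the file to the failed folder.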