FTP to Azure Storage blob (triggered processing)

Published 2019-07-08 23:41

Question:

I want to transfer encrypted files from an FTP server to an Azure Blob Storage container.

Here is the workflow in question:

Encrypted CSV files on FTP server ----------> trigger (for example, on file added) ----------> call to some local program or API that performs the decryption and then creates the output CSV file in the blob container

The files are structured as follows:

    Input CSV file:
        column1;column2;column3;
        encryptedvalue1;encryptedvalue2;encryptedvalue3;

and

    Output CSV file:
        column1;column2;column3;
        value1;value2;value3;

There is no file content transformation here, but there is one more thing that I don't know is doable or not:

I want to add the new blob under a specific folder depending on the column1 value, for example (i.e. manage the blob container's folder hierarchy from code).
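
From what I understand, a blob "folder" is just a prefix in the blob name, so something like this minimal sketch (using the azure-storage-blob Python SDK; the container, folder and file names are only examples) is the kind of thing I'm after:

    from azure.storage.blob import BlobServiceClient

    # Example connection string and names, for illustration only.
    service = BlobServiceClient.from_connection_string("<storage-connection-string>")

    # The prefix before "/" acts as the folder, so this files the blob
    # under a virtual folder named after the column1 value.
    blob = service.get_blob_client(container="output-container", blob="column1value/output.csv")
    blob.upload_blob("column1;column2;column3;\nvalue1;value2;value3;\n", overwrite=True)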

I tried to create a Logic App and added the FTP trigger as the first step, but I couldn't figure out what fits best as the second step in my case.

I saw many suggestions: some for WebJobs, others for Azure Functions or Azure App Service...

Because I'm fairly new to these Azure services, I came here to ask about the best way to do this, and why.

Is it better to use a WebJob, an Azure Function, or just an HTTP request? And why?

Am I already on the right track? Is a Logic App the best way to do this?

EDIT:

File sizes are around a few MB (not very big). They are CSV files with ";" as the separator. The input is a CSV file on the FTP server and the output is a decrypted CSV file under a specific "folder" in Azure Blob Storage.

Any help will be appreciated

Answer 1:

There are a few key factors you should take into consideration when choosing between Azure WebJobs and Azure Functions.

Azure Functions has two billing schemes: the Consumption plan and the App Service plan.

On the Consumption plan you pay only for the time your function is actually running; however, a single execution can't run longer than 10 minutes. That means that if your jobs run for more than 10 minutes, the Consumption plan is not for you.

The App Service plan is the same plan used by Azure WebJobs; there is no time limitation there (as per the documentation).

In general, Azure Functions are good when you need flexible logic with different triggers, etc.



Answer 2:

You can achieve this with a Logic App and a Function App as follows:

  1. Create an FTP trigger (when the file arrives).
  2. If it is a simple encode/decode you can use the corresponding shape; otherwise, create an Azure Function on the Consumption plan (so you pay per usage) that contains the encryption/decryption functionality, and pass it the data from the FTP trigger shape. This requires coding; you can develop it with VS Code or Visual Studio (a minimal sketch of such a function follows this list).
  3. You can then parse the output of the Azure Function with the Parse shape, or use the Transform shape for your data formats (XML, JSON, etc.), and decrypt again using the same Azure Function you wrote above, just calling different methods inside it.
  4. Finally, use the Blob shape to push the output of the decryption to the blob storage container.
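
For step 2, a minimal sketch of such an HTTP-triggered function, assuming the Python worker; decrypt_value is a placeholder for whatever decryption routine or API call you actually use:

    import azure.functions as func

    def decrypt_value(value: str) -> str:
        # Placeholder: replace with the real decryption routine
        # (e.g. a call to your decryption library or API).
        return value

    def main(req: func.HttpRequest) -> func.HttpResponse:
        # The Logic App posts the raw CSV content in the request body.
        csv_text = req.get_body().decode("utf-8")
        header, *rows = csv_text.splitlines()

        decrypted_rows = [
            ";".join(decrypt_value(v) for v in row.split(";"))
            for row in rows if row
        ]

        # Return the decrypted CSV so the Logic App can push it to Blob Storage.
        return func.HttpResponse("\n".join([header] + decrypted_rows), mimetype="text/csv")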

Logic Apps gives you a wide range of connectors, making it easy to connect to different artifacts with a workflow approach; you can also do transformations with XSLT or Liquid using an Integration Account if needed.

Hope this helps. Cheers!



Answer 3:

Don't overengineer it.

Use a Logic App to poll the FTP server, detect new files, and place them in blob storage.

Create a blob-triggered Azure Function (Consumption Plan, v2 runtime) and do your data transformation in code (in v2 you have a choice between TypeScript, JavaScript, C# and Python). Write the results to blob storage with a blob output binding.
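
A minimal sketch of what that blob-triggered function could look like in Python; the binding names (inputblob, outputblob) and the container paths are assumptions that would be declared in the function's function.json:

    import azure.functions as func

    def transform(csv_text: str) -> str:
        # Placeholder for the decryption / transformation step.
        return csv_text

    def main(inputblob: func.InputStream, outputblob: func.Out[str]) -> None:
        # Runs whenever a new blob appears in the container the trigger watches.
        csv_text = inputblob.read().decode("utf-8")

        # The blob output binding writes the result to the output path
        # configured in function.json, not in code.
        outputblob.set(transform(csv_text))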

OPTIONAL: Have a second Logic App trigger on the resulting blobs and e-mail/text you notifications.



Answer 4:

I would recommend using an Azure Function or a WebJob.

Here are two patterns:

  - Using Docker containers to perform a transform (a copy in this case): https://azure.microsoft.com/en-us/blog/microsoft-azure-block-blob-storage-backup/
  - Using a function to perform an operation after a blob-created event: https://cmatskas.com/copy-azure-blob-data-between-storage-accounts-using-functions/

Please let me know if you have any additional questions.



Answer 5:

After some research, and based on the answer of evilSnobu and the comments of Johns-305, I figured out that the best way to do this is as follows.

Note: I have an Azure API App already developed to do the content decryption.

Based on this grid, the best choice here to design my workflow is clearly Logic Apps:

Inside my Logic App:

  1. Create the FTP trigger: when a file is added on the FTP server, create a blob in Azure Storage and delete the file from the FTP server.
  2. Create an Azure Function (Azure Functions vs. WebJobs in the grid below) with a blob-creation trigger: when a blob is created, call the decryption API App.
  3. For granularity reasons, and so that each Azure Function does only one elementary job, I create a second Azure Function that parses the file and creates per-version folders depending on the version field in the file content (see the sketch after this list).
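
A rough sketch of that second function's core logic, assuming the azure-storage-blob v12 Python SDK; the container name, the connection-string setting and the assumption that column1 holds the version value are placeholders:

    import os
    from azure.storage.blob import BlobServiceClient

    def route_to_version_folder(file_name: str, csv_text: str) -> None:
        # Parse the decrypted, ";"-separated CSV and read the version
        # value from column1 of the first data row.
        header, first_row = csv_text.splitlines()[:2]
        version = first_row.split(";")[0]

        # Blob "folders" are just name prefixes, so "<version>/<file>"
        # files the blob under a per-version virtual folder.
        service = BlobServiceClient.from_connection_string(os.environ["STORAGE_CONNECTION"])
        blob = service.get_blob_client(container="decrypted", blob=f"{version}/{file_name}")
        blob.upload_blob(csv_text, overwrite=True)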

And based on the following grid, we can see why Azure Functions fit better than WebJobs in my case.

Finally, to summarize: in my case I need a developer's view of my solution, which is mainly why I needed the Logic App. I then have two elementary tasks that are trigger-based rather than continuous, so they are better suited to Azure Functions and a lot cheaper (since the files are not big and processing will be very quick).