How to upload a zip file to Azure Blob Storage and then unzip it

Posted 2019-06-20 01:39

Question:

I have a lot of zip files, each containing a few folders and 50+ files. How can I upload those zip files to Azure Blob Storage and then unzip them there?

Unzipping each file on the server and uploading its contents to Azure Blob Storage one by one would be a cumbersome process.

Does Azure have an easy way to achieve this, or is there a workaround?

I'm implementing this in PHP.

Answer 1:

The simple answer is that Azure Blob Storage will not do the unzipping for you. This is something you need to do on your own. How you do it is up to you.

One possibility is (as you mentioned) to upload the zip files to your server, unzip them there, and then upload the individual files.

Another possibility is to do the unzipping in a background process, if you are concerned about the processing happening on the web server. In this approach, you simply upload the zip files to blob storage. Then a background process (this could be WebJobs, Functions, Worker Roles, or Virtual Machines) downloads the zip files, unzips them, and re-uploads the individual files.

To trigger the background process on demand, once the zip file is uploaded you could simply write a message to a queue telling the background process to download the zip file and start unzipping it.
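The unzip-and-re-upload step of such a background job could look roughly like the sketch below. It is written in Python purely for illustration (PHP's ZipArchive class covers the same ground). The function name `unzip_and_upload` and the `upload` callback are assumptions and not part of any Azure SDK: in a real job the callback would wrap whatever upload call your storage client library provides, and `zip_bytes` would be the content of the downloaded zip blob.

```python
import io
import zipfile
from typing import Callable, List


def unzip_and_upload(zip_bytes: bytes, prefix: str,
                     upload: Callable[[str, bytes], None]) -> List[str]:
    """Extract every file from a zip archive and pass it to `upload`.

    `upload(blob_name, data)` is a hypothetical callback standing in for
    the real storage upload call.
    """
    uploaded = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                # Skip directory entries; the folder structure is kept
                # through the "/" separators in the blob names instead.
                continue
            blob_name = f"{prefix}/{info.filename}"
            upload(blob_name, archive.read(info))
            uploaded.append(blob_name)
    return uploaded
```

Reading the whole archive into memory keeps the sketch short; for very large archives you would probably download to a temporary file and open that instead.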
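The queue message only needs to carry enough information for the worker to find the archive. A minimal sketch, assuming a JSON payload with the hypothetical field names `container` and `blob` (any format that both sides agree on would do):

```python
import json


def encode_unzip_message(container: str, blob_name: str) -> str:
    """Build the message body the uploader writes to the queue."""
    return json.dumps({"container": container, "blob": blob_name})


def decode_unzip_message(message: str) -> dict:
    """Parse the message body on the worker side."""
    return json.loads(message)
```

The uploader would send the encoded string right after the zip upload succeeds, and the worker would decode it to know which blob to download and unzip.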



Answer 2:

As you have probably already found all over the internet, it is not possible to run workloads inside the storage servers. However, you can write an Azure Function that watches your storage account, unzips new files for you, and then uploads the extracted files.



Answer 3:

As @Gaurav mentions, unzipping is not natively supported. There was a feedback item to include this as a feature, but it was declined. I can think of a few alternatives that may be of interest.

1) Build an Azure Data Factory custom activity that does the unzipping. As files are uploaded to a temporary location, you can unzip them in a pipeline and write them to your application container. This requires a Batch service instance, but Data Factory takes care of all the orchestration and gives you a management facility for alerting on failures, etc.

2) Move your blobs from Azure Blob Storage to Azure Data Lake Store using adlcopy.exe. Once they are in Data Lake Store, you can build your own custom extractor and query the zip/gzip files. After another look through the documentation, it seems that U-SQL may be able to do this natively. Look for the section "Extraction from compressed data" in the EXTRACT expression documentation.

3) Use PolyBase with SQL Data Warehouse, which can read zip/gzip files natively. This is the easiest but probably the most expensive option. See CREATE EXTERNAL TABLE and CREATE EXTERNAL FILE FORMAT.

4) And as @EvertonMc just mentioned, you could do it with an Azure Function on a trigger, which is also a good option.

Good luck and let us know how you get on.