Getting the latest file modified from Azure Blob

Posted 2019-02-21 12:47

Say I generate a couple of JSON files each day in my blob storage. I want to get the most recently modified file across all of my directories. My blob container looks something like this:

2016/01/02/test.json
2016/01/02/test2.json
2016/02/03/test.json

I want to get 2016/02/03/test.json. One way is to take each file's full path and use a regex to find the most recently created directory, but that doesn't work if there is more than one JSON file in a directory. Is there anything like File.GetLastWriteTime that gets the most recently modified file? This is the code I use to get all the files, by the way:

public static CloudBlobContainer GetBlobContainer(string accountName, string accountKey, string containerName)
{
    CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentials(accountName, accountKey), true);
    // blob client
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    // container
    CloudBlobContainer blobContainer = blobClient.GetContainerReference(containerName);
    return blobContainer;
}

public static IEnumerable<IListBlobItem> GetBlobItems(CloudBlobContainer container)
{
    IEnumerable<IListBlobItem> items = container.ListBlobs(useFlatBlobListing: true);
    return items;
}

public static List<string> GetAllBlobFiles(IEnumerable<IListBlobItem> blobs)
{
    var listOfFileNames = new List<string>();

    foreach (var blob in blobs)
    {
        var blobFileName = blob.Uri.Segments.Last();
        listOfFileNames.Add(blobFileName);
    }
    return listOfFileNames;
}

4 Answers
骚年 ilove
#2 · 2019-02-21 12:57

Each IListBlobItem is going to be a CloudBlockBlob, a CloudPageBlob, or a CloudBlobDirectory.

After casting to block or page blob, or their shared base class CloudBlob (preferably by using the as keyword and checking for null), you can access the modified date via blockBlob.Properties.LastModified.

Note that your implementation does an O(n) scan over all blobs in the container, which can take a while if there are hundreds of thousands of files. There is currently no way to run a more efficient query against blob storage (unless you abuse the file naming and encode the date so that newer dates sort first alphabetically). Realistically, if you need better query performance, I'd recommend keeping a database table that has one row per file, with an indexed DateModified column to search by and a column holding the blob path for easy access to the file.
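The naming trick mentioned above could look roughly like the sketch below. It assumes you control the blob names at upload time and that listings come back in alphabetical order by name. NewestFirstNaming, BuildName, ParseTimestamp and the "prefix/file" layout are hypothetical names for illustration, not part of the Azure SDK. Each name is prefixed with DateTime.MaxValue.Ticks minus the timestamp's ticks, zero-padded to a fixed width, so newer blobs sort first.

```csharp
using System;
using System.Globalization;

public static class NewestFirstNaming
{
    // Hypothetical helper: builds a blob name whose ordinal sort order is newest-first.
    public static string BuildName(DateTime utcTimestamp, string fileName)
    {
        // Newer timestamps give smaller values; "D19" pads to a fixed width
        // so that string order matches numeric order.
        long inverted = DateTime.MaxValue.Ticks - utcTimestamp.Ticks;
        return inverted.ToString("D19", CultureInfo.InvariantCulture) + "/" + fileName;
    }

    // Recovers the original timestamp from a name produced by BuildName.
    public static DateTime ParseTimestamp(string blobName)
    {
        long inverted = long.Parse(blobName.Substring(0, 19), CultureInfo.InvariantCulture);
        return new DateTime(DateTime.MaxValue.Ticks - inverted, DateTimeKind.Utc);
    }
}
```

With names built this way, the first blob returned by a listing would be the newest one, so a full scan is not needed. The trade-off is that the names are no longer human-readable dates.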

叼着烟拽天下
#3 · 2019-02-21 12:57

Use the Azure WebJobs SDK. The SDK has options to monitor for new or updated blobs.

狗以群分
#4 · 2019-02-21 13:00

Like Yar said, you can use the LastModified property of an individual blob object. Here is a code snippet that shows how to do that, once you have a reference to the correct container:

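// Flat listing is needed so that blobs inside virtual directories (e.g. 2016/02/03/) are returned.
var latestBlob = container.ListBlobs(useFlatBlobListing: true)
    .OfType<CloudBlockBlob>()
    .OrderByDescending(m => m.Properties.LastModified)
    .FirstOrDefault(); // null when the container holds no block blobs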

Note: the blobs may not be of type CloudBlockBlob (they could be CloudPageBlob, for example). Adjust the type argument if necessary.
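Sorting the whole listing is more work than needed when only the newest item matters. As a minimal sketch of an alternative, the hypothetical helper below (LatestSelector and LatestBy are names made up for illustration) keeps a running maximum in a single pass. The timestamp is modelled as DateTimeOffset? on the assumption that it matches the type of Properties.LastModified.

```csharp
using System;
using System.Collections.Generic;

public static class LatestSelector
{
    // Hypothetical helper: returns the item with the greatest timestamp in one pass,
    // or default(T) when the sequence is empty or no item has a timestamp.
    public static T LatestBy<T>(IEnumerable<T> items, Func<T, DateTimeOffset?> getTimestamp)
    {
        T latest = default(T);
        DateTimeOffset? latestTime = null;
        foreach (T item in items)
        {
            DateTimeOffset? time = getTimestamp(item);
            // Items without a timestamp are skipped; the first timestamped item seeds the maximum.
            if (time.HasValue && (!latestTime.HasValue || time.Value > latestTime.Value))
            {
                latest = item;
                latestTime = time;
            }
        }
        return latest;
    }
}
```

Under those assumptions it could be called as LatestSelector.LatestBy(container.ListBlobs(useFlatBlobListing: true).OfType<CloudBlockBlob>(), b => b.Properties.LastModified). This still enumerates every blob, so it trims the sorting cost but not the O(n) listing.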

#5 · 2019-02-21 13:00

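If that causes problems, use blockBlob.Container.Properties.LastModified. Note that this is the container's own timestamp: it changes when the container's properties or metadata change, not when a blob is written.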
