How to get a list of existing directories in Azure

Posted 2019-06-22 06:30

Question:

I have a console app written in C# on top of .NET Core 2.2.

I am trying to use the C# library to get a list of all directories inside my container. It is my understanding that Azure Blob Storage does not really have directories. Instead, it uses virtual names, so that blobs look as if they are inside folders when a container is viewed in browsers like Azure Blob Explorer.

I store my files using the following code

CloudBlockBlob blockBlob = container.GetBlockBlobReference("foldername/filename.jpg");

await blockBlob.UploadFromStreamAsync(stream);

So I want to select a distinct list of the prefixes (that is, the folder names) inside my container.

For example, if I have the blobs "foldername1/file1.jpg", "foldername1/file2.jpg", "foldername1/file3.jpg", and "foldername2/file1.jpg", I want to return "foldername1" and "foldername2".
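To make the expected result concrete, here is a minimal sketch that works on plain strings only, with no storage calls. The class and method names (`PrefixDemo`, `TopLevelPrefixes`) are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixDemo
{
    // Hypothetical helper: returns the distinct first path segment of every
    // blob name that contains at least one '/' delimiter.
    public static string[] TopLevelPrefixes(IEnumerable<string> blobNames)
    {
        return blobNames
            .Where(name => name.Contains("/"))
            .Select(name => name.Substring(0, name.IndexOf('/')))
            .Distinct()
            .ToArray();
    }

    public static void Main()
    {
        var names = new[]
        {
            "foldername1/file1.jpg",
            "foldername1/file2.jpg",
            "foldername1/file3.jpg",
            "foldername2/file1.jpg",
        };

        // Prints "foldername1, foldername2"
        Console.WriteLine(string.Join(", ", TopLevelPrefixes(names)));
    }
}
```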

How can I get a list of distinct prefixes from Azure Blob Storage?

Updated

Based on the feedback in the comments below, I came up with the following code:

public async Task<string[]> Directories(string path = null)
{
    int index = path == null ? 0 : path.Split('/', StringSplitOptions.RemoveEmptyEntries).Length;

    BlobContinuationToken token = null;
    List<string> directories = new List<string>();
    do
    {
        BlobResultSegment blobsListingResult = await ContainerFactory.Get().ListBlobsSegmentedAsync(path ?? string.Empty, true, BlobListingDetails.None, 5000, token, null, null);
        token = blobsListingResult.ContinuationToken;
        IEnumerable<IListBlobItem> blobsList = blobsListingResult.Results;
        foreach (var item in blobsList)
        {
            var blobName = (item as CloudBlob).Name;
            var blobParts = blobName.Split('/', StringSplitOptions.RemoveEmptyEntries);

            if (blobParts.Length <= index)
            {
                // This blob has no path segment at this depth, so it does not sit inside a subdirectory of the given path
                continue;
            }

            directories.Add(blobParts[index]);
        }
    }
    while (token != null);

    return directories.Distinct().ToArray();
}

Since I have lots of blobs in the container, this takes far too long, because it has to enumerate almost every single blob just to build the list of directories. It may also be very costly, since every blob has to be listed each time this method is called.

I essentially need the same result that I would get from running Directory.GetDirectories(path) if everything were running locally. Is there a way to improve this function?
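For comparison, the same `ListBlobsSegmentedAsync` overload can be called with `useFlatBlobListing` set to `false`. The service then groups blobs by the `/` delimiter and returns `CloudBlobDirectory` items for the virtual folders at the requested level, instead of one item per blob. The following is only a sketch under assumptions: it assumes the legacy `WindowsAzure.Storage` package (namespace `Microsoft.WindowsAzure.Storage.Blob`), the class name `BlobDirectoryLister` is hypothetical, and the container is passed in directly rather than obtained from `ContainerFactory`. It has not been measured against a large container.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob; // assumption: legacy WindowsAzure.Storage package

public static class BlobDirectoryLister
{
    // Returns the names of the virtual directories directly under 'path'
    // (or under the container root when 'path' is null or empty).
    public static async Task<string[]> DirectoriesAsync(CloudBlobContainer container, string path = null)
    {
        // The prefix must end with the delimiter to list the contents of a virtual folder;
        // otherwise "foldername" would also match "foldername1/" and "foldername2/".
        string prefix = string.IsNullOrEmpty(path) ? string.Empty : path.TrimEnd('/') + "/";

        var directories = new List<string>();
        BlobContinuationToken token = null;
        do
        {
            // useFlatBlobListing: false asks for a hierarchical (delimiter-based) listing.
            BlobResultSegment segment = await container.ListBlobsSegmentedAsync(
                prefix, false, BlobListingDetails.None, 5000, token, null, null);
            token = segment.ContinuationToken;

            foreach (IListBlobItem item in segment.Results)
            {
                if (item is CloudBlobDirectory directory)
                {
                    // directory.Prefix is the full virtual path, e.g. "foldername1/".
                    directories.Add(directory.Prefix.Substring(prefix.Length).TrimEnd('/'));
                }
            }
        }
        while (token != null);

        return directories.ToArray();
    }
}
```

Blobs that sit directly at the requested level still appear in the results as blob items, so a level holding many loose files will still need several pages.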

Answer 1:

The best way I've found to do this is to not treat Blob Storage like a folder/file store. Keep the files (blobs) there, but use some other method to track your folder structure. My method of choice is a SQL database that contains the folder structure, with each entry holding a reference to a blob in Azure. The problem with running all of this code directly against Azure is that:

a) It'll be slow
b) It'll incur unnecessary cost in the long run

You're far better off doing as I suggest: keep the metadata elsewhere, and use Blob Storage for what it's intended for, storing blobs.
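As a rough illustration of keeping the folder structure outside Blob Storage, the sketch below uses an in-memory dictionary as a stand-in for the SQL table. The `FolderIndex` class and its members are hypothetical; a real version would persist the same folder-to-blob mapping in the database and call `Register` whenever a blob is uploaded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the metadata store described above.
public class FolderIndex
{
    // Maps a folder path (e.g. "foldername1") to the blob names stored under it.
    private readonly Dictionary<string, HashSet<string>> _blobsByFolder =
        new Dictionary<string, HashSet<string>>(StringComparer.Ordinal);

    // Record a blob at upload time so no storage listing is needed later.
    public void Register(string blobName)
    {
        int slash = blobName.LastIndexOf('/');
        string folder = slash < 0 ? string.Empty : blobName.Substring(0, slash);
        if (!_blobsByFolder.TryGetValue(folder, out var blobs))
        {
            blobs = new HashSet<string>();
            _blobsByFolder[folder] = blobs;
        }
        blobs.Add(blobName);
    }

    // Distinct first-level folder names, answered from the index alone.
    public string[] TopLevelFolders() =>
        _blobsByFolder.Keys
            .Where(folder => folder.Length > 0)
            .Select(folder => folder.Split('/')[0])
            .Distinct()
            .OrderBy(folder => folder, StringComparer.Ordinal)
            .ToArray();

    public static void Main()
    {
        var index = new FolderIndex();
        index.Register("foldername1/file1.jpg");
        index.Register("foldername1/file2.jpg");
        index.Register("foldername2/file1.jpg");

        // Prints "foldername1, foldername2"
        Console.WriteLine(string.Join(", ", index.TopLevelFolders()));
    }
}
```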