Multiple wildcard directory/file search for arbitr

Posted 2019-07-14 05:17

Question:

I am building a Windows application in Visual Studio 2012 in C# and .NET 4.0.

One of the functions of the program is to search for all files that meet given search criteria, which usually (but not always) include wildcards.

The search criteria are not known until run time; they are imported from an Excel spreadsheet.

Possible search criteria can include the following:

  1. Exact path
    • "C:\temp\directory1\directory2\someFile.txt"
  2. Path, with wildcards in the filename:
    • "C:\temp\directory1\directory2*.*"
  3. Filename, with wildcards in the path:
    • "C:\temp*\directory*\someFile.txt"
  4. Filename and path with wildcards:
    • "C:\temp\*\*\*.*"
  5. All of the above, with arbitrary directory structure:
    • "C:\temp\dir*1\dir*\anotherdir\*\another*\file*.txt"
    • "C:\te*\*\someFile.txt"
    • "C:\temp\*tory1\dire*2\*\*\*\*\*.*"

I've attempted to use Directory.EnumerateFiles:

IEnumerable<string> matchingFilePaths = System.IO.Directory.EnumerateFiles(@"C:\", selectedItemPath[0], System.IO.SearchOption.AllDirectories);

However, this only works with Case 2 above. Attempting to use Directory.EnumerateFiles with wildcards in the folder names causes an "Illegal Character" exception.
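As a rough illustration of why only Case 2 works: Directory.EnumerateFiles expects a wildcard-free directory as its path argument, and the file-name wildcards go in the search pattern. The sketch below splits a pattern at its last backslash and checks whether the directory part is wildcard-free. PatternSplitter, SplitPattern and HasWildcardFreeDirectory are hypothetical names used for illustration; they are not part of .NET or of the code in this thread.

```csharp
using System;

public static class PatternSplitter
{
    // Hypothetical helper: splits a pattern at its last backslash into
    // { directory part, file-name part }.
    public static string[] SplitPattern(string pattern)
    {
        int lastSeparator = pattern.LastIndexOf('\\');
        if (lastSeparator < 0)
            return new[] { "", pattern };

        string directory = pattern.Substring(0, lastSeparator);
        string filePattern = pattern.Substring(lastSeparator + 1);

        // "C:" alone means "current directory on drive C", so restore the root.
        if (directory.EndsWith(":"))
            directory += "\\";

        return new[] { directory, filePattern };
    }

    // True when the directory part contains no '*' or '?', i.e. it is usable
    // as the path argument of Directory.EnumerateFiles.
    public static bool HasWildcardFreeDirectory(string pattern)
    {
        string directory = SplitPattern(pattern)[0];
        return directory.IndexOf('*') < 0 && directory.IndexOf('?') < 0;
    }
}
```

For a pattern such as "C:\temp\directory1\directory2\*.*", the two parts can be passed to Directory.EnumerateFiles(parts[0], parts[1], SearchOption.TopDirectoryOnly). Patterns like Cases 3 to 5 fail the HasWildcardFreeDirectory check and need their directory segments expanded first.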

I'm hoping there's a one-liner in .NET I can use to do this file searching. The number of wildcards and the depth of the directory structure are not known until run time, and it's conceivable that the search might have to go a hundred folders deep, with every folder name containing an unknown number of wildcards; this is the crux of the problem. I'm also trying to avoid an insane number of nested for loops.

I read the solutions here, but they don't seem to work for an arbitrary folder structure.

Answer 1:

Since you've already answered your own question, I figured I'd post my attempt at it for anyone else who finds this and doesn't want to use PowerShell. It's all lazily evaluated, so it should perform well when you have a large file system and are matching lots of files.

Sample Usage:

string pattern = @"C:\Users\*\Source\Repos\*\*.cs";
foreach (var st in GetAllMatchingPaths(pattern))
    Console.WriteLine(st);

Solution:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static IEnumerable<string> GetAllMatchingPaths(string pattern)
{
    char separator = Path.DirectorySeparatorChar;
    string[] parts = pattern.Split(separator);

    // the root (e.g. "C:") must be a literal; only later segments may contain wildcards
    if (parts[0].Contains('*') || parts[0].Contains('?'))
        throw new ArgumentException("path root must not have a wildcard", nameof(pattern));

    return GetAllMatchingPathsInternal(String.Join(separator.ToString(), parts.Skip(1)), parts[0]);
}

private static IEnumerable<string> GetAllMatchingPathsInternal(string pattern, string root)
{
    char separator = Path.DirectorySeparatorChar;
    string[] parts = pattern.Split(separator);

    for (int i = 0; i < parts.Length; i++)
    {
        // if this part of the path is a wildcard that needs expanding
        if (parts[i].Contains('*') || parts[i].Contains('?'))
        {
            // create an absolute path up to the current wildcard and check if it exists
            var combined = root + separator + String.Join(separator.ToString(), parts.Take(i));
            if (!Directory.Exists(combined))
                return new string[0];

            if (i == parts.Length - 1) // if this is the end of the path (a file name)
            {
                return Directory.EnumerateFiles(combined, parts[i], SearchOption.TopDirectoryOnly);
            }
            else // if this is in the middle of the path (a directory name)
            {
                var directories = Directory.EnumerateDirectories(combined, parts[i], SearchOption.TopDirectoryOnly);
                var paths = directories.SelectMany(dir =>
                    GetAllMatchingPathsInternal(String.Join(separator.ToString(), parts.Skip(i + 1)), dir));
                return paths;
            }
        }
    }

    // if pattern ends in an absolute path with no wildcards in the filename
    var absolute = root + separator + String.Join(separator.ToString(), parts);
    if (File.Exists(absolute))
        return new[] { absolute };

    return new string[0];
}

PS: It won't match directories, only files, but you could easily modify that if desired.
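One possible shape for that modification, as a minimal sketch rather than a drop-in replacement: the variant below treats every path segment as a search pattern (a literal name simply matches itself) and uses Directory.EnumerateFileSystemEntries for the last segment, so both files and directories are returned. WildcardEntries, Find and Expand are hypothetical names, and the sketch assumes an absolute pattern with no trailing separator and no "." or ".." segments.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class WildcardEntries
{
    // Hypothetical variant: returns files AND directories matching the pattern.
    public static IEnumerable<string> Find(string pattern)
    {
        char separator = Path.DirectorySeparatorChar;
        string[] parts = pattern.Split(separator);

        if (parts.Length < 2)
            throw new ArgumentException("pattern must be an absolute path", "pattern");
        if (parts[0].IndexOfAny(new[] { '*', '?' }) >= 0)
            throw new ArgumentException("path root must not have a wildcard", "pattern");

        // "C:" becomes "C:\" on Windows; "" becomes "/" on Unix-like systems.
        return Expand(parts[0] + separator, parts, 1);
    }

    private static IEnumerable<string> Expand(string directory, string[] parts, int index)
    {
        if (!Directory.Exists(directory))
            return Enumerable.Empty<string>();

        // Last segment: match files and directories alike.
        if (index == parts.Length - 1)
            return Directory.EnumerateFileSystemEntries(
                directory, parts[index], SearchOption.TopDirectoryOnly);

        // Middle segment: descend lazily into every matching directory.
        return Directory.EnumerateDirectories(
                directory, parts[index], SearchOption.TopDirectoryOnly)
            .SelectMany(dir => Expand(dir, parts, index + 1));
    }
}
```

It is called the same way as the original, for example WildcardEntries.Find(@"C:\Users\*\Source\Repos\*"), and the results are still produced lazily through SelectMany.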



Answer 2:

After more searching, there turned out to be a very simple solution. I can use the Windows PowerShell Get-ChildItem command from C# (this requires a reference to the System.Management.Automation assembly):

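using System.Management.Automation;

// PowerShell is IDisposable, so release it once the results have been read
using (PowerShell ps = PowerShell.Create())
{
    ps.AddCommand("Get-ChildItem");
    ps.AddArgument(selectedItemPath[0]); // the wildcard pattern to search for
    foreach (PSObject result in ps.Invoke())
    {
        //display the results in a textbox
    }
}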

Get-ChildItem expands the wildcards in the path itself, which allows one to avoid nested for loops.