
C# - How to list the files in a sub-directory fast

Posted 2019-06-07 00:59


I am trying to list the files in all the sub-directories of a root directory with the approach below, but it takes a very long time when the number of files runs into the millions. Is there a better way to do this?

I am using .NET 3.5, so I can't use the lazy enumerators (Directory.EnumerateFiles and friends were only added in .NET 4.0) :-(

        // ******************* Main *************
        DirectoryInfo dir = new DirectoryInfo(path);
        DirectoryInfo[] subDir = dir.GetDirectories();
        foreach (DirectoryInfo di in subDir) // call for each sub-directory
            PopulateList(di.FullName, false);

        static void PopulateList(string directory, bool IsRoot)
        {
            // Run "dir /s/b" in a child process and capture its output.
            System.Diagnostics.ProcessStartInfo procStartInfo = new System.Diagnostics.ProcessStartInfo("cmd", "/c " + "dir /s/b \"" + directory + "\"");
            procStartInfo.RedirectStandardOutput = true;
            procStartInfo.UseShellExecute = false;
            procStartInfo.CreateNoWindow = true;
            System.Diagnostics.Process proc = new System.Diagnostics.Process();
            proc.StartInfo = procStartInfo;
            proc.Start();

            // Name the listing file after the last path segment, e.g. "photos.lst".
            string fileName = directory.Substring(directory.LastIndexOf('\\') + 1);
            using (StreamWriter writer = new StreamWriter(fileName + ".lst"))
            {
                // The loop body was cut off in the original post; presumably
                // each output line is copied to the .lst file.
                while (proc.StandardOutput.EndOfStream != true)
                    writer.WriteLine(proc.StandardOutput.ReadLine());
            }
        }


Remove all the Process-related code and try the Directory.GetDirectories() and Directory.GetFiles() methods instead:

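public IEnumerable<string> GetAllFiles(string rootDirectory)
{
    // Walks every sub-directory; files directly in rootDirectory are not returned.
    foreach (var directory in Directory.GetDirectories(rootDirectory, "*", SearchOption.AllDirectories))
        foreach (var file in Directory.GetFiles(directory))
            yield return file;
}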

From MSDN, SearchOption.AllDirectories:

Includes the current directory and all the subdirectories in a search operation. This option includes reparse points like mounted drives and symbolic links in the search.


It will definitely be faster to call DirectoryInfo.GetFiles in a loop for each directory than to spawn a new process per directory and read its output.
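Here is a minimal sketch of that loop, in case it helps. The class and method names (DirectoryWalker, WalkFiles) are made up for illustration, and I used the static Directory methods rather than DirectoryInfo to avoid allocating a FileInfo per file. It only uses APIs available in .NET 3.5, and it yields one directory's files at a time instead of building one giant array. Note that, unlike the code in the question, it also returns files that sit directly in the root.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not from the original thread.
static class DirectoryWalker
{
    // Walks the tree with an explicit stack and yields the files of one
    // directory at a time, so only a single directory listing is held
    // in memory at once.
    public static IEnumerable<string> WalkFiles(string root)
    {
        Stack<string> pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            string[] files;
            string[] subDirs;
            try
            {
                files = Directory.GetFiles(current);
                subDirs = Directory.GetDirectories(current);
            }
            catch (UnauthorizedAccessException)
            {
                continue; // skip directories we are not allowed to read
            }

            foreach (string file in files)
                yield return file;

            foreach (string subDir in subDirs)
                pending.Push(subDir);
        }
    }
}
```

Usage would be something like `foreach (string f in DirectoryWalker.WalkFiles(path)) writer.WriteLine(f);`, with a single process and no output redirection.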


With millions of files you're actually running into a filesystem limitation (see this and search for "300,000"), so take that into account.

As for optimizations, I think you'd really want to iterate lazily, so you'll have to P/Invoke into FindFirstFile/FindNextFile.
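To make the P/Invoke idea concrete, here is a rough, Windows-only sketch. The FindFirstFile, FindNextFile and FindClose declarations follow the usual kernel32 signatures, but the wrapper class (NativeFileLister) and its method name are my own invention, and error handling is reduced to skipping any directory that cannot be opened. Treat it as a starting point, not a hardened implementation; it also does nothing special for paths longer than MAX_PATH.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical wrapper, not from the original thread. Windows only.
static class NativeFileLister
{
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool FindClose(IntPtr hFindFile);

    static readonly IntPtr InvalidHandle = new IntPtr(-1);

    // Lazily yields every file under root, one entry at a time.
    public static IEnumerable<string> EnumerateFiles(string root)
    {
        Stack<string> pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            WIN32_FIND_DATA data;
            IntPtr handle = FindFirstFile(Path.Combine(dir, "*"), out data);
            if (handle == InvalidHandle)
                continue; // missing or inaccessible directory: skip it

            try
            {
                do
                {
                    if (data.cFileName == "." || data.cFileName == "..")
                        continue; // jumps to the FindNextFile check below

                    string fullPath = Path.Combine(dir, data.cFileName);
                    if ((data.dwFileAttributes & FileAttributes.Directory) != 0)
                        pending.Push(fullPath); // visit this sub-directory later
                    else
                        yield return fullPath;
                } while (FindNextFile(handle, out data));
            }
            finally
            {
                FindClose(handle); // always release the search handle
            }
        }
    }
}
```

The point of the design is that nothing is buffered: each file name is handed to the caller as soon as the OS returns it, which is roughly what Directory.EnumerateFiles gives you from .NET 4.0 onwards.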


Check out the existing Directory.GetFiles overload that takes a SearchOption.
For example:

var paths = Directory.GetFiles(root, "*", SearchOption.AllDirectories);

And yes, it will take a lot of time, but I don't think you can improve its performance using only the .NET classes.


Assuming that your millions of files are spread across multiple sub-directories and you're using .NET 4.0, you could look at the parallel extensions.

Using a parallel foreach loop to process the list of sub-directories could make things a lot faster.

The new parallel extensions are also a lot safer and easier to use than attempting multi-threading at a lower level.

The one thing to look out for is making sure that you limit the number of concurrent operations to something sensible.
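A minimal sketch of what that could look like, assuming .NET 4.0 or later. ParallelLister, ListFiles and the maxParallel parameter are names I made up; Parallel.ForEach, ParallelOptions.MaxDegreeOfParallelism, ConcurrentBag and Directory.EnumerateFiles are the real framework APIs. As in the question, only files inside the sub-directories are listed, not files sitting directly in the root.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, not from the original thread. Requires .NET 4.0+.
static class ParallelLister
{
    public static ConcurrentBag<string> ListFiles(string root, int maxParallel)
    {
        ConcurrentBag<string> results = new ConcurrentBag<string>();

        // Cap the number of sub-directories scanned at the same time.
        ParallelOptions options = new ParallelOptions();
        options.MaxDegreeOfParallelism = maxParallel;

        Parallel.ForEach(Directory.GetDirectories(root), options, subDir =>
        {
            // Each top-level sub-directory is walked recursively by one worker.
            foreach (string file in Directory.EnumerateFiles(subDir, "*", SearchOption.AllDirectories))
                results.Add(file);
        });

        return results;
    }
}
```

With millions of files you probably would not want to keep every path in one collection; writing each sub-directory's results to its own .lst file inside the loop body, as the question does, avoids that. Whether this actually helps depends on the disk, since the work is mostly I/O-bound, so it is worth measuring with a small maxParallel first.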