
GetDirectories fails to enumerate subfolders of a

Posted 2020-04-18 06:00

Question:

My application is written in C# (.NET 3.5) and runs on Windows 7 Ultimate, 64-bit. It walks through all subfolders of a folder to do its job. However, it fails (it falls into an infinite loop until a StackOverflowException is thrown) if it is run against a folder containing a subfolder whose name is the single character #255.

To reproduce, you can do the following:

  1. Run Windows Explorer and create a C:\Temp folder.
  2. In this folder, create a new folder and rename it with Alt-255 (using the numeric keypad).
  3. Create subfolders "first" and "second" in it.
  4. Create subfolders "1" and "2" under Temp.

So you now have:

  • C:\Temp\1
  • C:\Temp\2
  • C:\Temp\ \first
  • C:\Temp\ \second

For a C:\Temp folder containing a subfolder whose name is one (or more) #255 characters, the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

class Program
{
  public static string[] GetDirectories(string pathToTraverse)
  {
    List<string> result = new List<string>();

    foreach (DirectoryInfo subFolder in new DirectoryInfo(pathToTraverse).GetDirectories())
    {
      result.Add(subFolder.FullName);
    }
    return result.ToArray();
  }

  public static void TraverseFolders(string folderToTraverse)
  {
    foreach (string subFolder in GetDirectories(folderToTraverse))
    {
      Console.WriteLine(subFolder);

      TraverseFolders(subFolder);
    }
  }

  static void Main(string[] args)
  {
    TraverseFolders(@"C:\Temp");
  }
}

never terminates and produces output like:

C:\Temp\ 
C:\Temp\1
C:\Temp\2
C:\Temp\ 
C:\Temp\1
C:\Temp\2
C:\Temp\ 
C:\Temp\1
C:\Temp\2
C:\Temp\ 

So how do I correctly enumerate the subfolders of such a folder?

Answer 1:

The following program runs perfectly and does not result in a stack overflow error.

using System;
using System.Text;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string pathToTraverse = @"C:\Desktop";
            foreach (DirectoryInfo subFolder in new DirectoryInfo(pathToTraverse).GetDirectories())
            {
                System.Console.WriteLine(subFolder);
            }
        }
    }
}

It produces the following output:

chaff
Python
__history
 
ÿ

The penultimate apparently blank line is in fact the directory named Alt+255.

Consequently I believe that your problem is not related to the code you have shown and is in fact elsewhere in some code that you have not presented to us.

I'm running on Windows 7 with VS 2010 Express, targeting .NET 3.5.


Now that your update shows all your code, I can see what is happening. The .NET code is presumably trimming trailing white space from the path, so a folder whose name consists only of white space gets lost.

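So @"C:\Temp\ " is trimmed to @"C:\Temp\", and the recursive call enumerates C:\Temp all over again, which is why the same lines keep repeating in your output.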

I found the following trivial modification avoided the infinite loop:

TraverseFolders(subFolder+@"\");

Adding a trailing path separator stops the trimming that appears to occur in the call to DirectoryInfo. In the example above this means that @"C:\Temp\ \" is passed to DirectoryInfo which yields the expected results.

I guess you should probably use a routine that only adds a trailing path separator if one is not already present. And you may want to avoid hardcoding the @"\" as path separator, but that's for you to work out now that you know what the underlying cause of your problem is.
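
A minimal sketch of such a routine follows. The names PathUtil and EnsureTrailingSeparator are made up for illustration; the only framework members relied on are Path.DirectorySeparatorChar and Path.AltDirectorySeparatorChar. It assumes, as observed above, that a trailing separator is enough to stop the trimming, and it has not been tested on every Windows or .NET version.

```csharp
using System;
using System.IO;

static class PathUtil
{
    // Hypothetical helper: appends a directory separator only when the
    // path does not already end with one.
    public static string EnsureTrailingSeparator(string path)
    {
        if (string.IsNullOrEmpty(path))
        {
            return path;
        }

        char last = path[path.Length - 1];
        if (last == Path.DirectorySeparatorChar || last == Path.AltDirectorySeparatorChar)
        {
            return path;
        }

        // Use the platform separator instead of a hardcoded @"\".
        return path + Path.DirectorySeparatorChar;
    }

    // Same traversal as in the question, except that every path handed to
    // DirectoryInfo ends with a separator.
    public static void TraverseFolders(string folderToTraverse)
    {
        string safePath = EnsureTrailingSeparator(folderToTraverse);
        foreach (DirectoryInfo subFolder in new DirectoryInfo(safePath).GetDirectories())
        {
            Console.WriteLine(subFolder.FullName);
            TraverseFolders(subFolder.FullName);
        }
    }
}
```

Calling PathUtil.TraverseFolders(@"C:\Temp") from Main replaces the original TraverseFolders call.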



Answer 2:

ASCII character 255 is not fully supported by Windows. For display purposes it translates this character into a "_" character.

The reason? The ASCII 255 character is displayed as an invisible character but takes up one character's width, so it is easily confused with the ASCII 32 SPACE character. This character only works on Windows 98 and lower versions, including all DOS versions (if I'm not mistaken).

EDIT: Windows 7 now has a fix for some extended characters, so the code should work fine when running on that OS.

The solution? Don't use this character in file and folder names, because your program fails on it.

or

  1. Let the program check for extended characters and skip such folders if they exist, before it loops indefinitely.

  2. Allow the program to process folders with extended characters, but in case it loops continuously, add code that skips that folder and moves on to the next folder item.

  3. Your code should work with Vista and Windows 7; make that a requirement of your program.
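
Option 1 could be sketched roughly as below. The names NameFilter and HasExtendedCharacters are hypothetical, and "extended" is assumed here to mean any character with a code above 127. Note the trade-off: this also skips perfectly valid folders with accented or non-Latin names, so it is a workaround rather than a fix.

```csharp
using System;
using System.IO;

static class NameFilter
{
    // Hypothetical check: true if the name contains any character
    // outside the 7-bit ASCII range (code above 127).
    public static bool HasExtendedCharacters(string name)
    {
        foreach (char c in name)
        {
            if (c > 127)
            {
                return true;
            }
        }
        return false;
    }

    // Traversal that skips subfolders whose names contain extended characters.
    public static void TraverseFolders(string folderToTraverse)
    {
        foreach (DirectoryInfo subFolder in new DirectoryInfo(folderToTraverse).GetDirectories())
        {
            if (HasExtendedCharacters(subFolder.Name))
            {
                continue; // skip this folder and move on to the next item
            }

            Console.WriteLine(subFolder.FullName);
            TraverseFolders(subFolder.FullName);
        }
    }
}
```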



Answer 3:

Why on earth do you have a folder called "_" anyway? It is not descriptive at all. The idea of a folder is that it contains all related files, possibly with subfolders to group them further, so folder names should generally be descriptive. For example, many sites have a folder called "css" or "stylesheets"; I doubt I need to explain what they're for, as they're pretty self-explanatory. I personally cannot think of a single situation where I would use a single symbol as a folder name. It is best, in my opinion, to stick to alphanumeric characters and seldom use symbols (it is safer, as you never hit these situations on any operating system).