File paths with non-ascii characters and FileInfo

Posted 2019-07-15 16:25

Question:

I get a string that more or less looks like this:

"C:\\bláh\\bleh"

I make a FileInfo with it, but when I check for its existence it returns false:

var file = new FileInfo(path);
bool exists = file.Exists; // false

If I manually rename the path to

"C:\\blah\\bleh"

at debug time and ensure that blah exists with a bleh inside it, then file.Exists starts returning true. So I believe the problem is the non-ascii character.

The actual string is built by my program. One part comes from the AppDomain of the application, which is the part that contains the "á"; the other part comes, in a way, from the user. Both parts are put together by Path.Combine. I confirmed the validity of the resulting string in two ways. First, copying it from the error my program generates, which includes the path, into Explorer opens the file just fine. Second, looking at that string in the debugger, it looks correctly escaped, in that each \ is written as \\. The "á" is printed literally by the debugger.

How should I process a string so that even if it has non-ascii characters it turns out to be a valid path?

Answer 1:

Here is a method that will handle diacritics in filenames. The success of the File.Exists method depends on how your system stores the filename.

// Requires: using System.IO; using System.Text;
public bool FileExists(string sPath)
{
    // Check both the composed and the decomposed form to handle diacritics in file names.
    var pathComposed = sPath.Normalize(NormalizationForm.FormC);
    if (File.Exists(pathComposed))
        return true;

    // We really need to check both possibilities.
    var pathDecomposed = sPath.Normalize(NormalizationForm.FormD);
    return File.Exists(pathDecomposed);
}
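To see why both forms are worth checking, here is a small sketch that prints the UTF-16 code units of a string before and after normalization. The class and method names (NormalizationDemo, CodeUnits) are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Text;

public static class NormalizationDemo
{
    // Hypothetical helper: lists the UTF-16 code units of a string as "U+XXXX" tokens.
    public static string CodeUnits(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

Calling CodeUnits("\u00E1".Normalize(NormalizationForm.FormD)) should give the two units U+0061 U+0301 ("a" plus a combining acute accent), while CodeUnits("a\u0301".Normalize(NormalizationForm.FormC)) should give the single unit U+00E1. Both render as "á", but an ordinal comparison treats them as different strings, and so can a file system.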


Answer 2:

Try this:

    string sourceFile = @"C:\bláh\bleh";
    if (File.Exists(sourceFile))
    {
        Console.WriteLine("file exists.");
    }
    else
    {
        Console.WriteLine("file does not exist.");
    }

Note: The Exists method should not be used for path validation; it merely checks whether the file specified in path exists. Passing an invalid path to Exists returns false.

To check whether a directory exists, you can use Directory.Exists.



Answer 3:

I have just manually created a bláh folder containing a bleh file, and with that in place, this code prints True as expected:

using System;
using System.IO;

namespace ConsoleApplication72
{
    class Program
    {
        static void Main(string[] args)
        {
            string filename = "c:\\bláh\\bleh";

            FileInfo fi = new FileInfo(filename);

            Console.WriteLine(fi.Exists);

            Console.ReadLine();
        }
    }
}

I would suggest checking the source of your string - in particular, although your 3k rep speaks against this being the problem, remember that expressing a backslash as \\ is an artifact of C# syntax, and you want to make sure your string actually contains only single \s.
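One way to check the backslash point is to count the backslashes the string really contains. This is only a sketch; BackslashDemo and CountBackslashes are names invented here.

```csharp
using System;

public static class BackslashDemo
{
    // Regular literal: each "\\" in the source is a single backslash in the string.
    public const string Escaped = "c:\\bláh\\bleh";

    // Verbatim literal: backslashes are taken literally, so no doubling is needed.
    public const string Verbatim = @"c:\bláh\bleh";

    // Hypothetical helper: counts the backslash characters actually present in a string.
    public static int CountBackslashes(string s)
    {
        int count = 0;
        foreach (char c in s)
        {
            if (c == '\\') count++;
        }
        return count;
    }
}
```

Escaped and Verbatim are the same string, and each contains two backslashes. If a string built at run time reports four for a path like this one, it holds doubled backslashes, which is not what the literal syntax produces.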



Answer 4:

Referring to @adatapost's reply, the list of invalid file name characters (gleaned from System.IO.Path.GetInvalidFileNameChars()) in fact doesn't contain normal characters with diacritics.
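This is easy to verify with a quick check against that list. A minimal sketch follows; the class and method names are invented for illustration, and the result depends on the platform's own list of invalid characters.

```csharp
using System;
using System.IO;

public static class InvalidCharCheck
{
    // Hypothetical helper: true if the name contains any character that
    // Path.GetInvalidFileNameChars() reports for the current platform.
    public static bool HasInvalidFileNameChars(string name)
    {
        return name.IndexOfAny(Path.GetInvalidFileNameChars()) >= 0;
    }
}
```

HasInvalidFileNameChars("bláh") should return false, whereas a name containing the null character should return true.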

It looks like the question you're really asking is, "How do I remove diacritics from a string (or in this case, file path)?".

Or maybe you aren't asking this question, and you genuinely want to find a file with name:

c:\blòh\bleh

(or something similar). In that case, you then need to try to open a file with the same name, and not c:\bloh\bleh.



Answer 5:

It looks like the "bleh" in the path is a directory, not a file. To check whether the folder exists, use the Directory.Exists method.
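If you are unsure which one a path refers to, you can check both. A small sketch, with PathKind and Describe as made-up names:

```csharp
using System;
using System.IO;

public static class PathKind
{
    // Hypothetical helper: reports whether a path names a file, a directory, or neither.
    // Both Exists methods also return false when the caller lacks permission,
    // so "missing" here really means "not visible to this process".
    public static string Describe(string path)
    {
        if (File.Exists(path)) return "file";
        if (Directory.Exists(path)) return "directory";
        return "missing";
    }
}
```

For example, PathKind.Describe(@"C:\bláh\bleh") would tell you whether bleh is a file or a folder on your machine.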



Answer 6:

The problem was that the program didn't have enough permissions to access that file. Fixing the permissions fixed the problem. It seems that when I did my experiment I somehow managed to reproduce the permission problem, possibly by creating the folder without the non-ascii character by hand and copying the other one.

Oh... so embarrassing.
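For anyone who hits the same thing: File.Exists and FileInfo.Exists return false without throwing when the caller lacks permission to read the file, so a permission problem looks exactly like a missing file. Trying to open the file surfaces the real reason. A rough sketch, with ExistsDiagnostics and Probe as invented names:

```csharp
using System;
using System.IO;

public static class ExistsDiagnostics
{
    // Hypothetical helper: tries to open the file for reading and reports why it failed, if it did.
    public static string Probe(string path)
    {
        try
        {
            using (FileStream stream = File.OpenRead(path))
            {
                return "readable";
            }
        }
        catch (UnauthorizedAccessException)
        {
            // Raised for insufficient permissions, and also when the path is a directory.
            return "access denied";
        }
        catch (FileNotFoundException)
        {
            return "file not found";
        }
        catch (DirectoryNotFoundException)
        {
            return "directory not found";
        }
        catch (IOException ex)
        {
            return "I/O error: " + ex.Message;
        }
    }
}
```

In my case a call like ExistsDiagnostics.Probe(path) would presumably have reported "access denied" rather than "file not found", which would have pointed away from the "á" much sooner.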