I'm curious what exactly the behavior is on the following:
FileInfo info = new FileInfo("C:/testfile.txt.gz");
string ext = info.Extension;
Will this return ".txt.gz" or ".gz"?
What is the behavior with even more extensions, such as ".txt.gz.zip" or something like that?
EDIT:
To be clear, I've already tested this. I would like an explanation of the property.
It will return .gz, but the explanation from MSDN (FileSystemInfo.Extension Property) isn't clear why:
"The Extension property returns the FileSystemInfo extension, including the period (.). For example, for a file c:\NewFile.txt, this property returns ".txt"."
So I looked up the code of the Extension property with Reflector:
public string Extension
{
    get
    {
        int length = this.FullPath.Length;
        int startIndex = length;
        while (--startIndex >= 0)
        {
            char ch = this.FullPath[startIndex];
            if (ch == '.')
            {
                return this.FullPath.Substring(startIndex, length - startIndex);
            }
            if (((ch == Path.DirectorySeparatorChar) || (ch == Path.AltDirectorySeparatorChar)) || (ch == Path.VolumeSeparatorChar))
            {
                break;
            }
        }
        return string.Empty;
    }
}
It checks every character from the end of the file path until it finds a dot, then returns the substring from that dot to the end of the path. If it reaches a directory or volume separator first, it returns an empty string.
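To see the separator branch in action, here is a minimal standalone sketch of the same scan. The class name ExtensionScanSketch and the method LastDotExtension are hypothetical names made up for this illustration; the method takes the path as a parameter instead of reading FullPath, so it is not the framework's code, only a copy of the logic shown above.

```csharp
using System;
using System.IO;

public static class ExtensionScanSketch
{
    // Hypothetical helper: same backward scan as the decompiled getter,
    // but operating on a string argument instead of this.FullPath.
    public static string LastDotExtension(string fullPath)
    {
        for (int i = fullPath.Length - 1; i >= 0; i--)
        {
            char ch = fullPath[i];
            if (ch == '.')
            {
                // First dot seen from the end: everything from here on is the extension.
                return fullPath.Substring(i);
            }
            if (ch == Path.DirectorySeparatorChar
                || ch == Path.AltDirectorySeparatorChar
                || ch == Path.VolumeSeparatorChar)
            {
                // Reached the end of the file name without finding a dot.
                break;
            }
        }
        return string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(LastDotExtension("C:/testfile.txt.gz"));          // .gz
        Console.WriteLine(LastDotExtension("C:/my.folder/readme") == "");   // True
    }
}
```

The second call shows why the separator check matters: without it, the dot in the directory name my.folder would be mistaken for the start of an extension.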
[TestCase(@"C:/testfile.txt.gz", ".gz")]
[TestCase(@"C:/testfile.txt.gz.zip", ".zip")]
[TestCase(@"C:/testfile.txt.gz.SO.jpg", ".jpg")]
public void TestName(string fileName, string expected)
{
    FileInfo info = new FileInfo(fileName);
    string actual = info.Extension;
    // NUnit's Assert.AreEqual takes (expected, actual), in that order.
    Assert.AreEqual(expected, actual);
}
All three cases pass.
It returns the extension from the last dot, because it can't guess whether another part of the filename is part of the extension. In the case of testfile.txt.gz, you could argue that the extension is .txt.gz, but what about System.Data.dll? Should the extension be .Data.dll? Probably not... There's no way to guess, so the Extension property doesn't try to.
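To make the ambiguity concrete, here is a small sketch that lists every suffix that could be read as an extension. CandidateExtensionsSketch and CandidateExtensions are hypothetical names for this illustration only; nothing like this exists in the framework, which only ever reports the first entry (the part after the last dot).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CandidateExtensionsSketch
{
    // Hypothetical helper: returns every dot-suffix of the file name,
    // shortest first. Index 0 is skipped so a leading dot does not turn
    // the whole name into an "extension".
    public static List<string> CandidateExtensions(string path)
    {
        string fileName = Path.GetFileName(path);
        var candidates = new List<string>();
        for (int i = fileName.Length - 1; i > 0; i--)
        {
            if (fileName[i] == '.')
            {
                candidates.Add(fileName.Substring(i));
            }
        }
        return candidates;
    }

    public static void Main()
    {
        // Prints: .gz, .txt.gz
        Console.WriteLine(string.Join(", ", CandidateExtensions("C:/testfile.txt.gz")));
        // Prints: .dll, .Data.dll
        Console.WriteLine(string.Join(", ", CandidateExtensions("System.Data.dll")));
    }
}
```

Both names produce two candidates of exactly the same shape, and only a human knows that .txt.gz is meaningful while .Data.dll is not. If you need compound extensions, you would have to pick from such a list yourself, for example against a set of known suffixes.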
The file extension starts at the last dot. Unfortunately, the documentation for FileSystemInfo.Extension doesn't say so explicitly, but it logically must return the same value as Path.GetExtension, for which the documentation states:
Remarks
The extension of path is obtained by searching path for a period (.), starting with the last character in path and continuing toward the start of path. If a period is found before a DirectorySeparatorChar or AltDirectorySeparatorChar character, the returned string contains the period and the characters after it; otherwise, Empty is returned.
For a list of common I/O tasks, see Common I/O Tasks.
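A quick way to check that the two agree is to run both on the same paths. This is only a sketch: the class name ExtensionComparisonSketch and the helper BothAgree are made up for the example, and the sample paths are the ones from the question. The files do not need to exist, since neither call touches the disk.

```csharp
using System;
using System.IO;

public static class ExtensionComparisonSketch
{
    // Hypothetical helper: true when FileInfo.Extension and
    // Path.GetExtension report the same extension for the path.
    public static bool BothAgree(string path)
    {
        string fromFileInfo = new FileInfo(path).Extension;
        string fromPath = Path.GetExtension(path);
        return fromFileInfo == fromPath;
    }

    public static void Main()
    {
        string[] samples =
        {
            "C:/testfile.txt.gz",
            "C:/testfile.txt.gz.zip",
            "C:/testfile.txt.gz.SO.jpg",
        };

        foreach (string sample in samples)
        {
            // Each line shows the path, the extension, and whether both APIs agree.
            Console.WriteLine($"{sample} -> {Path.GetExtension(sample)} (agree: {BothAgree(sample)})");
        }
    }
}
```

For these inputs both calls return the part after the last dot (.gz, .zip and .jpg respectively).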
It would be nice if there were an authoritative answer on file names in general, but I'm having trouble finding one.