Question:
I have a directory with around 15-30 thousand files. I need to pull just the oldest one, in other words the one that was created first. Is there a quick way to do this using C#, other than loading them into a collection and then sorting?
Answer 1:
The short answer is no. Windows file systems don't index files by date, so there is no native way to do this, let alone a .NET way, without enumerating all of them.
Answer 2:
You will have to load the FileInfo objects into a collection and sort, but it's a one-liner:
    // GetFiles() returns files only, so subdirectories are not considered.
    FileInfo fileInfo = new DirectoryInfo(directoryPath).GetFiles()
        .OrderBy(fi => fi.CreationTime).First();
Ok, two lines because it's a long statement.
Answer 3:
If you control the directory (that is, if your programs are responsible for creating and maintaining all files in that directory), then you should consider tracking the metadata about each file separately, perhaps in a database.
In fact, the FILESTREAM feature in SQL Server 2008 can help with this. You can create a table that contains columns for filename, create date, and modify date, plus a FILESTREAM column (a varbinary(max) column with the FILESTREAM attribute) for the content. You can find things like the oldest file by using indexes on the metadata columns, and you can retrieve the content through the FILESTREAM column.
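The same idea can be sketched without a database. The following is a minimal, hypothetical illustration, assuming the program that creates the files also records each one in an index as it goes. `FileIndex`, `Add`, `Remove` and `Oldest` are names made up for this example; they are not part of .NET or SQL Server. Entries are kept ordered by creation time, so the oldest one is available without scanning the directory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the metadata table described above.
public sealed class FileIndex
{
    // Entries are ordered by creation time first, then by name to break ties.
    private readonly SortedSet<(DateTime CreatedUtc, string Name)> entries =
        new SortedSet<(DateTime CreatedUtc, string Name)>();

    // Call when the program creates a file.
    public void Add(string name, DateTime createdUtc)
    {
        entries.Add((createdUtc, name));
    }

    // Call when the program deletes a file; returns false if it was not tracked.
    public bool Remove(string name, DateTime createdUtc)
    {
        return entries.Remove((createdUtc, name));
    }

    // Name of the oldest tracked file, or null when the index is empty.
    public string Oldest
    {
        get { return entries.Count == 0 ? null : entries.Min.Name; }
    }
}
```

A real version would need to persist the index and keep it in sync with the directory, which is exactly what the database approach gives you.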
Answer 4:
You can't do it without sorting, but what you can do is make it fast.
Sorting by CreationTime can be slow because the first access to this property for each file queries the file system.
Use A Faster Directory Enumerator (the FastDirectoryEnumerator class used below), which preserves more information about files while enumerating and so makes sorting faster.
Code to compare performance:
    // Needs: using System; using System.Diagnostics; using System.IO; using System.Linq;
    static void Main(string[] args)
    {
        // Time the FastDirectoryEnumerator version.
        var timer = Stopwatch.StartNew();
        var oldestFile = FastDirectoryEnumerator.EnumerateFiles(@"c:\windows\system32")
            .OrderBy(f => f.CreationTime).First();
        timer.Stop();
        Console.WriteLine(oldestFile);
        Console.WriteLine("FastDirectoryEnumerator - {0}ms", timer.ElapsedMilliseconds);
        Console.WriteLine();

        // Time the plain DirectoryInfo version on the same directory.
        timer.Reset();
        timer.Start();
        var oldestFile2 = new DirectoryInfo(@"c:\windows\system32").GetFiles()
            .OrderBy(f => f.CreationTime).First();
        timer.Stop();
        Console.WriteLine(oldestFile2);
        Console.WriteLine("DirectoryInfo - {0}ms", timer.ElapsedMilliseconds);

        Console.WriteLine("Press ENTER to finish");
        Console.ReadLine();
    }
For me it gives this:
VEN2232.OLB
FastDirectoryEnumerator - 27ms
VEN2232.OLB
DirectoryInfo - 559ms
Answer 5:
Edit: Removed the sort and made it a function.
    // Needs: using System; using System.IO; using System.Linq;
    public static FileInfo GetOldestFile(string directory)
    {
        if (!Directory.Exists(directory))
            throw new ArgumentException();

        DirectoryInfo parent = new DirectoryInfo(directory);
        FileInfo[] children = parent.GetFiles();
        if (children.Length == 0)
            return null;

        // Linear scan: keep the file with the earliest creation time seen so far.
        FileInfo oldest = children[0];
        foreach (var child in children.Skip(1))
        {
            if (child.CreationTime < oldest.CreationTime)
                oldest = child;
        }
        return oldest;
    }
Answer 6:
Sorting is O(n log n). Instead, why don't you just enumerate the directory? I'm not sure what the C# equivalent of FindFirstFile()/FindNextFile() is, but what you want to do is:
- Keep the current lowest date and filename in a local variable.
- Enumerate the directory.
- If the date on a given file is earlier than the one in the local variable, set the local variable to the new date and filename.
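The steps above might look like this in C#. This is a minimal sketch, assuming .NET 4 or later, where `DirectoryInfo.EnumerateFiles` streams entries one at a time and so plays roughly the role of FindFirstFile()/FindNextFile(). The names `OldestFileFinder`, `Oldest` and `FindOldestFile` are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class OldestFileFinder
{
    // Single pass: remember the item with the earliest timestamp seen so far.
    // Returns default(T) (null for reference types) when the sequence is empty.
    public static T Oldest<T>(IEnumerable<T> items, Func<T, DateTime> timestamp)
    {
        T oldest = default(T);
        DateTime oldestTime = DateTime.MaxValue;
        bool found = false;

        foreach (T item in items)
        {
            DateTime time = timestamp(item);
            if (!found || time < oldestTime)
            {
                oldest = item;
                oldestTime = time;
                found = true;
            }
        }
        return oldest;
    }

    // Applies the same scan to the files directly inside one directory
    // (subdirectories are not searched).
    public static FileInfo FindOldestFile(string directory)
    {
        return Oldest(new DirectoryInfo(directory).EnumerateFiles(),
                      file => file.CreationTimeUtc);
    }
}
```

Calling `OldestFileFinder.FindOldestFile(path)` with the directory in question would return the `FileInfo` with the earliest creation time, or `null` for an empty directory. The scan is O(n) and does not need the whole listing in memory at once. On .NET 6 and later, the built-in `Enumerable.MinBy` offers a similar single-pass selection.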
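Answer 7:
Oddly enough, this worked perfectly on a directory of mine with 3000+ jpg files, although GetFiles does not guarantee any particular order, so it relies on the order the file system happens to return: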
DirectoryInfo di = new DirectoryInfo(dpath);
FileInfo[] rgFiles = di.GetFiles("*.jpg");
FileInfo firstfile = rgFiles[0];
FileInfo lastfile = rgFiles[rgFiles.Length - 1];
DateTime oldestfiletime = firstfile.CreationTime;
DateTime newestfiletime = lastfile.CreationTime;
Answer 8:
Here's a C# routine that may do what you want by spawning a cmd shell to execute dir /o:D on the specified directory and returning the name of the first file found.
    // Needs: using System; using System.Diagnostics;
    static string GetOldestFile(string dirName)
    {
        ProcessStartInfo si = new ProcessStartInfo("cmd.exe");
        si.RedirectStandardInput = true;
        si.RedirectStandardOutput = true;
        si.UseShellExecute = false;
        Process p = Process.Start(si);
        // Quote the path so directories containing spaces work; /t:c makes
        // dir sort and display by creation time instead of last-write time.
        p.StandardInput.WriteLine("dir \"" + dirName + "\" /o:D /t:c");
        p.StandardInput.WriteLine(@"exit");
        string output = p.StandardOutput.ReadToEnd();
        p.WaitForExit();
        string[] splitters = { Environment.NewLine };
        string[] lines = output.Split(splitters, StringSplitOptions.RemoveEmptyEntries);
        // find first line with a valid date that does not have a <DIR> in it
        DateTime result;
        int i = 0;
        while (i < lines.Length)
        {
            string[] tokens = lines[i].Split(' ');
            if (DateTime.TryParse(tokens[0], out result))
            {
                if (!lines[i].Contains("<DIR>"))
                {
                    // Returns the last space-separated token, so a file name
                    // that itself contains spaces is truncated.
                    return tokens[tokens.Length - 1];
                }
            }
            i++;
        }
        return "";
    }
Answer 9:
Would it not be easier to shell out to a hidden process, redirect its output stream, and use dir /o:d, which sorts by date/time with the oldest first? (Prefixing the sort key with a dash, as in /o:-d, reverses the order.)
Edit: here's some sample code to do this, quick and dirty:
    // Needs: using System; using System.Text;
    public class TestDir
    {
        private StringBuilder sbRedirectedOutput = new StringBuilder();
        // path_name was not defined in the original snippet; passing it in
        // through the constructor is an assumption made here.
        private readonly string path_name;

        public TestDir(string pathName)
        {
            this.path_name = pathName;
        }

        public string OutputData
        {
            get { return this.sbRedirectedOutput.ToString(); }
        }

        public void Run()
        {
            System.Diagnostics.ProcessStartInfo ps = new System.Diagnostics.ProcessStartInfo();
            ps.FileName = "cmd";
            ps.ErrorDialog = false;
            // /c runs the command and exits. /o:d lists oldest first, /t:c uses
            // the creation time, and /a:-d leaves directories out of the listing.
            ps.Arguments = string.Format("/c dir \"{0}\" /o:d /t:c /a:-d", path_name);
            ps.CreateNoWindow = true;
            ps.UseShellExecute = false;
            ps.RedirectStandardOutput = true;
            ps.WindowStyle = System.Diagnostics.ProcessWindowStyle.Hidden;
            using (System.Diagnostics.Process proc = new System.Diagnostics.Process())
            {
                proc.StartInfo = ps;
                proc.EnableRaisingEvents = true; // required for the Exited event to fire
                proc.Exited += new EventHandler(proc_Exited);
                proc.OutputDataReceived += new System.Diagnostics.DataReceivedEventHandler(proc_OutputDataReceived);
                proc.Start();
                // Start reading before waiting, otherwise no output is captured.
                proc.BeginOutputReadLine();
                proc.WaitForExit();
            }
        }

        void proc_Exited(object sender, EventArgs e)
        {
            System.Diagnostics.Debug.WriteLine("proc_Exited: Process Ended");
        }

        void proc_OutputDataReceived(object sender, System.Diagnostics.DataReceivedEventArgs e)
        {
            if (e.Data != null)
                this.sbRedirectedOutput.Append(e.Data + Environment.NewLine);
            //System.Diagnostics.Debug.WriteLine("proc_OutputDataReceived: Data: " + e.Data);
        }
    }
The first 4 or 5 lines of the StringBuilder object sbRedirectedOutput are header lines and can be chopped off; the line after them contains the oldest file name and is quite easy to parse out.