I'm writing a program to benchmark how fast it is to find and iterate over all the files in a folder that contains a large number of files. The slowest part of the process is creating the 1 million plus files. I'm using a pretty naive method to create the files at the moment:
    Console.Write("Creating {0:N0} file(s) of size {1:N0} bytes... ",
        options.FileCount, options.FileSize);
    var createTimer = Stopwatch.StartNew();
    var fileNames = new List<string>();
    for (long i = 0; i < options.FileCount; i++)
    {
        var filename = Path.Combine(options.Directory.FullName,
            CreateFilename(i, options.FileCount));
        using (var file = new FileStream(filename, FileMode.CreateNew,
            FileAccess.Write, FileShare.None, 4096,
            FileOptions.WriteThrough))
        {
            // I have an option to write some data to files, but it's not being used.
            // That's why there's a using here.
        }
        fileNames.Add(filename);
    }
    Console.WriteLine("Done.");

    // Other code appears here.....

    Console.WriteLine("Time to CreateFiles: {0:N3}sec ({1:N2} files/sec, 1 in {2:N4}ms)"
        , createTimer.Elapsed.TotalSeconds
        , (double)total / createTimer.Elapsed.TotalSeconds
        , createTimer.Elapsed.TotalMilliseconds / (double)options.FileCount);
Creating 1,000,000 file(s) of size 0 bytes... Done.
Time to CreateFiles: 9,182.283sec (1,089.05 files/sec, 1 in 9.1823ms)
Is there anything obviously better than this? I'm looking to test with several orders of magnitude more than 1 million files, and it takes a day to create the files!
I haven't tried any sort of parallelism, optimising any file system options, or changing the order of file creation.
For completeness, here's the content of CreateFilename():
    public static string CreateFilename(long i, long totalFiles)
    {
        if (totalFiles < 0)
            throw new ArgumentOutOfRangeException("totalFiles",
                totalFiles, "totalFiles must be positive");

        // This tries to keep filenames to the 8.3 format as much as possible.
        if (totalFiles < 99999999)
        {
            // No extension.
            return String.Format("{0:00000000}", i);
        }
        else if (totalFiles >= 100000000 && totalFiles < 9999999999)
        {
            // Extend numbers into extension.
            long rem = 0;
            long div = Math.DivRem(i, 1000, out rem);
            return String.Format("{0:00000000}", div) + "." +
                String.Format("{0:000}", rem);
        }

        // Doesn't fit in 8.3, so just tostring the long.
        return i.ToString();
    }
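To illustrate the names this produces, here is a small self-contained sketch. The class name FilenameDemo and the sample inputs are only for illustration and are not part of my benchmark; the method body is a copy of CreateFilename() so the sketch compiles on its own.

```csharp
using System;

static class FilenameDemo
{
    // Copy of CreateFilename() so this sketch is self-contained.
    public static string CreateFilename(long i, long totalFiles)
    {
        if (totalFiles < 0)
            throw new ArgumentOutOfRangeException("totalFiles",
                totalFiles, "totalFiles must be positive");

        if (totalFiles < 99999999)
        {
            // Fits in the 8-character name: zero-padded, no extension.
            return String.Format("{0:00000000}", i);
        }
        else if (totalFiles >= 100000000 && totalFiles < 9999999999)
        {
            // The last three digits spill into the extension.
            long rem = 0;
            long div = Math.DivRem(i, 1000, out rem);
            return String.Format("{0:00000000}", div) + "." +
                String.Format("{0:000}", rem);
        }

        // Too many files for 8.3: plain decimal string.
        return i.ToString();
    }

    public static void Main()
    {
        // Illustrative inputs (assumed values, not taken from a real run).
        Console.WriteLine(CreateFilename(42, 1000000));          // 00000042
        Console.WriteLine(CreateFilename(123456789, 200000000)); // 00123456.789
        Console.WriteLine(CreateFilename(5, 10000000000));       // 5
    }
}
```

So for the 1,000,000-file run every name is eight digits with no extension.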
I tried to parallelise as per StriplingWarrior's suggestion using Parallel.For(). Results: about 30 threads thrashing my disk and a net slowdown!
    var fileNames = new ConcurrentBag<string>();
    var opts = new ParallelOptions();
    opts.MaxDegreeOfParallelism = 1; // 1 thread turns out to be fastest.
    Parallel.For(0L, options.FileCount, opts,
        // Per-thread local state: each worker collects its own filenames.
        () => new { Files = new List<string>() },
        (i, parState, state) =>
        {
            var filename = Path.Combine(options.Directory.FullName,
                CreateFilename(i, options.FileCount));
            using (var file = new FileStream(filename, FileMode.CreateNew
                , FileAccess.Write, FileShare.None
                , 4096, FileOptions.WriteThrough))
            {
            }
            state.Files.Add(filename);
            return state;
        },
        // Merge each worker's local list into the shared bag.
        state =>
        {
            foreach (var f in state.Files)
            {
                fileNames.Add(f);
            }
        });
I found that changing the FileOptions passed to the FileStream from WriteThrough to None improved perf by ~50%. It seems I was turning off any write cache.

    new FileStream(filename, FileMode.CreateNew,
        FileAccess.Write, FileShare.None,
        4096, FileOptions.None)
Creating 10,000 file(s) of size 0 bytes... Done.
Time to CreateFiles: 12.390sec (8,071.05 files/sec, 1 in 1.2390ms)
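For anyone who wants to reproduce the comparison, a minimal self-contained sketch might look like the following. It is not my actual benchmark: the class FileOptionsComparison, the helper TimeCreate, the scratch directory under the temp path, and the count of 1,000 files are all assumptions made so the sketch runs quickly on its own.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class FileOptionsComparison
{
    // Hypothetical helper: creates `count` empty files with the given
    // FileOptions in a fresh scratch directory and returns the elapsed time.
    public static TimeSpan TimeCreate(int count, FileOptions fileOptions)
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        try
        {
            var timer = Stopwatch.StartNew();
            for (int i = 0; i < count; i++)
            {
                string filename = Path.Combine(dir, i.ToString("00000000"));
                using (var file = new FileStream(filename, FileMode.CreateNew,
                    FileAccess.Write, FileShare.None, 4096, fileOptions))
                {
                    // Zero-byte file: nothing to write.
                }
            }
            timer.Stop();
            return timer.Elapsed;
        }
        finally
        {
            // Clean up the scratch directory (not included in the timing).
            Directory.Delete(dir, recursive: true);
        }
    }

    public static void Main()
    {
        const int count = 1000; // assumed small count so the sketch runs quickly
        foreach (var fileOptions in new[] { FileOptions.WriteThrough, FileOptions.None })
        {
            TimeSpan elapsed = TimeCreate(count, fileOptions);
            Console.WriteLine("{0}: {1:N3}sec ({2:N2} files/sec)",
                fileOptions, elapsed.TotalSeconds, count / elapsed.TotalSeconds);
        }
    }
}
```

The numbers will depend heavily on the disk, the file system and OS caching, so treat this only as a way to see the relative difference between the two options on a given machine.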
Other ideas still welcome.