Best practice for using Directory.GetFiles() or EnumerateFiles()

Posted 2019-08-05 04:29

Question:

Currently I am trying to improve the design of two Windows services (C#). Service A produces data exports (CSV files) and writes them to a temporary directory, which is a subdirectory of the main output directory. After a successful write, each file is moved (via File.Move) to the output directory. This export may be performed by multiple threads.

Another service, B, fetches the files from this output directory at a defined interval. How can I ensure that Directory.GetFiles() excludes files that are still locked?

  1. Should I check every file by trying to open a new FileStream (using (Stream stream = new FileStream("MyFilename.txt", FileMode.Open))), as described here? (See the sketch after this list.)

  2. Or should the producer service (A) use temporary file names (*.csv.tmp) that are automatically excluded by the consumer service (B) with an appropriate search pattern, and rename each file once the move has finished?

  3. Are there better ways to handle such file listing operations?
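For illustration, option 1 would look roughly like this (IsFileLocked, GetReadyFiles and outputDirectory are just helper names I made up for this sketch):

using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sketch of option 1: probe each file by trying to open it exclusively.
static bool IsFileLocked(string path)
{
    try
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
        {
            return false; // exclusive open succeeded, nobody else holds the file
        }
    }
    catch (IOException)
    {
        return true; // still being written (or moved) by someone else
    }
}

// Consumer side: list everything, then filter out the locked files.
static IEnumerable<string> GetReadyFiles(string outputDirectory) =>
    Directory.GetFiles(outputDirectory, "*.csv").Where(f => !IsFileLocked(f));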

Answer 1:

Don't bother checking!

Huh? How can that be?

If the files are on the same drive, a Move operation is atomic! The operation is effectively a rename: the directory entry is erased from the old directory and inserted into the new one, still pointing to the same sectors (or whatever) where the data really are, without rewriting anything. The file system's internal locking mechanism has to lock and block directory reads during this process, precisely to prevent a directory scan from returning corrupt results.

That means, by the time it ever shows up in a directory, it won't be locked; in fact, the file won't have been opened/modified since the close operation that wrote it to the previous directory.
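To make that concrete, here is a minimal producer/consumer sketch under that assumption (all paths and file names below are placeholders):

using System.IO;

// Producer (service A): write the export completely inside the temp subdirectory,
// then publish it with a single File.Move on the same volume.
string tempPath  = Path.Combine(@"C:\exports\output\tmp", "report.csv");
string finalPath = Path.Combine(@"C:\exports\output", "report.csv");

using (var writer = new StreamWriter(tempPath))
{
    writer.WriteLine("col1;col2");
    writer.WriteLine("a;b");
}   // the stream is closed here, before the file is ever visible to the consumer

File.Move(tempPath, finalPath);   // atomic directory-entry update on the same drive

// Consumer (service B): anything GetFiles returns is already complete.
foreach (var file in Directory.GetFiles(@"C:\exports\output", "*.csv"))
{
    // process the finished export
}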

Caveats:

  1. This definitely won't work between drives, partitions, or other media mounted as a subdirectory. The OS does a copy + delete behind the scenes instead of a directory-entry edit.

  2. This behaviour is a convention, not a rule. Though I've never seen it happen, file systems are free to break it, and even to break it inconsistently!

So this will probably work. If it doesn't, I'd recommend your own idea of temp extensions; I've used it before for exactly this purpose (between a client and a server that could only talk via a shared drive), and it's not hard and worked flawlessly.
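For what it's worth, the temp-extension variant from point 2 of the question is only a couple of lines (again, the names are placeholders):

using System.IO;

string outputDir = @"C:\exports\output";
string tempFile  = Path.Combine(outputDir, "report.csv.tmp");
string finalFile = Path.Combine(outputDir, "report.csv");

// Producer: write under a .tmp extension so the consumer never picks the file up,
// then rename in place once the file is complete.
File.WriteAllText(tempFile, "col1;col2\na;b");
File.Move(tempFile, finalFile);   // the finished file appears under its real name

// Consumer: the "*.csv" search pattern does the filtering for you.
var finished = Directory.GetFiles(outputDir, "*.csv");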

If your own idea is too low-tech and you're on the same machine (it sounds like you are), the writer process can hold a named mutex (google that), with the filename embedded in its name, for as long as the file is being written; the reader process then does a blocking wait on that mutex before it opens each file. If you want the second process to respond as soon as possible, combine this with a FileSystemWatcher. Then pat yourself on the back for spending ten times the effort of the temp-filename idea, with no extra gain >:-}
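A rough sketch of that mutex approach, assuming both services run on the same machine (the mutex naming scheme and the paths are my own invention):

using System.IO;
using System.Threading;

string mutexName = @"Global\export_report.csv";   // one mutex per file; placeholder name scheme

// Writer process: own the named mutex for as long as the file is being written.
using (var guard = new Mutex(initiallyOwned: true, name: mutexName))
{
    File.WriteAllText(@"C:\exports\output\report.csv", "col1;col2\na;b");
    guard.ReleaseMutex();
}

// Reader process: block on the mutex before touching the file.
if (Mutex.TryOpenExisting(mutexName, out Mutex writerMutex))
{
    using (writerMutex)
    {
        writerMutex.WaitOne();       // waits while the writer still owns it
        writerMutex.ReleaseMutex();
    }
}
// Either the mutex never existed or the writer released it: safe to read the file now.

// Optional: react immediately instead of polling on an interval.
var watcher = new FileSystemWatcher(@"C:\exports\output", "*.csv");
watcher.Created += (sender, e) => { /* check the mutex for e.Name, then read e.FullPath */ };
watcher.EnableRaisingEvents = true;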

good luck!



Answer 2:

One way would be to mark the files as temporary from the writing app while they're in use, and only clear the flag once they have been written and closed, e.g.:

// Mark the file as temporary while it is still being written.
FileStream f = File.Create(filename);
FileAttributes attr = File.GetAttributes(filename);
File.SetAttributes(filename, attr | FileAttributes.Temporary);

// ... write to file ...

f.Close();

// Clear the temporary flag once the file is complete.
File.SetAttributes(filename, attr);

From the consuming app, you just want to skip any temporary files.

foreach (var file in Directory.GetFiles(Path.GetDirectoryName(filename))) {
    // Skip anything the writer hasn't finished with yet.
    if ((File.GetAttributes(file) & FileAttributes.Temporary) != 0) continue;
    // do normal stuff.
}