I need to set up an application that watches for files being created in a directory, either locally or on a network drive.
Would the FileSystemWatcher or polling on a timer be the best option? I have used both methods in the past, but not extensively.
What issues (performance, reliability, etc.) are there with either method?
Also note that FileSystemWatcher is not reliable on file shares, particularly if the share is hosted on a non-Windows server. FSW should not be used for anything critical, or it should be paired with an occasional poll to verify that it hasn't missed anything.
The biggest problem I have had is missing files when the buffer gets full. Easy as pie to fix: just increase the buffer. Remember that it holds the file names and events, so size it for the number of files you expect (by trial and error). It does use memory that cannot be paged out, so it could force other processes to page if memory gets low.
Here is the MSDN article on the buffer: FileSystemWatcher.InternalBufferSize Property
Per MSDN:
We use 16MB due to a large batch expected at one time. Works fine and never misses a file.
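For a rough starting point before the trial and error, the buffer can be estimated from the expected batch size. The sketch below uses Python purely for the arithmetic. It assumes the Win32 change-notification record layout (a 12-byte header plus the file name in UTF-16, padded to a 4-byte boundary); `estimate_buffer_bytes` and its parameters are hypothetical names, not part of any API.

```python
def estimate_buffer_bytes(event_count, avg_name_chars):
    """Rough buffer size for `event_count` queued change notifications.

    Assumption: each record is a 12-byte header followed by the file name
    in UTF-16 (2 bytes per character), padded to a multiple of 4 bytes.
    """
    record = 12 + 2 * avg_name_chars
    record = (record + 3) // 4 * 4  # round up to a 4-byte boundary
    return event_count * record


# Example: 1,000 pending events with names averaging 40 characters.
print(estimate_buffer_bytes(1000, 40))
```

Keep in mind that a single file can raise more than one event (for example a create followed by changes), so treat the result as a lower bound and still verify under real load.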
We also read all the files before beginning to process even one: get the file names safely cached away (in our case, into a database table), then process them.
For file locking issues, I spawn a process which waits for the file to be unlocked, retrying after one second, then two, then four, and so on. We never poll. This has been in production without error for about two years.
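The doubling wait described above might look roughly like this. It is a minimal Python sketch of the idea, not the code we actually run; `open_when_unlocked` and its parameters are hypothetical names, and it assumes that opening a locked file raises `OSError` (as it does on Windows, where a sharing violation surfaces as `PermissionError`).

```python
import time


def open_when_unlocked(path, first_delay=1.0, max_attempts=6, sleep=time.sleep):
    """Try to open `path` for reading, doubling the wait after each failure.

    Assumption: a file still held by its writer raises OSError on open.
    The last failure is re-raised so the caller can log or give up.
    """
    delay = first_delay
    for attempt in range(max_attempts):
        try:
            return open(path, "rb")
        except OSError:
            if attempt == max_attempts - 1:
                raise
            sleep(delay)  # waits 1 s, then 2 s, then 4 s, ...
            delay *= 2
```

Capping the number of attempts keeps a file that never unlocks from tying up a worker forever; the `sleep` parameter is only there so the wait can be replaced in tests.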
Returning from the event method as quickly as possible, using another thread, solved the problem for me:
I have seen the file system watcher fail in production and test environments. I now consider it a convenience, but I do not consider it reliable. My pattern has been to watch for changes with the file system watcher, but poll occasionally to catch any changes it missed.
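The polling half of that pattern can be as small as a directory listing compared against what has already been handled. A minimal sketch in Python, with `poll_new_files` and the `seen` set as hypothetical names chosen for illustration:

```python
import os


def poll_new_files(directory, seen):
    """Return file names in `directory` that are not yet in `seen`.

    `seen` is a set shared with the watcher's event handler, so a file
    reported by both the watcher and the poll is only handled once.
    """
    with os.scandir(directory) as entries:
        current = {entry.name for entry in entries if entry.is_file()}
    new = sorted(current - seen)
    seen.update(new)
    return new
```

Run it on a slow timer (minutes rather than seconds) so that it only acts as a safety net. If the watcher's handler and the timer run on different threads, access to `seen` would need a lock.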
Edit: If you have a UI, you can also give your user the ability to "refresh" for changes instead of polling. I would combine this with a file system watcher.
Using both FSW and polling is a waste of time and resources, in my opinion, and I am surprised that experienced developers suggest it. If you need to use polling to check for any "FSW misses", then you can, naturally, discard FSW altogether and use only polling.
I am currently trying to decide whether to use FSW or polling for a project I am developing. Reading the answers, it is obvious that there are cases where FSW covers the needs perfectly, while at other times you need polling. Unfortunately, no answer has actually dealt with the performance difference (if there is any), only with the "reliability" issues. Can anyone answer that part of the question?
EDIT: nmclean's point about the validity of using both FSW and polling (you can read the discussion in the comments, if you are interested) appears to be a very rational explanation of why there can be situations where using both is efficient. Thank you for shedding light on that for me (and anyone else with the same opinion), nmclean.
The FileSystemWatcher may also miss changes during busy times, if the number of queued changes overflows the buffer provided. This is not a limitation of the .NET class per se, but of the underlying Win32 infrastructure. In our experience, the best way to minimize this problem is to dequeue the notifications as quickly as possible and deal with them on another thread.

As mentioned by @ChillTemp above, the watcher may not work on non-Windows shares. For example, it will not work at all on mounted Novell drives.
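The "dequeue quickly, process elsewhere" advice comes down to an event handler that does nothing but push the path onto a queue. A minimal Python sketch of that hand-off follows; `start_worker` and `on_created` are hypothetical names, and in a real application `on_created` would be whatever callback your watcher invokes.

```python
import queue
import threading


def start_worker(process):
    """Start a daemon thread that calls `process(path)` for each queued path."""
    pending = queue.Queue()

    def run():
        while True:
            path = pending.get()
            try:
                if path is None:  # sentinel: stop the worker
                    return
                process(path)
            except Exception as exc:
                # Keep the worker alive if one file fails.
                print(f"failed to process {path}: {exc}")
            finally:
                pending.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return pending, thread


def on_created(pending, path):
    """Event handler: only enqueue, so the notification buffer drains fast."""
    pending.put(path)
```

Because the handler returns immediately, slow work such as parsing or copying the file no longer delays the next notification.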
I agree that a good compromise is to do an occasional poll to pick up any missed changes.