I am trying to get an overview of the thread safety theory behind the collections in C#.
Why are there no concurrent collections as there are in Java? (java docs). Some collections appear thread safe, but it is not clear to me what the position is, for example with regard to:
- compound operations,
- safety of using iterators,
- write operations
I do not want to reinvent the wheel! (I am not a multi-threading guru and am definitely not underestimating how hard this would be anyway).
I hope the community can help.
.NET has had relatively "low level" concurrency support until now - but .NET 4.0 introduces the System.Collections.Concurrent namespace, which contains various collections which are safe and useful.
Andrew's answer is entirely correct in terms of how to deal with collections before .NET 4.0 of course - and for most uses I'd just lock appropriately when accessing a "normal" shared collection. The concurrent collections, however, make it easy to use a producer/consumer queue, etc.
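To illustrate the producer/consumer case, here is a minimal sketch using BlockingCollection<T> from System.Collections.Concurrent. The class name ProducerConsumerSketch, the method SumViaQueue and the capacity of 16 are made up for the example; it shows one possible shape rather than a recommended design.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    // Hypothetical helper: produces 1..count on one task and sums them on another.
    public static int SumViaQueue(int count)
    {
        // Bounded to 16 items (arbitrary) so the producer blocks if it gets too far ahead.
        using (var queue = new BlockingCollection<int>(boundedCapacity: 16))
        {
            Task producer = Task.Factory.StartNew(() =>
            {
                for (int i = 1; i <= count; i++)
                {
                    queue.Add(i);
                }
                // Signal the consumer that no more items will arrive.
                queue.CompleteAdding();
            });

            int sum = 0;
            Task consumer = Task.Factory.StartNew(() =>
            {
                // Blocks while the queue is empty; the loop ends after CompleteAdding.
                foreach (int item in queue.GetConsumingEnumerable())
                {
                    sum += item;
                }
            });

            // Only the consumer task writes sum, and it is read after both tasks finish.
            Task.WaitAll(producer, consumer);
            return sum;
        }
    }
}
```

No explicit lock appears in the calling code; the blocking and hand-off are handled inside the collection.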
C# offers several ways to work with collections across multiple threads. For a good write-up of these techniques I would recommend that you start with Collections and Synchronization (Thread Safety):
By default, Collections classes are generally not thread safe. Multiple readers can read the collection with confidence; however, any modification to the collection produces undefined results for all threads that access the collection, including the reader threads.
Collections classes can be made thread safe using any of the following methods:
- Create a thread-safe wrapper using the Synchronized method, and access the collection exclusively through that wrapper.
- If the class does not have a Synchronized method, derive from the class and implement a Synchronized method using the SyncRoot property.
- Use a locking mechanism, such as the lock statement in C# (SyncLock in Visual Basic), on the SyncRoot property when accessing the collection.
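The first and third options from the quoted list might look roughly like this for an ArrayList. This is a sketch only; the class name SyncRootSketch and the method Demo are invented for illustration. Note that enumeration is a multi-step operation, so the lock is held around the whole loop even though the wrapper synchronizes individual calls.

```csharp
using System;
using System.Collections;

public static class SyncRootSketch
{
    // Hypothetical demo method: adds two items and sums them under a lock.
    public static int Demo()
    {
        ArrayList plain = new ArrayList();

        // Option 1: a wrapper whose individual calls (Add, indexer, ...) are synchronized.
        ArrayList wrapped = ArrayList.Synchronized(plain);
        wrapped.Add(1);
        wrapped.Add(2);

        // Option 3: lock on SyncRoot around an operation that spans several calls.
        // Holding the lock for the whole foreach keeps other threads that go
        // through the wrapper from modifying the list mid-enumeration.
        int total = 0;
        lock (wrapped.SyncRoot)
        {
            foreach (int value in wrapped)
            {
                total += value;
            }
        }
        return total;
    }
}
```

This only protects the list if every thread accesses it through the wrapper or takes the same SyncRoot lock.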
As Jon Skeet mentioned, there are now "thread safe" collections in the System.Collections.Concurrent namespace in .NET 4.
One of the reasons (at least my guess) that no concurrent collections exist in prior .NET Framework versions is that it is very hard to guarantee thread safety, even with a concurrent collection.
(This is not entirely true as some collections offer a Synchronized method to return a thread safe collection from a non-thread safe collection so there are some thread safe collections...)
For example, assume one has a thread-safe Dictionary. If one only wants to do an insert when the key does not exist, one would first query the collection to see whether the key exists, and then do the insert if it does not. These two operations together are not thread safe, though: between the ContainsKey query and the Add operation, another thread could have inserted that key, so there is a race condition.
In other words, the individual operations of the collection are thread safe, but the usage of it is not necessarily. In this case one would need to fall back to traditional locking techniques (mutex/monitor/semaphore...) to achieve thread safety, so the concurrent collection has bought you nothing in terms of multi-threaded safety (and is probably worse for performance).
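As a rough sketch of that check-then-insert case, the first method below wraps ContainsKey and Add in a single lock so no other thread using the same lock can get in between them. The names AddIfAbsentSketch, AddIfAbsent and Gate are invented for the example. The second overload is a side note: for this one specific pattern, the .NET 4 ConcurrentDictionary exposes TryAdd, which does the check and the insert as a single call; other compound operations would still need external coordination.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class AddIfAbsentSketch
{
    // Hypothetical shared lock object; every thread touching the map must use it.
    private static readonly object Gate = new object();

    // Plain Dictionary: the check and the insert happen under one lock,
    // which closes the window between ContainsKey and Add.
    public static bool AddIfAbsent(Dictionary<string, int> map, string key, int value)
    {
        lock (Gate)
        {
            if (map.ContainsKey(key))
            {
                return false;
            }
            map.Add(key, value);
            return true;
        }
    }

    // ConcurrentDictionary: TryAdd returns false if the key is already present.
    public static bool AddIfAbsent(ConcurrentDictionary<string, int> map, string key, int value)
    {
        return map.TryAdd(key, value);
    }
}
```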