multiple threads adding elements to one list. why

2019-02-06 10:12发布

问题:

The following code explains my question. I know the list is not thread safe. But what is the underlying "real" reason of this?

    class Program
{
    static void Main(string[] args)
    {
        List<string> strCol = new List<string>();

        for (int i = 0; i < 10; i++)
        {
            int id = i;
            Task.Factory.StartNew(() =>
            {
                AddElements(strCol);
            }).ContinueWith((t) => { WriteCount(strCol, id.ToString()); });
        }

        Console.ReadLine();
    }

    private static void WriteCount(List<string> strCol, string id)
    {
        Console.WriteLine(string.Format("Task {0} is done. Count: {1}. Thread ID: {2}", id, strCol.Count, Thread.CurrentThread.ManagedThreadId));
    }

    private static void AddElements(List<string> strCol)
    {
        for (int i = 0; i < 20000; i++)
        {
            strCol.Add(i.ToString());
        }
    }
}

回答1:

I will skip the obvious answer "List is not thread safe" - this you already know.

List items are kept in an internal array. There are at least two stages (from logical point of view) when adding an item to a List. First, List gets an index indicating where to put new item. It puts new item into array using this index. Then it increments the index and this is stage two. If second (or third, forth, ...) thread happens to add new item at the same time it may be that there will be two (3, 4, ...) new items put into the same array location before the index is incremented by the first thread. Items are overwritten and lost.

The internal operations of adding new item and incrementing index must be always done in one go for the list to be thread safe. That's what is called critical section. This can be achieved by locks.

Hope this explains a bit.



回答2:

This is because List<T> is not thread safe.

You should use a thread safe collection for this, such as one of the collections in System.Collections.Concurrent. Otherwise, you'll need to synchronize all access to the List<T> (ie: put every Add call within a lock), which will defeat the purpose of calling this using multiple threads entirely, as you're not doing any other work in this situation.