How do I know if this C# method is thread safe?

2019-01-16 06:20发布

问题:

I'm working on creating a call back function for an ASP.NET cache item removal event.

The documentation says I should call a method on an object or calls I know will exist (will be in scope), such as a static method, but it said I need to ensure the static is thread safe.

Part 1: What are some examples of things I could do to make it un-thread safe?

Part 2: Does this mean that if I have

static int addOne(int someNumber){
    int foo = someNumber;
    return foo +1; 
}

and I call Class.addOne(5); and Class.addOne(6); simutaneously, Might I get 6 or 7 returned depending on who which invocation sets foo first? (i.e. a race condition)

回答1:

That addOne function is indeed thread safe because it doesn't access any data that could be accessed by another thread. Local variables cannot be shared among threads because each thread gets its own stack. You do have to make sure, however, that the function parameters are value types and not reference types.

static void MyFunction(int x) { ... } // thread safe. The int is copied onto the local stack.

static void MyFunction(Object o) { ... } // Not thread safe. Since o is a reference type, it might be shared among multiple threads. 


回答2:

No, addOne is thread-safe here - it only uses local variables. Here's an example which wouldn't be thread-safe:

 class BadCounter
 {
       private static int counter;

       public static int Increment()
       {
             int temp = counter;
             temp++;
             counter = temp;
             return counter;
       }
 }

Here, two threads could both call Increment at the same time, and end up only incrementing once. (Using return ++counter; would be just as bad, by the way - the above is a more explicit version of the same thing. I expanded it so it would be more obviously wrong.)

The details of what is and isn't thread-safe can be quite tricky, but in general if you're not mutating any state (other than what was passed into you, anyway - bit of a grey area there) then it's usually okay.



回答3:

Threading issues (which I've also been worrying about lately) arise from the use of multiple processor cores with separate caches, as well as from basic thread-swapping race conditions. If the caches for separate cores access the same memory location, they will generally have no idea about the other one and may separately track the state of that data location without it going back out to main memory (or even to a synchronized cache shared across all cores at L2 or L3, for example), for processor performance reasons. So even order-of-execution interlock tricks may not be reliable in multi-threaded environments.

As you may know, the main tool to correct for this is a lock, which provides a mechanism for exclusive access (between contentions for the same lock) and handles the underlying cache synchronization so that accesses to the same memory location by various lock-protected code sections will be properly serialized. You can still have race conditions between who gets the lock when and in what order, but that's usually much simpler to deal with when you can guarantee that execution of a locked section is atomic (within the context of that lock).

You can get a lock on an instance of any reference type (eg. inherits from Object, not value types like int or enums, and not null), but it's very important to understand that the lock on an object has no inherent effect on accesses to that object, it only interacts with other attempts to get the lock on the same object. It is up to the class to protect access to its member variables using an appropriate locking scheme. Sometimes instances might protect multi-threaded accesses to their own members by locking on themselves (eg. lock (this) { ... } ), but usually this is not necessary because instances tend to be held by only one owner and don't need to guarantee threadsafe access to the instance.

More commonly, a class creates a private lock (eg. private readonly object m_Lock = new Object(); for separate locks within each instance to protect access to members of that instance, or private static readonly object s_Lock = new Object(); for a central lock to protect access to the class's static members). Josh has a more specific code example of using a lock. You then have to code the class to use the lock appropriately. In more complex cases you might even want to create separate locks for different groups of members, to reduce contention for different kinds of resources which aren't used together.

So, to bring it back to your original question, a method which only accesses its own local variables and parameters would be thread-safe, because these exist in their own memory locations on the stack specific to the current thread, and can not be accessed elsewhere--unless you shared those parameter instances across threads before passing them.

A non-static method which only accesses the instances own members (no static members)--and of course parameters and local variables--would not need to use locks in the context of that instance being used by a single owner (doesn't need to be thread-safe), but if instances were intended to be shared and wanted to guarantee thread-safe access, then the instance would need to protect access to its member variables with one or more locks specific to that instance (locking on the instance itself being one option)--as opposed to leaving it up to the caller to implement their own locks around it when sharing something not intended to be thread-safe shareable.

Access to readonly members (static or non-static) which aren't ever manipulated is generally safe, but if the instance it holds is not itself thread-safe or if you need to guarantee atomicity across multiple manipulations of it, then you may need to protect all access to it with your own locking scheme as well. That's a case where it could be handy if the instance uses locking on itself, because you could simply get a lock on the instance across multiple accesses into it for atomicity, but you wouldn't need to do so for single accesses into it if it's using a lock on itself to make those accesses individually thread-safe. (If it's not your class, you'd have to know whether it locks on itself or is using a private lock which you can't access externally.)

And finally, there's access to changing static members (changed by the given method or by any others) from within an instance--and of course static methods which access those static members and could be called from anyone, anywhere, anytime--which have the biggest need to use responsible locking, without which are definitely not thread-safe and are likely to cause unpredictable bugs.

When dealing with .NET framework classes, Microsoft documents in MSDN whether a given API call is thread-safe (eg. static methods of the provided generic collection types like List<T> are made thread-safe while instance methods may not be--but check specifically to be sure). The vast majority of the time (and unless it specifically says it's thread-safe), it's not internally thread-safe, so it's your responsibility to use it in a safe manner. And even when individual operations are implemented internally thread-safe, you still have to worry about shared and overlapping access by your code if it does anything more complex which needs to be atomic.

One big caveat is iterating over a collection (eg. with foreach). Even if each access to the collection gets a stable state there's no inherent guarantee that it won't change in between those accesses (if anywhere else can get to it). When the collection is held locally there's generally no problem, but a collection which could be changed (by another thread or during your loop's execution!) could produce inconsistent results. One easy way to solve this is to use an atomic thread-safe operation (inside your protective locking scheme) to make a temporary copy of the collection (MyType[] mySnapshot = myCollection.ToArray();) and then iterate over that local snapshot copy outside the lock. In many cases this avoids the need for holding a lock the whole time, but depending on what you're doing within the iteration this may not be enough and you just have to protect against changes the whole time (or you may already have it inside a locked section guarding against access to change the collection along with other things, so it's covered).

So, there's a bit of an art to thread-safe design, and knowing just where and how to get locks to protect things depends a lot on the overall design and usage of your class(es). It can be easy to get paranoid and think you have to get locks all over for everything, but really it's about finding the right layer at which to protect things.



回答4:

Your method is fine since it is only using local variables, let's change your method a bit:

static int foo;

static int addOne(int someNumber)
{
  foo=someNumber; 
  return foo++;
}

This is not a thread safe method because we are touching static data. This would then need to be modified to be:

static int foo;
static object addOneLocker=new object();
static int addOne(int someNumber)
{
  int myCalc;
  lock(addOneLocker)
  {
     foo=someNumber; 
     myCalc= foo++;
  }
  return myCalc;
}

Which I think this is a silly sample I just did cause if I'm reading it correctly there is no point in foo anymore but hey it's a sample.



回答5:

There is some research going on which allows you to detect non-thread-safe code. E.g. the project CHESS at Microsoft Research.



回答6:

This would only be a race condition if it were modifying some variable external to the function. Your example is not doing that.

That's basically what you're looking out for. Thread safe means that the function either:

  1. Does not modify external data, or
  2. Access to external data is properly synchronized so that only one function can access it at any one time.

External data could be something held in storage (database/file), or something internal to the application (a variable, an instance of a class, etc): basically anything that is declared anywhere in the world that is outside of the function's scope.

A trivial example of an un-thread safe version of your function would be this:

private int myVar = 0;

private void addOne(int someNumber)
{
   myVar += someNumber;
}

If you call this from two different threads without synchronization, querying the value of myVar will be different depending on whether the query happens after all calls to addOne are complete, or the query happens in between the two calls, or the query happens before either of the calls.



回答7:

In the above example no.

Thread safety is mainly to do with stored state. You can make the above example non thread safe by doing this:

static int myInt;

static int addOne(int someNumber){
myInt = someNumber;
return myInt +1; 
}

This will mean that due to context switching thread 1 might get to the call myInt = someNumber and then context switch, lets say thread 1 just set it to 5. Then imagine that thread 2 comes in and uses 6 and returns 7. Then when thread 1 wakes up again it will have 6 in myInt instead of the 5 that it was using and return 7 instead of the expected 6. :O



回答8:

Anywhere, thread safe means that you don't have two or more threads colliding when you are accessing a resource. Usually static varaiables --- in languages like C#, VB.NET and Java --- made your code thread unsafe.

In Java exists the synchronized keyword. But in .NET you get the assembly option/directive:


class Foo
{
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Bar(object obj)
    {
        // do something...
    }
}

Examples of non thread safe classes should be singletons, depending on how is this pattern coded. Usually it must implement a synchronized instance creator.

If you don't want a synchronized method, you can try locking method, such as spin-lock.



回答9:

any access to an obect that can be used simultaneously by two threads is not threadsafe.

your example in Part 2 is clearly safe, as it uses only values passed in as arguments, but if you used an object scoped variable you might have to surround the access with appropriate lock statements



回答10:

foo is not shared between concurrent or sequential invocations, so addOne is thread-safe.



回答11:

The reason that 'foo' and 'someNumber' are safe in your example is that they reside on the stack, and each thread has their own stack and so are not shared.

As soon as data has the potential to be shared, for example, being global or sharing pointers to objects, then you could have conflicts and may need to use locks of some sort.