How should I unit test threaded code?

2018-12-31 07:04发布

I have thus far avoided the nightmare that is testing multi-threaded code since it just seems like too much of a minefield. I'd like to ask how people have gone about testing code that relies on threads for successful execution, or just how people have gone about testing those kinds of issues that only show up when two threads interact in a given manner?

This seems like a really key problem for programmers today, it would be useful to pool our knowledge on this one imho.

24条回答
浅入江南
2楼-- · 2018-12-31 07:48

Concurrency is a complex interplay between the memory model, hardware, caches and our code. In the case of Java at least such tests have been partly addressed mainly by jcstress. The creators of that library are known to be authors of many JVM, GC and Java concurrency features.

But even this library needs good knowledge of the Java Memory Model specification so that we know exactly what we are testing. But I think the focus of this effort is mircobenchmarks. Not huge business applications.

查看更多
不流泪的眼
3楼-- · 2018-12-31 07:49

I also had serious problems testing multi- threaded code. Then I found a really cool solution in "xUnit Test Patterns" by Gerard Meszaros. The pattern he describes is called Humble object.

Basically it describes how you can extract the logic into a separate, easy-to-test component that is decoupled from its environment. After you tested this logic, you can test the complicated behaviour (multi- threading, asynchronous execution, etc...)

查看更多
残风、尘缘若梦
4楼-- · 2018-12-31 07:49

Testing MT code for correctness is, as already stated, quite a hard problem. In the end it boils down to ensuring that there are no incorrectly synchronised data races in your code. The problem with this is that there are infinitely many possibilities of thread execution (interleavings) over which you do not have much control (be sure to read this article, though). In simple scenarios it might be possible to actually prove correctness by reasoning but this is usually not the case. Especially if you want to avoid/minimize synchronization and not go for the most obvious/easiest synchronization option.

An approach that I follow is to write highly concurrent test code in order to make potentially undetected data races likely to occur. And then I run those tests for some time :) I once stumbled upon a talk where some computer scientist where showing off a tool that kind of does this (randomly devising test from specs and then running them wildly, concurrently, checking for the defined invariants to be broken).

By the way, I think this aspect of testing MT code has not been mentioned here: identify invariants of the code that you can check for randomly. Unfortunately, finding those invariants is quite a hard problem, too. Also they might not hold all the time during execution, so you have to find/enforce executions points where you can expect them to be true. Bringing the code execution to such a state is also a hard problem (and might itself incur concurrency issues. Whew, it's damn hard!

Some interesting links to read:

查看更多
路过你的时光
5楼-- · 2018-12-31 07:50

There are a few tools around that are quite good. Here is a summary of some of the Java ones.

Some good static analysis tools include FindBugs (gives some useful hints), JLint, Java Pathfinder (JPF & JPF2), and Bogor.

MultithreadedTC is quite a good dynamic analysis tool (integrated into JUnit) where you have to set up your own test cases.

ConTest from IBM Research is interesting. It instruments your code by inserting all kinds of thread modifying behaviours (e.g. sleep & yield) to try to uncover bugs randomly.

SPIN is a really cool tool for modelling your Java (and other) components, but you need to have some useful framework. It is hard to use as is, but extremely powerful if you know how to use it. Quite a few tools use SPIN underneath the hood.

MultithreadedTC is probably the most mainstream, but some of the static analysis tools listed above are definitely worth looking at.

查看更多
余欢
6楼-- · 2018-12-31 07:50

I've done a lot of this, and yes it sucks.

Some tips:

  • GroboUtils for running multiple test threads
  • alphaWorks ConTest to instrument classes to cause interleavings to vary between iterations
  • Create a throwable field and check it in tearDown (see Listing 1). If you catch a bad exception in another thread, just assign it to throwable.
  • I created the utils class in Listing 2 and have found it invaluable, especially waitForVerify and waitForCondition, which will greatly increase the performance of your tests.
  • Make good use of AtomicBoolean in your tests. It is thread safe, and you'll often need a final reference type to store values from callback classes and suchlike. See example in Listing 3.
  • Make sure to always give your test a timeout (e.g., @Test(timeout=60*1000)), as concurrency tests can sometimes hang forever when they're broken

Listing 1:

@After
public void tearDown() {
    if ( throwable != null )
        throw throwable;
}

Listing 2:

import static org.junit.Assert.fail;
import java.io.File;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Random;
import org.apache.commons.collections.Closure;
import org.apache.commons.collections.Predicate;
import org.apache.commons.lang.time.StopWatch;
import org.easymock.EasyMock;
import org.easymock.classextension.internal.ClassExtensionHelper;
import static org.easymock.classextension.EasyMock.*;

import ca.digitalrapids.io.DRFileUtils;

/**
 * Various utilities for testing
 */
public abstract class DRTestUtils
{
    static private Random random = new Random();

/** Calls {@link #waitForCondition(Integer, Integer, Predicate, String)} with
 * default max wait and check period values.
 */
static public void waitForCondition(Predicate predicate, String errorMessage) 
    throws Throwable
{
    waitForCondition(null, null, predicate, errorMessage);
}

/** Blocks until a condition is true, throwing an {@link AssertionError} if
 * it does not become true during a given max time.
 * @param maxWait_ms max time to wait for true condition. Optional; defaults
 * to 30 * 1000 ms (30 seconds).
 * @param checkPeriod_ms period at which to try the condition. Optional; defaults
 * to 100 ms.
 * @param predicate the condition
 * @param errorMessage message use in the {@link AssertionError}
 * @throws Throwable on {@link AssertionError} or any other exception/error
 */
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms, 
    Predicate predicate, String errorMessage) throws Throwable 
{
    waitForCondition(maxWait_ms, checkPeriod_ms, predicate, new Closure() {
        public void execute(Object errorMessage)
        {
            fail((String)errorMessage);
        }
    }, errorMessage);
}

/** Blocks until a condition is true, running a closure if
 * it does not become true during a given max time.
 * @param maxWait_ms max time to wait for true condition. Optional; defaults
 * to 30 * 1000 ms (30 seconds).
 * @param checkPeriod_ms period at which to try the condition. Optional; defaults
 * to 100 ms.
 * @param predicate the condition
 * @param closure closure to run
 * @param argument argument for closure
 * @throws Throwable on {@link AssertionError} or any other exception/error
 */
static public void waitForCondition(Integer maxWait_ms, Integer checkPeriod_ms, 
    Predicate predicate, Closure closure, Object argument) throws Throwable 
{
    if ( maxWait_ms == null )
        maxWait_ms = 30 * 1000;
    if ( checkPeriod_ms == null )
        checkPeriod_ms = 100;
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    while ( !predicate.evaluate(null) ) {
        Thread.sleep(checkPeriod_ms);
        if ( stopWatch.getTime() > maxWait_ms ) {
            closure.execute(argument);
        }
    }
}

/** Calls {@link #waitForVerify(Integer, Object)} with <code>null</code>
 * for {@code maxWait_ms}
 */
static public void waitForVerify(Object easyMockProxy)
    throws Throwable
{
    waitForVerify(null, easyMockProxy);
}

/** Repeatedly calls {@link EasyMock#verify(Object[])} until it succeeds, or a
 * max wait time has elapsed.
 * @param maxWait_ms Max wait time. <code>null</code> defaults to 30s.
 * @param easyMockProxy Proxy to call verify on
 * @throws Throwable
 */
static public void waitForVerify(Integer maxWait_ms, Object easyMockProxy)
    throws Throwable
{
    if ( maxWait_ms == null )
        maxWait_ms = 30 * 1000;
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    for(;;) {
        try
        {
            verify(easyMockProxy);
            break;
        }
        catch (AssertionError e)
        {
            if ( stopWatch.getTime() > maxWait_ms )
                throw e;
            Thread.sleep(100);
        }
    }
}

/** Returns a path to a directory in the temp dir with the name of the given
 * class. This is useful for temporary test files.
 * @param aClass test class for which to create dir
 * @return the path
 */
static public String getTestDirPathForTestClass(Object object) 
{

    String filename = object instanceof Class ? 
        ((Class)object).getName() :
        object.getClass().getName();
    return DRFileUtils.getTempDir() + File.separator + 
        filename;
}

static public byte[] createRandomByteArray(int bytesLength)
{
    byte[] sourceBytes = new byte[bytesLength];
    random.nextBytes(sourceBytes);
    return sourceBytes;
}

/** Returns <code>true</code> if the given object is an EasyMock mock object 
 */
static public boolean isEasyMockMock(Object object) {
    try {
        InvocationHandler invocationHandler = Proxy
                .getInvocationHandler(object);
        return invocationHandler.getClass().getName().contains("easymock");
    } catch (IllegalArgumentException e) {
        return false;
    }
}
}

Listing 3:

@Test
public void testSomething() {
    final AtomicBoolean called = new AtomicBoolean(false);
    subject.setCallback(new SomeCallback() {
        public void callback(Object arg) {
            // check arg here
            called.set(true);
        }
    });
    subject.run();
    assertTrue(called.get());
}
查看更多
零度萤火
7楼-- · 2018-12-31 07:51

I have faced this issue several times in recent years when writing thread handling code for several projects. I'm providing a late answer because most of the other answers, while providing alternatives, do not actually answer the question about testing. My answer is addressed to the cases where there is no alternative to multithreaded code; I do cover code design issues for completeness, but also discuss unit testing.

Writing testable multithreaded code

The first thing to do is to separate your production thread handling code from all the code that does actual data processing. That way, the data processing can be tested as singly threaded code, and the only thing the multithreaded code does is to coordinate threads.

The second thing to remember is that bugs in multithreaded code are probabilistic; the bugs that manifest themselves least frequently are the bugs that will sneak through into production, will be difficult to reproduce even in production, and will thus cause the biggest problems. For this reason, the standard coding approach of writing the code quickly and then debugging it until it works is a bad idea for multithreaded code; it will result in code where the easy bugs are fixed and the dangerous bugs are still there.

Instead, when writing multithreaded code, you must write the code with the attitude that you are going to avoid writing the bugs in the first place. If you have properly removed the data processing code, the thread handling code should be small enough - preferably a few lines, at worst a few dozen lines - that you have a chance of writing it without writing a bug, and certainly without writing many bugs, if you understand threading, take your time, and are careful.

Writing unit tests for multithreaded code

Once the multithreaded code is written as carefully as possible, it is still worthwhile writing tests for that code. The primary purpose of the tests is not so much to test for highly timing dependent race condition bugs - it's impossible to test for such race conditions repeatably - but rather to test that your locking strategy for preventing such bugs allows for multiple threads to interact as intended.

To properly test correct locking behavior, a test must start multiple threads. To make the test repeatable, we want the interactions between the threads to happen in a predictable order. We don't want to externally synchronize the threads in the test, because that will mask bugs that could happen in production where the threads are not externally synchronized. That leaves the use of timing delays for thread synchronization, which is the technique that I have used successfully whenever I've had to write tests of multithreaded code.

If the delays are too short, then the test becomes fragile, because minor timing differences - say between different machines on which the tests may be run - may cause the timing to be off and the test to fail. What I've typically done is start with delays that cause test failures, increase the delays so that the test passes reliably on my development machine, and then double the delays beyond that so the test has a good chance of passing on other machines. This does mean that the test will take a macroscopic amount of time, though in my experience, careful test design can limit that time to no more than a dozen seconds. Since you shouldn't have very many places requiring thread coordination code in your application, that should be acceptable for your test suite.

Finally, keep track of the number of bugs caught by your test. If your test has 80% code coverage, it can be expected to catch about 80% of your bugs. If your test is well designed but finds no bugs, there's a reasonable chance that you don't have additional bugs that will only show up in production. If the test catches one or two bugs, you might still get lucky. Beyond that, and you may want to consider a careful review of or even a complete rewrite of your thread handling code, since it is likely that code still contains hidden bugs that will be very difficult to find until the code is in production, and very difficult to fix then.

查看更多
登录 后发表回答