I'm working on a project which is in serious need of some performance tuning.
How do I write a test that fails if my optimizations do not improve the speed of the program?
To elaborate a bit:
The problem is not discovering which parts to optimize. I can use various profiling and benchmarking tools for that.
The problem is using automated tests to document that a specific optimization did indeed have the intended effect. It would also be highly desirable if I could use the test suite to discover possible performance regressions later on.
I suppose I could just run my profiling tools to get some values and then assert that my optimized code produces better values. The obvious problem with that, however, is that benchmarking values are not hard values. They vary with the local environment.
So, is the answer to always use the same machine to do this kind of integration testing? If so, you would still have to allow for some fuzziness in the results, since even on the same hardware benchmarking results can vary. How then to take this into account?
Or maybe the answer is to keep older versions of the program and compare results before and after? This would be my preferred method, since it's mostly environment agnostic. Does anyone have experience with this approach? I imagine it would only be necessary to keep one older version if all the tests can be made to pass if the performance of the latest version is at least as good as the former version.
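The compare-against-the-previous-version idea can be made concrete by timing both builds in the same run on the same machine, so environmental differences largely cancel out. A minimal sketch in Python, where `old_version` and `new_version` are placeholders for whatever entry points you benchmark, and the 10% tolerance is an arbitrary choice:

```python
import timeit

def old_version(n):
    # placeholder: the routine as it was in the previous version
    return sum(i * i for i in range(n))

def new_version(n):
    # placeholder: the supposedly optimized routine under test
    return sum(i * i for i in range(n))

def best_time(func, repeats=5, number=50):
    # best-of-N timing: the minimum of several runs is the least noisy statistic
    return min(timeit.repeat(lambda: func(10_000), repeat=repeats, number=number))

def assert_no_regression(tolerance=0.10):
    # fail if the new version is more than 10% slower than the old one
    old = best_time(old_version)
    new = best_time(new_version)
    assert new <= old * (1 + tolerance), f"regression: {new:.4f}s vs {old:.4f}s"
```

Keeping only one older version suffices with this scheme, for the reason given above: as long as each release is at least as fast as the previous one, the chain of comparisons covers the whole history.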
In many server applications (this might not be your case), performance problems manifest only under concurrent access and under load, so measuring the absolute time a routine takes and trying to improve it is not very helpful. This method has problems even in single-threaded applications: measuring absolute routine time relies on the clock the platform provides, and these clocks are not always very precise; you are better off relying on the average time a routine takes.
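The averaging suggested above can be sketched like this; `routine` is a made-up stand-in for the code being measured:

```python
import statistics
import time

def routine():
    # placeholder for the routine being measured
    return sum(range(1_000))

def average_time(func, runs=200):
    # average over many runs instead of trusting a single clock reading,
    # which smooths out timer granularity and scheduling noise
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)
```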
My advice: while you could use unit tests to establish some non-functional aspects of your application, I think the approach given above (measuring average times under realistic load) will give better results during the optimization process. If you place time-related assertions in your unit tests, you will have to choose fairly loose thresholds: timings vary with the environment the tests run in, and you don't want tests to fail just because some of your colleagues are using slower hardware.
Tuning is all about finding the right things to tune. You already have functioning code, so adding performance assertions after the fact, without first establishing the critical sections of the code, might lead you to waste a lot of time optimizing non-essential pieces of your application.
Kent Beck and his team automated all of their tests in TDD, and performance tests can be automated in TDD as well. The key in performance testing is to phrase the criteria as yes-or-no checks: if the performance specifications are well defined, those checks can be automated in TDD too.
I haven't faced this situation yet ;) but if I did, here's how I'd go about it. (I think I picked this up from Dave Astels' book.)
Step #1: Come up with a spec for 'acceptable performance'. For example, this could mean 'the user needs to be able to do Y in N secs (or millisecs)'.
Step #2: Now write a failing test. Use your friendly timer class (e.g. .NET has the Stopwatch class) and assert:
Assert.Less(actualTime, MySpec)
Step#3: If the test already passes, you're done. If red, you need to optimize and make it green. As soon as the test goes green, the performance is now 'acceptable'.
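Translated out of .NET, the same red/green performance test might look like this in Python; the 2-second budget and the `do_y` operation are made-up stand-ins for whatever your spec says:

```python
import time

SPEC_SECONDS = 2.0  # hypothetical spec: "the user can do Y in 2 seconds"

def do_y():
    # placeholder for the operation the spec covers
    return sum(i for i in range(100_000))

def test_do_y_is_fast_enough():
    start = time.perf_counter()  # Python's rough analogue of .NET's Stopwatch
    do_y()
    elapsed = time.perf_counter() - start
    # the moral equivalent of Assert.Less(actualTime, MySpec)
    assert elapsed < SPEC_SECONDS, f"took {elapsed:.3f}s, budget is {SPEC_SECONDS}s"
```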
I suspect that applying TDD to drive performance is a mistake. By all means, use it to get to good design and working code, and use the tests written in the course of TDD to ensure continued correctness - but once you have well-factored code and a solid suite of tests, you are in good shape to tune, and different (from TDD) techniques and tools apply.
TDD gives you good design, reliable code, and a test-coverage safety net. That puts you in a good place for tuning, but I think that because of the problems you and others have cited, it's simply not going to take you much further down the tuning road. I say that as a great fan, proponent, and practitioner of TDD.
First you need to establish some criteria for acceptable performance, then you need to devise a test that fails those criteria with the existing code, then you need to tweak your code until it passes. You will probably have more than one performance criterion, and you should certainly have more than one test.
Run the tests plus profiling in a CI server. You can also run load tests periodically.
You are concerned about differences (as you mentioned), so it's not about defining an absolute value. Add an extra step that compares the performance measures of this run with those of the last build, and report the differences as a percentage. You can raise a red flag for significant variations in time.
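The percentage-difference check against the last build could be sketched as below; the baseline file name and the 15% red-flag threshold are arbitrary assumptions, not part of any particular CI product:

```python
import json
from pathlib import Path

BASELINE = Path("perf_baseline.json")  # hypothetical per-machine baseline store
RED_FLAG_PCT = 15.0                    # arbitrary regression threshold

def compare_to_baseline(name, seconds):
    # returns the % change versus the last recorded run (positive = slower),
    # then records the current measurement as the new baseline
    baseline = json.loads(BASELINE.read_text()) if BASELINE.exists() else {}
    change = 0.0
    if name in baseline:
        change = (seconds - baseline[name]) / baseline[name] * 100.0
    baseline[name] = seconds
    BASELINE.write_text(json.dumps(baseline))
    return change

# a CI step might then do:
#   change = compare_to_baseline("render_report", measured_seconds)
#   if change > RED_FLAG_PCT:
#       raise SystemExit(f"perf regression: {change:.1f}% slower")
```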
If you are concerned about performance, you should have clear goals you want to meet, and you should assert them. Measure them with tests on the full system: even if your application logic is fast, the view might cause you to miss the goal. You can also combine this with the differences approach, but for these goal tests you would allow less tolerance for time variations.
Note that you can run the same process on your dev machine, using only the previous runs from that machine rather than a baseline shared between developers.