C++ so far (unfortunately) doesn't support a finally clause for a try statement. This leads to speculation about how to release resources. I searched the internet and found some solutions, but their performance was not clear to me (and I would use Java if performance didn't matter that much). So I had to benchmark.
The options are:
1. Functor-based finally class proposed at CodeProject. It's powerful, but slow. The disassembly suggests that the outer function's local variables are captured very inefficiently: they are pushed to the stack one by one, rather than just the frame pointer being passed to the inner (lambda) function.
2. RAII: a manual cleaner object on the stack. The disadvantage is the manual typing and tailoring it for each place it is used. Another disadvantage is the need to copy into it all the variables needed for resource release.
3. MSVC++-specific __try/__finally statement. The disadvantage is that it's obviously not portable.
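To make option 2 concrete, here is a minimal sketch of a manual cleaner object guarding a resource while an exception propagates. Resource and UseResource are hypothetical names invented for this illustration (a flag stands in for a real handle); they are not part of the benchmark.

```cpp
#include <stdexcept>

// Hypothetical resource for illustration: a flag stands in for a real handle.
struct Resource {
    bool open = false;
};

// Acquires the resource, optionally fails, and reports whether an exception
// was seen. The local Cleaner releases the resource on every exit path.
bool UseResource(Resource& r, bool fail) {
    try {
        r.open = true;  // "acquire"
        struct Cleaner {
            Resource* _p;  // everything the cleanup needs must be passed in by hand
            explicit Cleaner(Resource& res) : _p(&res) {}
            ~Cleaner() { _p->open = false; }  // "release"
        } cleaner(r);
        if (fail) {
            throw std::runtime_error("simulated failure");
        }
        return false;  // normal path: the destructor runs on return
    } catch (const std::runtime_error&) {
        return true;   // failure path: the destructor already ran during unwinding
    }
}
```

The sketch also shows the drawback described above: the Cleaner has to be written for this one place, and each variable used in the cleanup must be handed to its constructor.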
I created this small benchmark to compare the runtime performance of these approaches:
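#include <chrono>
#include <cstdint>     // int64_t
#include <cstdio>
#include <functional>
#include <utility>     // std::forward

// Option 1: type-erased functor invoked from the destructor.
class Finally1 {
    std::function<void(void)> _functor;
public:
    Finally1(const std::function<void(void)> &functor) : _functor(functor) {}
    ~Finally1() {
        _functor();
    }
};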
void BenchmarkFunctor() {
    volatile int64_t var = 0;
    const int64_t nIterations = 234567890;
    auto start = std::chrono::high_resolution_clock::now();
    for (int64_t i = 0; i < nIterations; i++) {
        Finally1 doFinally([&] {
            var++;
        });
    }
    auto elapsed = std::chrono::high_resolution_clock::now() - start;
    double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    printf("Functor: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
void BenchmarkObject() {
    volatile int64_t var = 0;
    const int64_t nIterations = 234567890;
    auto start = std::chrono::high_resolution_clock::now();
    for (int64_t i = 0; i < nIterations; i++) {
        class Cleaner {
            volatile int64_t* _pVar;
        public:
            Cleaner(volatile int64_t& var) : _pVar(&var) { }
            ~Cleaner() { (*_pVar)++; }
        } c(var);
    }
    auto elapsed = std::chrono::high_resolution_clock::now() - start;
    double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    printf("Object: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
void BenchmarkMSVCpp() {
    volatile int64_t var = 0;
    const int64_t nIterations = 234567890;
    auto start = std::chrono::high_resolution_clock::now();
    for (int64_t i = 0; i < nIterations; i++) {
        __try {
        }
        __finally {
            var++;
        }
    }
    auto elapsed = std::chrono::high_resolution_clock::now() - start;
    double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    printf("__finally: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
template <typename Func> class Finally4 {
    Func f;
public:
    Finally4(Func&& func) : f(std::forward<Func>(func)) {}
    ~Finally4() { f(); }
};
template <typename F> Finally4<F> MakeFinally4(F&& f) {
    return Finally4<F>(std::forward<F>(f));
}
void BenchmarkTemplate() {
    volatile int64_t var = 0;
    const int64_t nIterations = 234567890;
    auto start = std::chrono::high_resolution_clock::now();
    for (int64_t i = 0; i < nIterations; i++) {
        auto doFinally = MakeFinally4([&] { var++; });
        //Finally4 doFinally{ [&] { var++; } };
    }
    auto elapsed = std::chrono::high_resolution_clock::now() - start;
    double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    printf("Template: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
void BenchmarkEmpty() {
    volatile int64_t var = 0;
    const int64_t nIterations = 234567890;
    auto start = std::chrono::high_resolution_clock::now();
    for (int64_t i = 0; i < nIterations; i++) {
        var++;
    }
    auto elapsed = std::chrono::high_resolution_clock::now() - start;
    double nSec = 1e-6 * std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
    printf("Empty: %.3lf Ops/sec, var=%lld\n", nIterations / nSec, (long long)var);
}
int __cdecl main() {
    BenchmarkFunctor();
    BenchmarkObject();
    BenchmarkMSVCpp();
    BenchmarkTemplate();
    BenchmarkEmpty();
    return 0;
}
The results on my Ryzen 1800X @ 3.9 GHz with DDR4 @ 2.6 GHz CL13 were:
Functor: 175148825.946 Ops/sec, var=234567890
Object: 553446751.181 Ops/sec, var=234567890
__finally: 553832236.221 Ops/sec, var=234567890
Template: 554964345.876 Ops/sec, var=234567890
Empty: 554468478.903 Ops/sec, var=234567890
Apparently, all the options except the functor-based one (#1) are as fast as an empty loop.
So is there a fast and powerful C++ alternative to finally, which is portable and requires minimal copying from the stack of the outer function?
UPDATE: I've benchmarked @Jarod42's solution, so the code and output in the question have been updated. However, as @Sopel mentioned, it may break if copy elision is not performed.
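One way to address the copy-elision concern is to give the guard a move constructor that disarms the moved-from object, so the cleanup runs exactly once even when a real move happens. The following is only a sketch under that assumption, not @Jarod42's code; ScopeGuard, MakeScopeGuard and CountCleanupRuns are hypothetical names introduced for illustration.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical guard: runs the stored callable once, from whichever
// object is still "active" when it is destroyed.
template <typename Func>
class ScopeGuard {
    Func _f;
    bool _active;
public:
    explicit ScopeGuard(Func f) : _f(std::move(f)), _active(true) {}
    // Moving transfers responsibility and disarms the source object.
    ScopeGuard(ScopeGuard&& other)
        : _f(std::move(other._f)), _active(other._active) {
        other._active = false;
    }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
    ScopeGuard& operator=(ScopeGuard&&) = delete;
    ~ScopeGuard() {
        if (_active) _f();
    }
};

// Factory so the lambda type can be deduced before C++17.
template <typename F>
ScopeGuard<typename std::decay<F>::type> MakeScopeGuard(F&& f) {
    return ScopeGuard<typename std::decay<F>::type>(std::forward<F>(f));
}

// Forces a move explicitly and counts how often the cleanup ran.
int CountCleanupRuns() {
    int runs = 0;
    {
        auto a = MakeScopeGuard([&] { ++runs; });
        auto b = std::move(a);  // 'a' is disarmed; only 'b' will run the cleanup
    }
    return runs;
}
```

The extra flag costs one branch in the destructor; whether that shows up in a benchmark like the one above would have to be measured.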
UPDATE2: To clarify: what I'm asking for is a convenient, fast way in C++ to execute a block of code even if an exception is thrown. For the reasons mentioned in the question, some ways are slow or inconvenient.
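To illustrate the behaviour being asked for, here is a small sketch in which cleanup blocks run while an exception unwinds the stack. It assumes C++17 (class template argument deduction, as in the commented-out line of the benchmark); FinalAction and RunWithFailure are hypothetical names used only for this example.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical minimal guard; non-copyable, so it relies on C++17
// class template argument deduction instead of a factory function.
template <typename Func>
class FinalAction {
    Func _f;
public:
    explicit FinalAction(Func f) : _f(std::move(f)) {}
    FinalAction(const FinalAction&) = delete;
    FinalAction& operator=(const FinalAction&) = delete;
    ~FinalAction() { _f(); }
};

// Records the order in which the body, the cleanup blocks and the
// handler execute when the body throws.
std::string RunWithFailure() {
    std::string log;
    try {
        FinalAction first{[&] { log += "first;"; }};
        FinalAction second{[&] { log += "second;"; }};
        log += "body;";
        throw std::runtime_error("simulated failure");
    } catch (const std::runtime_error&) {
        log += "caught;";
    }
    return log;
}
```

RunWithFailure() returns "body;second;first;caught;": the guards run in reverse order of construction during unwinding, before the handler is entered.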