The documentation of std::hypot says that:
Computes the square root of the sum of the squares of x and y, without undue overflow or underflow at intermediate stages of the computation.
I struggle to conceive a test case where std::hypot should be used over the trivial sqrt(x*x + y*y).
The following test shows that std::hypot is roughly 20x slower than the naive calculation.
#include <iostream>
#include <chrono>
#include <random>
#include <algorithm>
#include <cmath>   // std::hypot, std::sqrt
#include <vector>  // std::vector
int main(int, char**) {
    std::mt19937_64 mt;
    const auto samples = 10000000;
    // Pre-generate all inputs so random number generation is not timed.
    std::vector<double> values(2 * samples);
    std::uniform_real_distribution<double> urd(-100.0, 100.0);
    std::generate_n(values.begin(), 2 * samples, [&]() { return urd(mt); });
    std::cout.precision(15);
    {
        // Timed loop 1: std::hypot
        double sum = 0;
        auto s = std::chrono::steady_clock::now();
        for (auto i = 0; i < 2 * samples; i += 2) {
            sum += std::hypot(values[i], values[i + 1]);
        }
        auto e = std::chrono::steady_clock::now();
        std::cout << std::fixed << std::chrono::duration_cast<std::chrono::microseconds>(e - s).count() << "us --- s:" << sum << std::endl;
    }
    {
        // Timed loop 2: naive sqrt(x*x + y*y)
        double sum = 0;
        auto s = std::chrono::steady_clock::now();
        for (auto i = 0; i < 2 * samples; i += 2) {
            sum += std::sqrt(values[i] * values[i] + values[i + 1] * values[i + 1]);
        }
        auto e = std::chrono::steady_clock::now();
        std::cout << std::fixed << std::chrono::duration_cast<std::chrono::microseconds>(e - s).count() << "us --- s:" << sum << std::endl;
    }
}
So I'm asking for guidance: when must I use std::hypot(x,y) to obtain correct results, rather than the much faster std::sqrt(x*x + y*y)?
Clarification: I'm looking for answers that apply when x and y are floating point numbers. I.e. compare:
double h = std::hypot(static_cast<double>(x),static_cast<double>(y));
to:
double xx = static_cast<double>(x);
double yy = static_cast<double>(y);
double h = std::sqrt(xx*xx + yy*yy);
The answer is in the documentation you quoted:
If x*x + y*y overflows (which can happen even when the true result is representable), then carrying out the calculation manually gives you the wrong answer. If you use std::hypot, however, it guarantees that the intermediate calculations will not overflow. You can see an example of this disparity here.
If you are working with numbers which you know will not overflow the relevant representation for your platform, you can happily use the naive version.
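The disparity described above can be sketched as follows. This is a minimal illustration, not a benchmark: naive_hypot and compare are hypothetical helper names introduced here, and the behaviour noted in the comments assumes IEEE 754 doubles evaluated at double precision (for example x86-64 with SSE, no -ffast-math). On platforms that evaluate intermediates in extended precision, the naive formula may not overflow for these inputs.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical helper (not from the question): the "naive" formula.
double naive_hypot(double x, double y) {
    return std::sqrt(x * x + y * y);
}

// Prints the naive result and std::hypot side by side for one input pair.
void compare(double x, double y) {
    std::printf("x=%g y=%g  naive=%g  hypot=%g\n",
                x, y, naive_hypot(x, y), std::hypot(x, y));
}

// Suggested inputs to try (assuming IEEE 754 double arithmetic):
//   compare(3.0, 4.0);        // ordinary range: both formulas agree
//   compare(1e200, 1e200);    // x*x exceeds the largest double, so the naive sum overflows
//   compare(1e-200, 1e-200);  // x*x is below the smallest subnormal, so the naive sum underflows
```

Calling compare with the three suggested pairs from a main function shows the pattern: the results match in the ordinary range, while for the very large and very small inputs only std::hypot returns a finite, nonzero value. For the question's benchmark data, drawn from [-100, 100], the naive version is nowhere near either limit.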