Justin's answer on another question made an observation that I find very interesting but can't quite explain. Consider the following code:
std::vector<std::string> v;
v.push_back("Hello, world!"); // Doesn't call strlen.
v.emplace_back("Hello, world!"); // Calls strlen.
If you look at the assembly, emplace_back generates a call to strlen, whereas push_back does not (tested in gcc 8.1 and clang 6.0 using -Ofast).
Why is this happening? Why can't emplace_back optimize out the strlen call here? My initial thought was that push_back implicitly creates the std::string before the function call (so the std::string constructor is passed the string literal directly, which is handled optimally), whereas emplace_back creates the std::string after the function call (so the std::string constructor is forwarded the string literal, which I presumed had decayed from a const char [N] to a const char *, thus requiring a strlen call).
But emplace_back takes its arguments by forwarding reference (Args&&...), and my tests show that the string literal does not decay to a pointer here. Clearly I'm overlooking something.
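For reference, here is a minimal sketch of the kind of test described above. The helper names (binds_as_array, deduced_extent, decays_to_pointer) are hypothetical and are not part of the standard library or of the original godbolt example. It only checks what template deduction does with a string literal, assuming C++11 or later.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: true if a forwarding reference deduced an array type.
template <typename T>
constexpr bool binds_as_array(T&&) {
    return std::is_array<typename std::remove_reference<T>::type>::value;
}

// Hypothetical helper: the array extent seen through a forwarding reference
// (0 if the deduced type is not an array).
template <typename T>
constexpr std::size_t deduced_extent(T&&) {
    return std::extent<typename std::remove_reference<T>::type>::value;
}

// Hypothetical helper: taking the argument by value forces array-to-pointer decay.
template <typename T>
constexpr bool decays_to_pointer(T) {
    return std::is_pointer<T>::value;
}

static_assert(binds_as_array("Hello, world!"),
              "a forwarding reference keeps the array type");
static_assert(deduced_extent("Hello, world!") == 14,
              "13 characters plus the null terminator");
static_assert(decays_to_pointer("Hello, world!"),
              "by-value deduction decays to const char*");
```

So the length is still part of the type when the literal is passed through a forwarding reference; the decay can only happen later, when the argument is handed to something that takes a const char *.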
The strlen call is in the out-of-line function body for the slow path; that function body must be valid for all arguments of type const char (&)[42] (in your godbolt example), including arguments that did not originate from a string literal of 41 characters with no embedded nulls.
The fast path is inlined into foo, and it does compute the length at compile time.
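To illustrate why the out-of-line body cannot just use the array size, here is a small sketch. The function and variable names are hypothetical, and it uses a 14-element array (matching "Hello, world!") rather than the 42-element one from the godbolt example. Two arrays of the same type bind to the same reference parameter, yet their string lengths differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for an out-of-line function body: it sees only the
// parameter type, so it has to search for the terminator at run time.
template <std::size_t N>
std::size_t runtime_length(const char (&s)[N]) {
    return std::strlen(s);
}

// The length one would get from the type alone. This is only correct for an
// array filled by a literal with no embedded nulls.
template <std::size_t N>
constexpr std::size_t length_from_type(const char (&)[N]) {
    return N - 1;
}

// Both arrays have type const char[14], so both bind to const char (&)[14].
const char from_literal[14] = "Hello, world!";
const char with_embedded_null[14] = {'H', 'i', '\0', 'x'};

// Hypothetical check function showing where the two lengths agree and differ.
inline void check_lengths() {
    assert(runtime_length(from_literal) == length_from_type(from_literal));
    assert(runtime_length(with_embedded_null) == 2);
    assert(length_from_type(with_embedded_null) == 13);
}
```

Once the call is inlined and the compiler can see that the argument really is the literal, it can fold the strlen to a constant, which is what happens on the fast path.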