inline function in different translation units wit

Posted 2019-04-07 16:00

Question:

In Visual Studio you can set different compiler options for individual cpp files. For example, under "Code Generation" we can enable basic runtime checks in debug mode, or we can change the floating-point model (precise/strict/fast). These are just examples; there are plenty of other flags.

An inline function can be defined multiple times in the program, as long as the definitions are identical. We put this function into a header and include it in several translation units. Now, what happens if different compiler options in different cpp files lead to slightly different compiled code for the function? Do the definitions then differ, so that we have undefined behaviour? You could make the function static (or put it into an unnamed namespace), but going further, every member function defined directly in a class is implicitly inline. This would mean that we may only include classes in different cpp files if these cpp files share identical compiler flags. I cannot imagine this to be true, because it would simply be too easy to get wrong.

Do we really end up in the land of undefined behaviour that quickly? Or will compilers handle these cases?

Answer 1:

As far as the Standard is concerned, each combination of command-line flags turns a compiler into a different implementation. While it is useful for implementations to be able to use object files produced by other implementations, the Standard imposes no requirement that they do so.

Even in the absence of in-lining, consider having the following function in one compilation unit:

char foo(void) { return 255; }

and the following in another:

char foo(void);
int arr[128];
void bar(void)
{
  int x=foo();
  if (x >= 0 && x < 128)
     arr[x]=1;
}

If char were a signed type in both compilation units, the value of x in the second unit would be less than zero (thus skipping the array assignment). If it were an unsigned type in both units, it would be greater than 127 (likewise skipping the assignment). If one compilation unit used a signed char and the other used an unsigned char, however, and if the implementation expected return values to be sign-extended or zero-extended in the result register, a compiler might determine that x can't be greater than 127 even though it holds 255, or that it can't be less than 0 even though it holds -1. Consequently, the generated code might access arr[255] or arr[-1], with potentially disastrous results.
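The consistent case can be sketched in a single translation unit. This is only an illustration under stated assumptions: it assumes an 8-bit char, and the helper names observed_value, expected_value and would_write are made up for this sketch. It shows that what the caller sees depends entirely on the signedness of plain char, which is exactly the property the two units have to agree on.

```cpp
#include <limits>

// Stand-in for the callee from the first compilation unit.
// The cast makes the narrowing explicit; with a signed 8-bit char the result
// is implementation-defined before C++20 (mainstream compilers yield -1).
inline char foo() { return static_cast<char>(255); }

// What a caller observes after the usual char-to-int conversion.
inline int observed_value() { return foo(); }

// What this translation unit expects, given its own view of plain char.
inline int expected_value() {
  return std::numeric_limits<char>::is_signed ? -1 : 255;
}

// Mirrors the range check in bar(): true if arr[x] would be written.
inline bool would_write(int x) { return x >= 0 && x < 128; }
```

As long as caller and callee share one view of char, the observed value matches the expectation and the array write is skipped. The dangerous case needs two object files built with different settings, which a single-file sketch cannot reproduce.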

While there are many cases where it should be safe to combine code using different compiler flags, the Standard makes no effort to distinguish those where such mixing is safe from those where it is unsafe.
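Some toolchains offer ways to detect such mismatches, although nothing in the Standard requires it. The following is a minimal sketch, assuming MSVC's #pragma detect_mismatch and the predefined macros _CHAR_UNSIGNED (MSVC) and __CHAR_UNSIGNED__ (GCC, Clang). The key "plain_char", the macro PLAIN_CHAR_IS_UNSIGNED and the function plain_char_is_unsigned are hypothetical names chosen for this example.

```cpp
#include <limits>

// Hypothetical header-level guard: every translation unit that includes this
// header records how plain char was configured when it was compiled.
#if defined(_CHAR_UNSIGNED) || defined(__CHAR_UNSIGNED__)
  #define PLAIN_CHAR_IS_UNSIGNED 1
#else
  #define PLAIN_CHAR_IS_UNSIGNED 0
#endif

#if defined(_MSC_VER)
  // MSVC only: the linker reports an error when two object files carry
  // different values for the same key ("plain_char" is a made-up key).
  #if PLAIN_CHAR_IS_UNSIGNED
    #pragma detect_mismatch("plain_char", "unsigned")
  #else
    #pragma detect_mismatch("plain_char", "signed")
  #endif
#endif

// Reports the setting recorded for this translation unit.
inline bool plain_char_is_unsigned() { return PLAIN_CHAR_IS_UNSIGNED != 0; }

// The same property as reported by the type system, for cross-checking.
inline bool plain_char_is_unsigned_by_type() {
  return !std::numeric_limits<char>::is_signed;
}
```

This only covers one setting. Each flag that can change code semantics would need its own key, so it is a safety net for known hazards rather than a general solution.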



Answer 2:

"an inline function can be defined multiple times in the program, as long as the definitions are identical"

No. ("Identical" isn't even a well defined concept here.)

Formally the definitions must be equivalent in some very strong sense, which doesn't even make sense as a requirement and which nobody cares about:

// in some header (included in multiple TUs):
#include <algorithm> // std::min

const int limit_max = 200; // implicitly static

inline bool check_limit(int i) {
  return i<=limit_max; // OK
}

inline int impose_limit(int i) {
  return std::min(i, limit_max); // ODR violation
}

Such code is entirely reasonable yet formally violates the one definition rule:

in each definition of D, corresponding names, looked up according to 6.4 [basic.lookup], shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (16.3 [over.match]) and after matching of partial template specialization (17.9.3 [temp.over]), except that a name can refer to a const object with internal or no linkage if the object has the same literal type in all definitions of D, and the object is initialized with a constant expression (8.20 [expr.const]), and the value (but not the address) of the object is used, and the object has the same value in all definitions of D;

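The violation arises because the exception does not cover using a const object with internal linkage (the const int is implicitly static) to directly bind a const reference, even if the reference is then used only for its value. std::min takes its arguments by const reference, so in each TU impose_limit refers to a different limit_max object. The correct version is: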

inline int impose_limit(int i) {
  return std::min(i, +limit_max); // OK
}

Here the value of limit_max is used in unary operator + and then a const reference is bound to a temporary initialized with that value. Who really does that?
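A less cryptic way to stay inside the exception is to route the constant through a function that takes its arguments by value. The sketch below assumes a hypothetical helper min_by_value (not part of any library) and shows it next to the unary + variant, renamed impose_limit_plus here so both can coexist in one file.

```cpp
#include <algorithm>

const int limit_max = 200; // internal linkage: one object per translation unit

// Hypothetical helper: parameters are taken by value, so a caller only reads
// the value of limit_max and never binds a reference to the object itself.
inline int min_by_value(int a, int b) { return a < b ? a : b; }

inline int impose_limit(int i) {
  return min_by_value(i, limit_max); // only the value of limit_max is used
}

inline int impose_limit_plus(int i) {
  // +limit_max is a prvalue, so std::min binds its reference to a temporary.
  return std::min(i, +limit_max);
}
```

In C++17 and later, declaring the constant as inline constexpr int limit_max = 200; is another option: an inline variable has external linkage, so every TU names the same entity and the original std::min call is fine.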

But even the committee doesn't believe the formal ODR matters, as we can see in Core Issue 1511:

1511. const volatile variables and the one-definition rule

Section: 6.2 [basic.def.odr] Status: CD3 Submitter: Richard Smith Date: 2012-06-18

[Moved to DR at the April, 2013 meeting.]

This wording is possibly not sufficiently clear for an example like:

  const volatile int n = 0;
  inline int get() { return n; }

Here the committee accepts what looks like a blatant violation of the intent and purpose of the ODR as written: the code reads a different volatile object in each TU, so it has a visible side effect on a different object in each TU, and therefore a different visible side effect. The committee considers this acceptable because nobody cares which of these objects is read.

What matters is that the effect of the inline function is roughly equivalent in every TU: it performs a read of a volatile int. This is a very weak form of equivalence, but it is sufficient for the natural purpose of the ODR, which is instance indifference: it does not matter, and cannot make a difference, which specific instance of the inline function is used.

In particular, the value obtained by a volatile read is by definition not known to the compiler, so the postconditions and invariants of this function, as analysed by the compiler, are the same in every TU.

When using different function definitions in different TUs, you need to make sure that they are strictly equivalent from the point of view of the caller: it must never be possible to surprise a caller by substituting one for the other. This means that the observable behavior must be strictly the same even if the code is different.
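As a rough illustration of "different code, same observable behavior", the sketch below compares two variants of a small function. The names clamp_to_byte_branchy, clamp_to_byte_minmax and variants_agree are hypothetical; the two variants stand in for what two differently compiled TUs might contain, and they get distinct names only so that the sketch itself is well-formed in a single file.

```cpp
#include <algorithm>

// Hypothetical variant 1: explicit branches.
inline int clamp_to_byte_branchy(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}

// Hypothetical variant 2: the same contract expressed with std::min/std::max.
inline int clamp_to_byte_minmax(int v) {
  return std::max(0, std::min(v, 255));
}

// Checks that a caller cannot tell the two variants apart on [lo, hi].
inline bool variants_agree(int lo, int hi) {
  for (int v = lo; v <= hi; ++v) {
    if (clamp_to_byte_branchy(v) != clamp_to_byte_minmax(v)) return false;
  }
  return true;
}
```

A check like this only samples inputs. The requirement in the text is stronger: the set of results the compiler may assume must also match, because a caller can be optimized on the basis of either variant.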

If you use different compiler options, they must not change the range of possible results of a function (possible as viewed by the compiler).

Because the "standard" (which isn't really a specification of a programming language) allows floating-point values to be held in a representation with more range and precision than their officially declared type permits, in a completely unconstrained way, using any non-volatile-qualified floating-point type in anything that is multiply defined and subject to the ODR seems problematic, unless you activate the "double means double" mode (which is the only sane mode).
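Whether a translation unit is compiled in such a mode can be probed with the FLT_EVAL_METHOD macro from <cfloat> (available since C++11): a value of 0 means operations are evaluated in the range and precision of their own type. The sketch below is only a probe under that assumption; the function names double_means_double and product_survives_store are made up for this example.

```cpp
#include <cfloat>

// True if this translation unit evaluates floating-point operations in the
// range and precision of their declared type (FLT_EVAL_METHOD == 0).
inline bool double_means_double() { return FLT_EVAL_METHOD == 0; }

// Stores a product into a volatile double, which forces a genuine double in
// memory, then compares it with the freshly computed product. If the product
// were kept in a wider register, the comparison could fail.
inline bool product_survives_store(double a, double b) {
  volatile double stored = a * b;
  return stored == a * b;
}
```

A header with inline functions that depend on exact double arithmetic could check FLT_EVAL_METHOD and refuse to compile otherwise, so that every TU including it agrees on the evaluation mode.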