Do any compilers transfer effective type through m

Posted 2019-04-08 08:11

Question:

According to N1570 6.5/6:

If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

That would suggest that even on a system where "long" and some other integer type have the same representation, the following would invoke Undefined Behavior:

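#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Compare the limits rather than ~0UL == ~0U: inside #if, every unsigned
   operand is evaluated as uintmax_t, so that comparison is always true. */
#if ULONG_MAX == UINT_MAX
  #define long_equiv int
#elif ULONG_MAX == ULLONG_MAX
  #define long_equiv long long
#else
#error Oops
#endif
long blah(void)
{
  long l;
  long_equiv l2;
  long_equiv *p = malloc(sizeof (long));
  l = 1234;
  memcpy(p, &l, sizeof (long));
  l2 = *p;
  free(p); // Added to address complaint about leak
  return l2;
}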

Since the object l clearly has effective type long and the object pointed to by p has no declared type, the memcpy should set the effective type of the storage to long. Since using an lvalue of type long_equiv to read an object with effective type long is not allowed, the code would invoke Undefined Behavior.

Given that prior to C99 memcpy was one of the standard ways to copy data of one type to storage of another type, the new rules about memcpy cause a lot of existing code to invoke Undefined Behavior. If the rule had instead been that using memcpy to write to allocated storage leaves the destination without any effective type, the behavior would be defined.

Are there any compilers which do not behave as though memcpy leaves the effective type of the destination unset when used to copy information to allocated storage, or should use of memcpy for purposes of data translation be considered "safe"? If some compilers do apply effective type of the source to the destination, what would be the proper way of copying data in type-agnostic fashion? What is meant by "copied as an array of character type"?

Answer 1:

The C standard says that the effective type is transferred. Therefore, by definition, all conforming compilers transfer the effective type.

Your code sample causes undefined behaviour by violating the strict aliasing rule, because a value of effective type long is read by an lvalue of type long long.

This was also true in C89; I'm not sure what you are referring to with "new rules in C99" (other than the fact that long long was not in C89).

It is true that when C was standardized, some existing code had undefined behaviour. And it is also true that people continue to write code with undefined behaviour.

What is meant by "copied as an array of character type"?

This means copying character-by-character using a character type.
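
As a minimal sketch of that reading, the loop below copies a long into allocated storage one byte at a time through unsigned char lvalues. The names copy_bytes and roundtrip_via_bytes are hypothetical, chosen only for illustration. Under the 6.5/6 wording quoted in the question, the allocated storage would then take on effective type long, so reading it back through a long lvalue is fine.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: copy n bytes one at a time via a character type. */
static void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Hypothetical demo: returns 0 on success and stores the value read back
   in *out, or returns -1 if the allocation fails. */
int roundtrip_via_bytes(long value, long *out)
{
    long *p = malloc(sizeof *p);   /* storage with no declared type */
    if (p == NULL)
        return -1;
    copy_bytes(p, &value, sizeof value);
    *out = *p;                     /* read through an lvalue of type long */
    free(p);
    return 0;
}
```

Reading the same storage through a pointer to a different type, as the question's code does, is the case this answer considers undefined.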

what would be the proper way of copying data in type-agnostic fashion?

It's not possible to "erase effective type", so far as I know. To correctly read a value using a long long *, you must be pointing to a location of effective type long long.

In your code, for example:

// If we have confirmed that long and long long have the same size and representation
long long x;
memcpy(&x, p, sizeof x);
return x;
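
For reference, here is a sketch of how the question's function might look with that change applied. The name blah_fixed is hypothetical, and the code assumes that equal maximum values imply equal size and representation (no padding bits), which holds on common platforms but is not guaranteed by the Standard.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#if ULONG_MAX == UINT_MAX
typedef int long_equiv;
#elif ULONG_MAX == ULLONG_MAX
typedef long long long_equiv;
#else
#error "no standard integer type with the same width as long"
#endif

/* Hypothetical variant of the question's blah(). */
long blah_fixed(void)
{
    long l = 1234;
    long_equiv l2 = 0;
    void *p = malloc(sizeof l);
    if (p == NULL)
        return -1;
    memcpy(p, &l, sizeof l);    /* allocated storage takes effective type long */
    memcpy(&l2, p, sizeof l2);  /* l2 has a declared type, so its type is unchanged */
    free(p);
    return l2;
}
```

The allocated storage is never read through a long_equiv lvalue; only the declared object l2 is.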

Union aliasing is another option.
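
A minimal sketch of the union approach follows, again assuming long and the chosen long_equiv type share size and representation. The names long_pun and pun_via_union are hypothetical. C99 (as amended by TC3) and C11 state in a footnote that reading a union member other than the one last written reinterprets the stored bytes in the new type.

```c
#include <limits.h>

#if ULONG_MAX == UINT_MAX
typedef int long_equiv;
#elif ULONG_MAX == ULLONG_MAX
typedef long long long_equiv;
#else
#error "no standard integer type with the same width as long"
#endif

/* Hypothetical union giving both types a view of the same bytes. */
union long_pun {
    long       as_long;
    long_equiv as_equiv;
};

/* Write as long, read back as long_equiv through the union object. */
long_equiv pun_via_union(long value)
{
    union long_pun u;
    u.as_long = value;
    return u.as_equiv;
}
```

Both accesses go through the union object itself; handing out pointers to the individual members and accessing them elsewhere may not get the same treatment from an optimizer.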

If you don't like all this then compile with -fno-strict-aliasing.



Answer 2:

Experimentally, gcc 6.2 behaves in ways which would only be justifiable by regarding memmove as transferring the effective type of the source to the destination. If gcc can determine that the source and destination pointers match, it will treat the memory operand as only being readable via its earlier Effective Type, rather than as memory which was last written using a character type and may thus be accessed using any type. Such behavior would be unjustifiable without the rule that allows memcpy to transfer effective-type information.

On the other hand, gcc's behavior is sometimes not justifiable under any rule, so it's not necessarily clear whether that behavior is a consequence of its authors' interpretation of the Standard or whether it's simply broken. For example, if gcc can determine that the destination of a memcpy already contains the same constant bit pattern as the source, it will treat the memcpy as a no-op. It does so even if the source held the type that would next be used to read the destination storage, and the destination held a different type that the compiler had decided couldn't alias that next read.