While learning C, I implemented my own memcpy function. I used a wider type (uint32_t) in the function. (For simplicity, the function is restricted to sizes that are multiples of 4, and the data is assumed to be properly aligned.)
void memcpy4( void* dst , void* src , int size )
{
    size /= 4;
    for ( int i = 0 ; i < size ; i++ )
        ((uint32_t*)dst)[i] = ((uint32_t*)src)[i];
}
I did some reading on type punning and strict aliasing, and I believe the function above breaks the strict aliasing rule. The correct implementation would be this, since accessing an object through a char pointer is allowed:
void memcpy4( void* dst , void* src , int size )
{
    for ( int i = 0 ; i < size ; i++ )
        ((char *)dst)[i] = ((char *)src)[i];
}
I tried to do some casting through a union, but that turned out to be invalid as well.
How could such a function be implemented with a wider type without breaking the strict aliasing rule?
The way to implement memcpy using more than single-byte copies is to use non-standard C. Standard C does not support implementing memcpy using other than character types.
Quality C implementations provide an optimized memcpy that performs efficient copying using more than single-byte copies, but they use implementation-specific code to do so. They may do this by compiling the memcpy implementation with a switch such as -fno-strict-aliasing to tell the compiler the aliasing rules will be violated in the code, by relying on known features of the specific C implementation to ensure the code will work (if you write the compiler, you can design it so that your implementation of memcpy works), or by writing memcpy in assembly language.
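As one illustration of "relying on known features of the specific C implementation", here is a minimal sketch that assumes GCC or Clang, both of which document a may_alias type attribute. It is not standard C and is not how any particular library implements memcpy. The names u32_alias and memcpy4_may_alias are made up for this example, and the function keeps the question's assumptions (size is a multiple of 4, both pointers are 4-byte aligned, the regions do not overlap).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension (not standard C): accesses through this type are
   exempt from type-based alias analysis, much like accesses through char. */
typedef uint32_t __attribute__((__may_alias__)) u32_alias;

/* Hypothetical helper: copies size bytes in 4-byte units.
   Assumes size is a multiple of 4, both pointers are suitably aligned
   for uint32_t, and the two regions do not overlap. */
void memcpy4_may_alias(void *dst, const void *src, size_t size)
{
    assert(size % sizeof(uint32_t) == 0);
    assert((uintptr_t)dst % sizeof(uint32_t) == 0);
    assert((uintptr_t)src % sizeof(uint32_t) == 0);

    u32_alias *d = dst;
    const u32_alias *s = src;

    for (size_t i = 0; i < size / sizeof(uint32_t); i++)
        d[i] = s[i];
}
```

A call such as memcpy4_may_alias(dst, src, sizeof src) on two arrays of double would then copy the object representation word by word without the compiler assuming the uint32_t accesses cannot touch the doubles. On a compiler without this attribute, the sketch does not apply.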
Additionally, C implementations may optimize memcpy calls where they appear in source code, replacing them by direct instructions to perform the operation or by simply changing the internal semantics of the program. (E.g., if you copy a into b, the compiler might not perform a copy at all but might simply load from a where subsequent code accesses b.)
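Because of that optimization, a word-sized copy can be written in conforming C by calling the standard memcpy with a small constant size and letting the compiler decide how to lower it. The following is a sketch, not a replacement for memcpy: the name memcpy4_via_memcpy is hypothetical, it assumes size is a multiple of 4 and that the regions do not overlap, and whether each call becomes a single load or store depends on the compiler and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies size bytes in 4-byte units without ever
   dereferencing a uint32_t pointer into the caller's buffers.
   Assumes size is a multiple of 4 and the regions do not overlap. */
void memcpy4_via_memcpy(void *dst, const void *src, size_t size)
{
    assert(size % sizeof(uint32_t) == 0);

    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < size; i += sizeof(uint32_t)) {
        uint32_t word;
        /* Fixed-size memcpy calls; an optimizing compiler may turn each
           one into a plain 32-bit load or store, but is not required to. */
        memcpy(&word, s + i, sizeof word);
        memcpy(d + i, &word, sizeof word);
    }
}
```

This version needs no alignment guarantee and no aliasing switch, since the only typed object accessed as uint32_t is the local variable word.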
To implement your own specialized copy operation while violating aliasing rules, compile it with -fno-strict-aliasing if you are using GCC or Clang. If you are using another compiler, check its documentation for an option to disable the aliasing rules. (Note: Apple’s GCC, which I use, disables strict aliasing by default and accepts -fstrict-aliasing to enable it. The disabling option is spelled -fno-strict-aliasing, with a hyphen after "no"; the unhyphenated spelling -fnostrict-aliasing is rejected.)
If you are using a good C implementation, you may find that your four-byte-copy implementation of memcpy4 does not perform as well as the native memcpy, depending on circumstances.