According to the gcc docs, memcmp is not an intrinsic function of GCC. If you wanted to speed up glibc's memcmp under gcc, you would need to use the lower level intrinsics defined in the docs. However, when searching around the internet, it seems that many people have the impression that memcmp is a builtin function. Is it for some compilers and not for others?
Answer 1:
Your link appears to cover only the x86 architecture-specific built-in functions. According to GCC's documentation of its other (general) built-ins, memcmp is implemented as an architecture-independent built-in by gcc.
Edit:
Compiling the following code with Cygwin gcc version 3.3.1 for i686, -O2:
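#include <string.h>   /* declares memcmp (the original snippet included <stdlib.h>) */

struct foo {
    int a;
    int b;
};

/* Compare two structs bytewise; the size is a compile-time constant (8 bytes on i686). */
int func(struct foo *x, struct foo *y)
{
    return memcmp(x, y, sizeof (struct foo));
}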
Produces the following output (note that the call to memcmp() is converted to an 8-byte "repz cmpsb"):
   0:   55                      push   %ebp
   1:   b9 08 00 00 00          mov    $0x8,%ecx
   6:   89 e5                   mov    %esp,%ebp
   8:   fc                      cld
   9:   83 ec 08                sub    $0x8,%esp
   c:   89 34 24                mov    %esi,(%esp)
   f:   8b 75 08                mov    0x8(%ebp),%esi
  12:   89 7c 24 04             mov    %edi,0x4(%esp)
  16:   8b 7d 0c                mov    0xc(%ebp),%edi
  19:   f3 a6                   repz cmpsb %es:(%edi),%ds:(%esi)
  1b:   0f 92 c0                setb   %al
  1e:   8b 34 24                mov    (%esp),%esi
  21:   8b 7c 24 04             mov    0x4(%esp),%edi
  25:   0f 97 c2                seta   %dl
  28:   89 ec                   mov    %ebp,%esp
  2a:   5d                      pop    %ebp
  2b:   28 c2                   sub    %al,%dl
  2d:   0f be c2                movsbl %dl,%eax
  30:   c3                      ret
  31:   90                      nop
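To repeat this experiment on another compiler version, one option is to compare a plain memcmp call against GCC's explicit __builtin_memcmp spelling. The following is a minimal sketch: the function names cmp_plain and cmp_builtin are made up for illustration, and whether either call is expanded inline depends on the compiler version, target and flags.

```c
#include <string.h>

struct foo {
    int a;
    int b;
};

/* Plain call: GCC may treat this as a built-in and expand it inline,
   or it may emit a call to the library memcmp. */
int cmp_plain(const struct foo *x, const struct foo *y)
{
    return memcmp(x, y, sizeof (struct foo));
}

/* Explicit built-in spelling (accepted by GCC and Clang). The __builtin_
   prefix keeps the built-in meaning even under -fno-builtin; the compiler
   may still fall back to a library call when it decides not to expand it. */
int cmp_builtin(const struct foo *x, const struct foo *y)
{
    return __builtin_memcmp(x, y, sizeof (struct foo));
}
```

Compiling once with `gcc -O2 -S` and once with `gcc -O2 -fno-builtin -S`, then comparing the assembly for cmp_plain, shows whether the plain call was being handled as a built-in.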
Answer 2:
Note that the repz cmpsb routine might not be faster than glibc's memcmp. In my tests, in fact, it's never faster, even when comparing just a few bytes.
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
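A rough way to run such a comparison yourself is a small timing harness like the one below. This is only a sketch, not the test used in this answer: bytewise_cmp is a portable byte-at-a-time loop standing in for "repz cmpsb" (it is not the same instruction sequence), and the names bytewise_cmp, cmp_fn and time_cmp are hypothetical. Calling memcmp through a function pointer is a deliberate choice here, since it keeps the compiler from expanding it inline, so the library routine is what gets measured.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Byte-at-a-time comparison: a portable stand-in for what
   "repz cmpsb" does, NOT the actual instruction. */
int bytewise_cmp(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    }
    return 0;
}

typedef int (*cmp_fn)(const void *, const void *, size_t);

/* Time `iters` calls of `fn` on two n-byte buffers.
   Returns elapsed CPU time in seconds. */
double time_cmp(cmp_fn fn, const void *a, const void *b, size_t n, long iters)
{
    volatile int sink = 0;  /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink = fn(a, b, n);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_cmp(memcmp, a, b, n, iters)` and `time_cmp(bytewise_cmp, a, b, n, iters)` for several small values of n gives a first impression; results vary with CPU, libc version and compiler flags.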
Answer 3:
As of 2017, GCC and Clang seem to have optimizations for buffers of sizes 1, 2, 4 and 8, and for some other sizes as well, for example 3, 5 and multiples of 8.
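One way to check this for a given compiler is to compile small functions that call memcmp with a constant size and look at the generated assembly. A minimal sketch follows; the function names are hypothetical, and whether the calls are really inlined is an assumption that depends on the compiler version and optimization level.

```c
#include <stdint.h>
#include <string.h>

/* Equality test on 4 bytes via memcmp with a constant size.
   An optimizing compiler may reduce this to one 32-bit compare. */
int eq4_memcmp(const void *a, const void *b)
{
    return memcmp(a, b, 4) == 0;
}

/* Hand-written equivalent for comparison: load both sides into 32-bit
   integers. memcpy is used for the loads to avoid alignment and
   aliasing problems. */
int eq4_manual(const void *a, const void *b)
{
    uint32_t x, y;
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);
    return x == y;
}

/* An odd size (3 bytes), one of the "other" sizes mentioned above. */
int eq3_memcmp(const void *a, const void *b)
{
    return memcmp(a, b, 3) == 0;
}
```

If the assembly for eq4_memcmp and eq4_manual comes out the same at -O2, the compiler has expanded the fixed-size memcmp inline instead of calling the library.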