Why use bzero over memset?

2019-01-12 17:18发布

站内文章 / C

65 0

我命由我不由天

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

In a Systems Programming class I took this previous semester, we had to implement a basic client/server in C. When initializing the structs, like sock_addr_in, or char buffers (that we used to send data back and forth between client and server) the professor instructed us to only use bzero and not memset to initialize them. He never explained why, and I'm curious if there is a valid reason for this?

I see here: http://fdiv.net/2009/01/14/memset-vs-bzero-ultimate-showdown that bzero is more efficient due to the fact that is only ever going to be zeroing memory, so it doesn't have to do any additional checking that memset may do. That still doesn't necessarily seem like a reason to absolutely not use memset for zeroing memory though.

bzero is considered deprecated, and furthermore is a not a standard C function. According to the manual, memset is preferred over bzero for this reason. So why would you want to still use bzero over memset? Just for the efficiency gains, or is it something more? Likewise, what are the benefits of memset over bzero that make it the de facto preferred option for newer programs?

回答1:

I don't see any reason to prefer bzero over memset.

memset is a standard C function while bzero has never been a C standard function. The rationale is probably because you can achieve exactly the same functionality using memset function.

Now regarding efficiency, compilers like gcc use builtin implementations for memset which switch to a particular implementation when a constant 0 is detected. Same for glibc when builtins are disabled.

回答2:

I'm guessing you used (or your teacher was influenced by) UNIX Network Programming by W. Richard Stevens. He uses bzero frequently instead of memset, even in the most up-to-date edition. The book is so popular, I think it's become an idiom in network programming which is why you still see it used.

I would stick with memset simply because bzero is deprecated and reduces portability. I doubt you would see any real gains from using one over the other.

回答3:

The one advantage that I think bzero() has over memset() for setting memory to zero is that there's a reduced chance of a mistake being made.

More than once I've come across a bug that looked like:

memset(someobject, size_of_object, 0);    // clear object

The compiler won't complain (though maybe cranking up some warning levels might on some compilers) and the effect will be that the memory isn't cleared. Because this doesn't trash the object - it just leaves it alone - there's a decent chance that the bug might not manifest into anything obvious.

The fact that bzero() isn't standard is a minor irritant. (FWIW, I wouldn't be surprised if most function calls in my programs are non-standard; in fact writing such functions is kind of my job).

In a comment to another answer here, Aaron Newton cited the following from Unix Network Programming, Volume 1, 3rd Edition by Stevens, et al., Section 1.2 (emphasis added):

bzero is not an ANSI C function. It is derived from early Berkely networking code. Nevertheless, we use it throughout the text, instead of the ANSI C memset function, because bzero is easier to remember (with only two arguments) than memset (with three arguments). Almost every vendor that supports the sockets API also provides bzero, and if not, we provide a macro definition in our unp.h header.

Indeed, the author of TCPv3 [TCP/IP Illustrated, Volume 3 - Stevens 1996] made the mistake of swapping the second and third arguments to memset in 10 occurrences in the first printing. A C compiler cannot catch this error because both arguments are of the same type. (Actually, the second argument is an int and the third argument is size_t, which is typically an unsigned int, but the values specified, 0 and 16, respectively, are still acceptable for the other type of argument.) The call to memset still worked, because only a few of the socket functions actually require that the final 8 bytes of an Internet socket address structure be set to 0. Nevertheless, it was an error, and one that could be avoided by using bzero, because swapping the two arguments to bzero will always be caught by the C compiler if function prototypes are used.

I also believe that the vast majority of calls to memset() are to zero memory, so why not use an API that is tailored to that use case?

A possible drawback to bzero() is that compilers might be more likely to optimize memcpy() because it's standard and so they might be written to recognize it. However, keep in mind that correct code is still better than incorrect code that's been optimized. In most cases, using bzero() will not cause a noticeable impact on your program's performance, and that bzero() can be a macro or inline function that expands to memcpy().

回答4:

In short: memset require more assembly operations then bzero.

This is the source: http://fdiv.net/2009/01/14/memset-vs-bzero-ultimate-showdown

回答5:

Wanted to mention something about bzero vs. memset argument. Install ltrace and then compare what it does under the hood. On Linux with libc6 (2.19-0ubuntu6.6), the calls made are exactly the same (via ltrace ./test123):

long m[] = {0}; // generates a call to memset(0x7fffefa28238, '\0', 8)
int* p;
bzero(&p, 4);   // generates a call to memset(0x7fffefa28230, '\0', 4)

I've been told that unless I am working in the deep bowels of libc or any number of kernel/syscall interface, I don't have to worry about them. All I should worry about is that the call satisfy the requirement of zero'ing the buffer. Others have mentioned about which one is preferable over the other so I'll stop here.

回答6:

You probably shouldn't use bzero, it's not actually standard C, it was a POSIX thing.

And note that word "was" - it was deprecated in POSIX.1-2001 and removed in POSIX.1-2008 in deference to memset so you're better off using the standard C function.

回答7:

Have it any way you like. :-)

#ifndef bzero
#define bzero(d,n) memset((d),0,(n))
#endif

Note that:

The original bzero returns nothing, memset returns void pointer (d). This can be fixed by adding the typecast to void in the definition.
#ifndef bzero does not prevent you from hiding the original function even if it exists. It tests the existence of a macro. This may cause lots of confusion.
It’s impossible to create a function pointer to a macro. When using bzero via function pointers, this will not work.

回答8:

For memset function, the second argument is an int and the third argument is size_t,

void *memset(void *s, int c, size_t n);

which is typically an unsigned int, but if the values like, 0 and 16 for second and third argument respectively are entered in wrong order as 16 and 0 then, such a call to memset can still work, but will do nothing. Because the number of bytes to initialize are specified as 0.

void bzero(void *s, size_t n)

Such an error can be avoided by using bzero, because swapping the two arguments to bzero will always be caught by the C compiler if function prototypes are used.