What is the fastest way to swap values in C?

2020-01-24 03:41发布

I want to swap two integers, and I want to know which of these two implementations will be faster: The obvious way with a temp variable:

void swap(int* a, int* b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

Or the xor version that I'm sure most people have seen:

void swap(int* a, int* b)
{
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

It seems like the first uses an extra register, but the second one is doing three loads and stores while the first only does two of each. Can someone tell me which is faster and why? The why being more important.

标签: c performance
21条回答
别忘想泡老子
2楼-- · 2020-01-24 04:07

Another beautiful way.

#define Swap( a, b ) (a)^=(b)^=(a)^=(b)

Advantage

No need of function call and handy.

Drawback:

This fails when both inputs are same variable. It can be used only on integer variables.

查看更多
成全新的幸福
3楼-- · 2020-01-24 04:08

In my opinion local optimizations like this should only be considered tightly related to the platform. It makes a huge difference if you are compiling this on a 16 bit uC compiler or on gcc with x64 as target.

If you have a specific target in mind then just try both of them and look at the generated asm code or profile your applciation with both methods and see which is actually faster on your platform.

查看更多
时光不老,我们不散
4楼-- · 2020-01-24 04:09

For those to stumble upon this question and decide to use the XOR method. You should consider inlining your function or using a macro to avoid the overhead of a function call:

#define swap(a, b)   \
do {                 \
    int temp = a;    \
    a = b;           \
    b = temp;        \
} while(0)
查看更多
贪生不怕死
5楼-- · 2020-01-24 04:10

The first is faster because bitwise operations such as xor are usually very hard to visualize for the reader.

Faster to understand of course, which is the most important part ;)

查看更多
可以哭但决不认输i
6楼-- · 2020-01-24 04:12

All the top rated answers are not actually definitive "facts"... they are people who are speculating!

You can definitively know for a fact which code takes less assembly instructions to execute because you can look at the output assembly generated by the compiler and see which executes in less assembly instructions!

Here is the c code I compiled with flags "gcc -std=c99 -S -O3 lookingAtAsmOutput.c":

#include <stdio.h>
#include <stdlib.h>

void swap_traditional(int * restrict a, int * restrict b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

void swap_xor(int * restrict a, int * restrict b)
{
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

int main() {
    int a = 5;
    int b = 6;
    swap_traditional(&a,&b);
    swap_xor(&a,&b);
}

ASM output for swap_traditional() takes >>> 11 <<< instructions ( not including "leave", "ret", "size"):

.globl swap_traditional
    .type   swap_traditional, @function
swap_traditional:
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %edx
    movl    12(%ebp), %ecx
    pushl   %ebx
    movl    (%edx), %ebx
    movl    (%ecx), %eax
    movl    %ebx, (%ecx)
    movl    %eax, (%edx)
    popl    %ebx
    popl    %ebp
    ret
    .size   swap_traditional, .-swap_traditional
    .p2align 4,,15

ASM output for swap_xor() takes >>> 11 <<< instructions not including "leave" and "ret":

.globl swap_xor
    .type   swap_xor, @function
swap_xor:
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %ecx
    movl    12(%ebp), %edx
    movl    (%ecx), %eax
    xorl    (%edx), %eax
    movl    %eax, (%ecx)
    xorl    (%edx), %eax
    xorl    %eax, (%ecx)
    movl    %eax, (%edx)
    popl    %ebp
    ret
    .size   swap_xor, .-swap_xor
    .p2align 4,,15

Summary of assembly output:
swap_traditional() takes 11 instructions
swap_xor() takes 11 instructions

Conclusion:
Both methods use the same amount of instructions to execute and therefore are approximately the same speed on this hardware platform.

Lesson learned:
When you have small code snippets, looking at the asm output is helpful to rapidly iterate your code and come up with the fastest ( i.e. least instructions ) code. And you can save time even because you don't have to run the program for each code change. You only need to run the code change at the end with a profiler to show that your code changes are faster.

I use this method a lot for heavy DSP code that needs speed.

查看更多
男人必须洒脱
7楼-- · 2020-01-24 04:12

Below piece of code will do the same. This snippet is optimized way of programming as it doesn't use any 3rd variable.

  x = x ^ y;
  y = x ^ y;
  x = x ^ y;
查看更多
登录 后发表回答