Do modern compilers optimize the x * 2 operation t

2020-02-18 09:21发布

Does the C++ compiler optimize the multiply by two operation x*2 to a bitshift operation x<<1?

I would love to believe that yes.

13条回答
Deceive 欺骗
2楼-- · 2020-02-18 10:01

It depends on what compiler you have. Visual C++ for example is notoriously poor in optimizing. If you edit your post to say what compiler you are using, it would be easier to answer.

查看更多
我命由我不由天
3楼-- · 2020-02-18 10:03

Not if x is a float it won't.

查看更多
爱情/是我丢掉的垃圾
4楼-- · 2020-02-18 10:07

I'm sure they all do these kind of optimizations, but I wonder if they are still relevant. Older processors did multiplication by shifting and adding, which could take a number of cycles to complete. Modern processors, on the other hand, have a set of barrel-shifters which can do all the necessary shifts and additions simultaneously in one clock cycle or less. Has anyone actually benchmarked whether these optimizations really help?

查看更多
地球回转人心会变
5楼-- · 2020-02-18 10:10

VS 2008 optimized mine to x << 1.

    x = x * 2;
004013E7  mov         eax,dword ptr [x] 
004013EA  shl         eax,1 
004013EC  mov         dword ptr [x],eax 

EDIT: This was using VS default "Debug" configuration with optimization disabled (/Od). Using any of the optimization switches (/O1, /O2 (VS "Retail"), or /Ox) results in the the add self code Rob posted. Also, just for good measure, I verified x = x << 1 is indeed treated the same way as x = x * 2 by the cl compiler in both /Od and /Ox. So, in summary, cl.exe version 15.00.30729.01 for x86 treats * 2 and << 1 identically and I expect nearly all other recent compilers do the same.

查看更多
SAY GOODBYE
6楼-- · 2020-02-18 10:11

Actually VS2008 optimizes this to x+x:

01391000  push        ecx  
    int x = 0;

    scanf("%d", &x);
01391001  lea         eax,[esp] 
01391004  push        eax  
01391005  push        offset string "%d" (13920F4h) 
0139100A  mov         dword ptr [esp+8],0 
01391012  call        dword ptr [__imp__scanf (13920A4h)] 

    int y = x * 2;
01391018  mov         ecx,dword ptr [esp+8] 
0139101C  lea         edx,[ecx+ecx] 

In an x64 build it is even more explicit and uses:

    int y = x * 2;
000000013FB9101E  mov         edx,dword ptr [x] 

    printf("%d", y);
000000013FB91022  lea         rcx,[string "%d" (13FB921B0h)] 
000000013FB91029  add         edx,edx 

This is will the optimization settings on 'Maximize speed' (/O2)

查看更多
手持菜刀,她持情操
7楼-- · 2020-02-18 10:12

@Ferruccio Barletta

That's a good question. I went Googling to try to find the answer.

I couldn't find answers for Intel processors directly, but this page has someone who tried to time things. It shows shifts to be more than twice as fast as ads and multiplies. Bit shifts are so simple (where a multiply could be a shift and an addition) that this makes sense.

So then I Googled AMD, and found an old optimization guide for the Athlon from 2002 that lists that lists the fastest ways to multiply numbers by contants between 2 and 32. Interestingly, it depends on the number. Some are ads, some shifts. It's on page 122.

A guide for the Athlon 64 shows the same thing (page 164 or so). It says multiplies are 3 (in 32-bit) or 4 (in 64-bit) cycle operations, where shifts are 1 and adds are 2.

It seems it is still useful as an optimization.

Ignoring cycle counts though, this kind of method would prevent you from tying up the multiplication execution units (possibly), so if you were doing lots of multiplications in a tight loop where some use constants and some don't the extra scheduling room might be useful.

But that's speculation.

查看更多
登录 后发表回答