What are your favorite low-level code optimizations?

Posted 2020-05-19 02:44

I know that you should only optimize when it is deemed necessary. But when it is, what are your favorite low-level (as opposed to algorithmic-level) optimization tricks?

For example: loop unrolling.

24 answers
smile是对你的礼貌
#2 · 2020-05-19 03:10

Don't do loop unrolling. Don't do Duff's device. Keep your loops as small as possible; anything else hurts x86 performance and gets in the way of gcc's optimizer.

Getting rid of branches can be useful, though, so getting rid of loops completely is good, and those branchless math tricks really do work. Beyond that, try to keep your working set inside the L2 cache. This means a lot of precalculation/caching should also be avoided if it wastes cache space.
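To make the "branchless math tricks" concrete, here is a minimal sketch of two common ones. The function names are made up for illustration, and whether either version beats the branchy equivalent depends on the compiler and on how predictable the data is, so measure before keeping it.

```cpp
#include <cstddef>
#include <cstdint>

// Counts elements below `limit` without an `if` in the loop body:
// the comparison yields 0 or 1, which is added directly to the counter.
std::size_t count_below(const std::uint32_t* data, std::size_t n, std::uint32_t limit) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        count += static_cast<std::size_t>(data[i] < limit);
    }
    return count;
}

// Selects `a` when cond is true, otherwise `b`, using a bit mask.
// 0u - 1u wraps to all ones and 0u - 0u stays zero (unsigned arithmetic is modular).
std::uint32_t select_u32(bool cond, std::uint32_t a, std::uint32_t b) {
    const std::uint32_t mask = 0u - static_cast<std::uint32_t>(cond);
    return (a & mask) | (b & ~mask);
}
```

An optimizing compiler may already turn a plain `cond ? a : b` into a conditional move, so the mask version mostly matters when the generated code shows a real branch.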

And, especially for x86, try to keep the number of variables in use at any one time down. It's hard to tell what compilers will do with that kind of thing, but having fewer loop iteration variables/array indexes will usually end up with better asm output.

Of course, this is for desktop CPUs; a slow CPU with fast memory access can precalculate a lot more, but these days that might be an embedded system with little total memory anyway…

走好不送
#3 · 2020-05-19 03:11

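Picking a power of two for the size of filters, circular buffers, etc. Wrapping an index then becomes a single bitwise AND instead of a modulo or a compare-and-reset.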

So very, very convenient.
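A rough sketch of why the power-of-two size is convenient, assuming a simple single-threaded FIFO (the `RingBuffer` class and its members are hypothetical names, not from any library): with a capacity of 2^k, `index & (N - 1)` gives the same result as `index % N`.

```cpp
#include <array>
#include <cstddef>

template <typename T, std::size_t N>
class RingBuffer {
    static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Returns false when the buffer is full.
    bool push(const T& value) {
        if (size() == N) return false;
        buf_[head_ & kMask] = value;  // mask instead of head_ % N
        ++head_;
        return true;
    }

    // Returns false when the buffer is empty.
    bool pop(T& out) {
        if (head_ == tail_) return false;
        out = buf_[tail_ & kMask];
        ++tail_;
        return true;
    }

    // The counters run freely; unsigned wraparound keeps the difference correct.
    std::size_t size() const { return head_ - tail_; }

private:
    static constexpr std::size_t kMask = N - 1;
    std::array<T, N> buf_{};
    std::size_t head_ = 0;  // total number of pushes so far
    std::size_t tail_ = 0;  // total number of pops so far
};
```

For example, `RingBuffer<int, 4>` accepts four pushes, rejects a fifth, and hands the values back in insertion order.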

-Adam

姐就是有狂的资本
#4 · 2020-05-19 03:11

Inspect the compiler's output, then try to coerce it to do something faster.

▲ chillily
#5 · 2020-05-19 03:11

Using template metaprogramming to calculate things at compile time instead of at run time.
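As an illustration only, here is the classic factorial example in two forms: recursive template instantiation, and a `constexpr` function, which is usually the simpler route from C++11 onward. The names `Factorial` and `factorial` are made up for this sketch.

```cpp
#include <cstdint>

// Template metaprogramming: the recursion happens during template instantiation.
template <unsigned N>
struct Factorial {
    static constexpr std::uint64_t value = N * Factorial<N - 1>::value;
};

// Base case ends the recursion.
template <>
struct Factorial<0> {
    static constexpr std::uint64_t value = 1;
};

// constexpr alternative: an ordinary-looking function the compiler can
// evaluate at compile time when it is called with constant arguments.
constexpr std::uint64_t factorial(unsigned n) {
    return n == 0 ? 1 : n * factorial(n - 1);
}

// Both checks are evaluated entirely at compile time.
static_assert(Factorial<10>::value == 3628800, "10! computed by templates");
static_assert(factorial(10) == 3628800, "10! computed by constexpr");
```

Either way the result ends up as a constant in the binary, so nothing is computed at run time.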

再贱就再见
#6 · 2020-05-19 03:11

Constructing objects in a pre-allocated buffer with C++'s placement new, which avoids a heap allocation per object.
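A minimal sketch of the idea, assuming a small fixed number of objects living in a stack buffer (`Message` and `total_text_length` are hypothetical names for illustration). Objects created this way must be destroyed by calling the destructor explicitly, never with `delete`.

```cpp
#include <cstddef>
#include <new>      // declares placement new
#include <string>
#include <utility>  // std::move

struct Message {
    int id;
    std::string text;
    Message(int i, std::string t) : id(i), text(std::move(t)) {}
};

std::size_t total_text_length() {
    constexpr std::size_t kCount = 3;

    // Raw storage with the right size and alignment; no Message exists in it yet.
    alignas(Message) unsigned char buffer[kCount * sizeof(Message)];
    Message* items[kCount];

    // Construct each object in place; no allocation is made for the Message itself.
    for (std::size_t i = 0; i < kCount; ++i) {
        items[i] = new (buffer + i * sizeof(Message))
            Message(static_cast<int>(i), std::string(i + 1, 'x'));
    }

    std::size_t total = 0;
    for (std::size_t i = 0; i < kCount; ++i) {
        total += items[i]->text.size();
    }

    // Destroy explicitly, in reverse order of construction.
    for (std::size_t i = kCount; i > 0; --i) {
        items[i - 1]->~Message();
    }
    return total;
}
```

The `std::string` members may still allocate on their own; placement new only removes the allocation for the enclosing object.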

Lonely孤独者°
#7 · 2020-05-19 03:12

Years ago, with a not-so-smart compiler, I got great mileage from function inlining, walking pointers instead of indexing arrays, and iterating down to zero instead of up to a maximum.
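For reference, the three loop shapes look roughly like this (function names are made up for the sketch). A current optimizing compiler will quite possibly emit near-identical code for all three, so treat the differences as something to check in the generated assembly rather than a guaranteed win.

```cpp
#include <cstddef>

// Baseline: index the array and count up to n.
long sum_indexed(const int* a, std::size_t n) {
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i];
    }
    return sum;
}

// Walking a pointer: no index variable, just a pointer compared against the end.
long sum_pointer(const int* a, std::size_t n) {
    long sum = 0;
    for (const int* p = a, *end = a + n; p != end; ++p) {
        sum += *p;
    }
    return sum;
}

// Counting down to zero: the loop test is a comparison against zero,
// which older compilers could fold into the decrement.
long sum_countdown(const int* a, std::size_t n) {
    long sum = 0;
    while (n != 0) {
        --n;
        sum += a[n];
    }
    return sum;
}
```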

When in doubt, a little knowledge of assembly will let you look at what the compiler is producing and attack the inefficient parts (in your source language, using structures friendlier to your compiler).
