What are your favorite low level code optimization - answers, page 4

I know that you should only optimize things when it is deemed necessary. But if it is deemed necessary, what are your favorite low-level (as opposed to algorithmic-level) optimization tricks?

For example: loop unrolling.

Tags: optimization

24 answers

SAY GOODBYE

#2 · 2020-05-19 03:25

++i can be faster than i++, because it avoids creating a temporary.

Whether this still holds for modern C/C++/Java/C# compilers, I don't know. It might well be different for user-defined types with overloaded operators, whereas in the case of simple integers it probably doesn't matter.

But I've come to like the syntax... it reads like "increment i" which is a sensible order.


冷血范

#3 · 2020-05-19 03:25

Optimizing cache locality - for example when multiplying two matrices that don't fit into cache.
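One common way to get there (a sketch, not necessarily what the answerer had in mind) is to reorder the loops so the innermost one walks memory with stride 1. The code below assumes square n x n matrices stored row-major in flat arrays; the function names matmul_ijk and matmul_ikj are made up for illustration. Both produce the same result, only the memory access pattern differs.

```c
#include <stddef.h>

/* Naive i-j-k order: the inner loop reads b down a column (stride n),
   so for large n nearly every iteration touches a different cache line. */
void matmul_ijk(size_t n, const double *a, const double *b, double *c) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = 0.0;
            for (size_t k = 0; k < n; k++) {
                acc += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}

/* i-k-j order: the inner loop reads b and updates c along rows (stride 1),
   so consecutive iterations reuse the cache lines already loaded. */
void matmul_ikj(size_t n, const double *a, const double *b, double *c) {
    for (size_t i = 0; i < n * n; i++) {
        c[i] = 0.0;
    }
    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            double aik = a[i * n + k];
            for (size_t j = 0; j < n; j++) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}
```

For matrices far larger than the cache, the usual next step is blocking (tiling) the loops so each tile is reused while it is still resident. How much either change helps depends on the machine, so measure it.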


Explosion°爆炸

#4 · 2020-05-19 03:25

Rolling up loops.

Seriously, the last time I needed to do anything like this was in a function that took 80% of the runtime, so it was worth trying to micro-optimize if I could get a noticeable performance increase.

The first thing I did was to roll up the loop. This gave me a very significant speed increase. I believe this was a matter of cache locality.

The next thing I did was add a layer of indirection, and put some more logic into the loop, which allowed me to only loop through the things I needed. This wasn't as much of a speed increase, but it was worth doing.

If you're going to micro-optimize, you need to have a reasonable idea of two things: the architecture you're actually using (which is vastly different from the systems I grew up with, at least for micro-optimization purposes), and what the compiler will do for you.

A lot of the traditional micro-optimizations trade space for time. Nowadays, using more space increases the chances of a cache miss, and there goes your performance. Moreover, a lot of them are now done by modern compilers, and typically better than you're likely to do them.

Currently, you should (a) profile to see if you need to micro-optimize, and then (b) try to trade computation for space, in the hope of keeping as much as possible in cache. Finally, run some tests, so you know if you've improved things or screwed them up. Modern compilers and chips are far too complex for you to keep a good mental model, and the only way you'll know if some optimization works or not is to test.
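For the "run some tests" part, a very small timing harness is often enough to tell whether a change helped. This is only a sketch using the standard clock() function; the names best_cpu_seconds and sample_workload are hypothetical, and sample_workload is just a stand-in for the code you actually care about. A real profiler will tell you more.

```c
#include <time.h>

/* Stand-in workload (hypothetical). The volatile sink keeps the compiler
   from discarding the loop as dead code. */
static volatile unsigned long sink;

void sample_workload(void) {
    unsigned long acc = 0;
    for (unsigned long i = 0; i < 1000000UL; i++) {
        acc += i * i;
    }
    sink = acc;
}

/* Runs fn `repeats` times and returns the smallest CPU time in seconds.
   Taking the minimum filters out some noise from other processes.
   Returns -1.0 if repeats is not positive. */
double best_cpu_seconds(void (*fn)(void), int repeats) {
    double best = -1.0;
    for (int r = 0; r < repeats; r++) {
        clock_t t0 = clock();
        fn();
        clock_t t1 = clock();
        double s = (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || s < best) {
            best = s;
        }
    }
    return best;
}
```

Time the old and new versions with the same compiler flags and the same input, and only keep the change if the difference is repeatable.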


走好不送

#5 · 2020-05-19 03:27

I was amazed at the speedup I got by replacing a for loop that adds numbers together in structs with an equivalent if/do-while loop:

#include <stdio.h>
#include <stdlib.h>

const unsigned long SIZE = 100000000;

typedef struct {
    int a;
    int b;
    int result;
} addition;

addition *sum;

void start() {
    /* size_t so the byte count cannot overflow a 32-bit unsigned int */
    size_t byte_count = SIZE * sizeof(addition);

    sum = malloc(byte_count);
    if (sum == NULL) { /* about 1.2 GB with 4-byte ints, so this can fail */
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    unsigned int i = 0;

    if (i < SIZE) {
        do {
            sum[i].a = i;
            sum[i].b = i;
            i++;
        } while (i < SIZE);
    }
}

void test_func() {
    unsigned int i = 0;

    if (i < SIZE) { // this is about 30% faster than the more obvious for loop, even with O3
        do {
            addition *s1 = &sum[i];
            s1->result = s1->b + s1->a;
            i++;
        } while (i < SIZE);
    }
}

void finish() {
    free(sum);
}

Why doesn't gcc optimise for loops into this? Or is there something I missed? Some cache effect?
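For anyone wanting to reproduce the comparison, here is a minimal sketch of the two loop forms side by side. The function names add_for and add_do_while, and passing the array and length as parameters instead of using globals, are made up for illustration. Both forms compute the same results, so any timing difference has to come from the generated code; it is worth looking at the compiler's assembly output (for example with gcc -S) before drawing conclusions.

```c
#include <stddef.h>

typedef struct {
    int a;
    int b;
    int result;
} addition;

/* The "obvious" form: a plain top-tested for loop. */
void add_for(addition *items, size_t n) {
    for (size_t i = 0; i < n; i++) {
        items[i].result = items[i].b + items[i].a;
    }
}

/* The form from the answer: one guard test, then a bottom-tested loop. */
void add_do_while(addition *items, size_t n) {
    size_t i = 0;
    if (i < n) {
        do {
            addition *s1 = &items[i];
            s1->result = s1->b + s1->a;
            i++;
        } while (i < n);
    }
}
```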


看我几分像从前

#6 · 2020-05-19 03:29

Liberal use of __restrict to eliminate load-hit-store stalls.
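A small sketch of what this looks like, assuming a compiler that accepts the __restrict spelling (GCC, Clang and MSVC do; C99 spells it restrict). The function names are made up for illustration. Without the qualifier the compiler has to assume *total might alias some values[i], so it typically stores the running sum to memory and reloads it on every iteration, and that store followed by a load of the same address is what causes the stall on some in-order cores. With __restrict it is allowed to keep the sum in a register and store once at the end.

```c
#include <stddef.h>

/* total and values may alias as far as the compiler knows, so *total
   is usually written back and re-read on each iteration. */
void sum_may_alias(int *total, const int *values, size_t n) {
    for (size_t i = 0; i < n; i++) {
        *total += values[i];
    }
}

/* __restrict is a promise from the caller that total and values do not
   overlap, which lets the compiler accumulate in a register. */
void sum_no_alias(int *__restrict total, const int *__restrict values, size_t n) {
    for (size_t i = 0; i < n; i++) {
        *total += values[i];
    }
}
```

The promise is not checked: calling sum_no_alias with overlapping pointers is undefined behavior, so only add the qualifier where you can guarantee no aliasing.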


放荡不羁爱自由

#7 · 2020-05-19 03:31

Jon Bentley's Writing Efficient Programs is a great source of low- and high-level techniques -- if you can find a copy.
