Two loop bodies or one (result identical)-第2页回答

Two loop bodies or one (result identical)

2019-06-16 02:56发布

I have long wondered what is more efficient with regards to making better use of CPU caches (which are known to benefit from locality of reference) - two loops each iterating over the same mathematical set of numbers, each with a different loop body, or having one loop that "concatenates" the two bodies into one, and thus accomplishes identical total result, but all in itself?

In my opinion, having two loops would introduce fewer cache misses and evictions because more instructions and data used by the loop fit in the cache. Am I right?

Assuming:

Cost of f and g each is negligible compared to cost of completing the entire loop containing each
f and g use most of the cache each by itself, and so the cache will be invalidated by one being called after another (which would be the case with a single-loop-body version)
Intel Core Duo CPU
C language source code
gcc compiler, no switches

The set being iterated over is a mathematical set, not a container of numbers in memory like a vector or a list. See the example below.

Please no answers of the "premature optimization is evil" character :-)

An example of the two-loops version that I am advocating for:

int j = 0, k = 0;

for(int i = 0; i < 1000000; i++)
{
    j += f(i);
}

for(int i = 0; i < 1000000; i++)
{
    k += g(i);
}

标签： c optimization caching cpu micro-optimization

7条回答

等我变得足够好

2楼-- · 2019-06-16 03:57

This seems like something the compiler could optimize for you so instead of trying to figure it out yourself and making it fast, use whatever method makes your code more clear and readable. If you really must know, time both methods for input size and calculation type that your application uses (try the code you have now but repeat your calculations many many times and disable optimization).

0人赞添加讨论(0) 举报

上一页 1 2

Two loop bodies or one (result identical)

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间