Which of these two dimensional array are advantage

2019-03-01 14:41发布

In a facebook group, I saw a question like :

If a row dominated two dimensional array in the following which one is advantage and why?

a) for(i=0;i<1000;i++)
      for(j=0;j<1000;j++)
           temp=temp+a[i][j];

b) for(j=0;j<1000;j++)
      for(i=0;i<1000;i++)
         temp=temp+a[i][j]

From my point I can see there is no difference in the above two statements. And I guess these both are same.

Correct me if am wrong?

标签: c++ c caching
5条回答
淡お忘
2楼-- · 2019-03-01 14:45

According to me The Correct One is 'A' :Because when we use array of 2-D it is easy for assessment Of element Row by Row to cache ,than Column by Column, Since Cache is used for fast Assessment ,if more hit than it behaves fast so first one is Correct, Thanx Have a nice day.

查看更多
做个烂人
3楼-- · 2019-03-01 14:46

Variant (a) is better, since it's much more cache-friendly, than (b): you access consecutive elements, so there are less cache misses. However, some compilers can rearrange your loops, so when compiled both variants might produce same machine codes.

Besides performance, there is no difference between this two variants, i.e. thay produce the same output.

查看更多
【Aperson】
4楼-- · 2019-03-01 14:52

When the CPU wants to read data/code from the memory, chunks of data or moved from the memory to cache. Cache is much faster than RAM and much more expensive, so you have little of it. The idea behind this is that usually when you read one part of memory, you probably read other parts that are close to it.

In method a, you read the array row by row, and therefore you keep the locality. That is, on the first read from a row, the row is loaded in cache, and the rest of the row is read from cache (cache hit), so you have a high cache hit rate which is good.

In method b, you are deliberately accessing the array in a non contiguous way, so you get a lot of cache misses, and you need to keep reading from memory all the time.

查看更多
SAY GOODBYE
5楼-- · 2019-03-01 14:53

There is no difference in theory.

The practical advantage is in cache locality. If you access locations which are far apart, you increase the number of cache misses, which in turn makes the code run slower.

Depending on your processor cache size, you would perhaps need to replace 1000 with some reasonably bigger number in order to perceive the effect.

查看更多
forever°为你锁心
6楼-- · 2019-03-01 15:03

However, there could very well be extenuating circumstances. For instance, if the numbers are both positive and negative there could be patterns to them such that one order would lead to an overflow (or loss of precision if floating point) where the other would be much better behaved.

查看更多
登录 后发表回答