Main memory bandwidth measurement

I want to measure main memory bandwidth, and while looking for a methodology I found two approaches:

(1) Many people use the 'bcopy' function to copy bytes from a source to a destination, measure the time taken, and report the result as the bandwidth.
(2) Others allocate an array and walk through it (with some stride), which basically gives the time to read the entire array.

I tried (1) with a data size of 1 GB and the bandwidth I got is '700MB/sec' (I used rdtsc to count the number of cycles elapsed during the copy). But I suspect this is not correct, because my RAM config is as follows:

Speed: 1333 MHz
Bus width: 32bit

As per Wikipedia, the theoretical bandwidth is calculated as follows:

clock speed * bus width * number of bits per clock cycle per line (2 for DDR3 RAM)
1333 MHz * 32 * 2 ~= 8 GB/sec

So my result is completely different from the estimated bandwidth. Any idea what I am doing wrong?

=========

My other question: bcopy involves both a read and a write. Does that mean I should divide the calculated bandwidth by two to get the read-only or write-only bandwidth? I would also like to confirm whether bandwidth is just the inverse of latency. Please suggest any other ways of measuring the bandwidth.

Tags: bandwidth

1 answer

淡お忘

#2 · 2019-05-10 01:32

I can't comment on the effectiveness of bcopy, but the most straightforward approach is the second method you stated (with a stride of 1). Additionally, you are confusing bits with bytes in your memory bandwidth equation: 32 bits = 4 bytes. Modern computers use 64-bit-wide memory buses, so your theoretical peak transfer rate (assuming DDR3) is:

1333 MT/s * 64 bits / (8 bits/byte) ≈ 10666 MB/s (modules of this speed are sold as PC3-10600, sometimes written PC3-10666)

The 1333 figure is the data rate in megatransfers per second, so it already has the 2 transfers per clock factored in.

Check out the wiki page for more info: http://en.wikipedia.org/wiki/DDR3_SDRAM
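The calculation above can be written as a small helper, which makes it easy to compare the 32-bit figure from the question with the 64-bit one. This is only a sketch: the function name peak_bandwidth_mb_s and its parameters are made up for illustration, and it assumes a single memory channel.

```c
#include <stdio.h>

/* Theoretical peak transfer rate in MB/s for one memory channel.
 * mega_transfers_per_s: data rate in MT/s (1333 for DDR3-1333; this
 *                       figure already includes the 2 transfers per clock)
 * bus_width_bits:       data bus width in bits (assumed 64 for a DIMM) */
double peak_bandwidth_mb_s(double mega_transfers_per_s, int bus_width_bits)
{
    /* Divide by 8 to convert bits per transfer into bytes per transfer. */
    return mega_transfers_per_s * bus_width_bits / 8.0;
}
```

With the rounded 1333 MT/s input, a 64-bit bus gives 10664 MB/s; the commonly quoted 10666 comes from the unrounded 1333.33 MT/s data rate.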

Regarding your results, try again with the array access. Malloc 1GB and traverse the entire thing. You can sum each element of the array and print it out so your compiler doesn't think it's dead code.

Something like this:

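#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
int main(void) {
    size_t size = (size_t)1024 * 1024 * 1024;   /* 1 GiB */
    long sum = 0;                               /* must start at zero */
    char *array = malloc(size);
    if (!array) return 1;                       /* allocation failed */
    memset(array, 1, size);                     /* initialize and touch every page before timing */
    clock_t start = clock();                    /* start timer here */
    for (size_t i = 0; i < size; i++)
        sum += array[i];
    double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;   /* end timer */
    printf("time taken: %f \tsum is %ld\n", seconds, sum);
    free(array);
}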
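For comparison with the copy-based method from the question, here is a rough sketch that times a single memcpy and converts the result into MB/s. It counts both the bytes read and the bytes written, so halving the result gives an approximate one-direction figure. The names mb_per_s and copy_bandwidth_mb_s are hypothetical, clock() is used only because it is portable, and a real measurement would repeat the copy several times and use a finer timer.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Convert bytes moved and elapsed seconds into MB/s (1 MB = 10^6 bytes). */
double mb_per_s(double bytes_moved, double seconds)
{
    return bytes_moved / seconds / 1e6;
}

/* Time one memcpy of `size` bytes and return the memory traffic in MB/s.
 * Returns -1.0 if size is zero, allocation fails, or the copy finished
 * faster than the resolution of clock(). */
double copy_bandwidth_mb_s(size_t size)
{
    char *src = malloc(size);
    char *dst = malloc(size);
    double result = -1.0;

    if (size > 0 && src && dst) {
        /* Touch both buffers first so page faults are not timed. */
        memset(src, 1, size);
        memset(dst, 0, size);

        clock_t start = clock();
        memcpy(dst, src, size);
        double seconds = (double)(clock() - start) / CLOCKS_PER_SEC;

        /* Reading dst afterwards keeps the copy from being optimized away. */
        if (seconds > 0.0 && dst[size - 1] == 1)
            result = mb_per_s(2.0 * (double)size, seconds); /* size read + size written */
    }
    free(src);
    free(dst);
    return result;
}
```

Use a buffer much larger than the last-level cache, otherwise the number reflects cache bandwidth rather than main memory.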
