Evaluating memory bandwidth
“So, how fast can a memory-to-memory copy get?” I asked myself. The result is a simple program that measures exactly that, and I was surprised when I saw the results graphed. Here’s what I did: I allocated 2 x 1.6 GB of buffers and mlocked them (yes, I have 4 GB of physical RAM), then did a memcpy from one to the other in blocks of varying size and plotted the bandwidth against the block size.
Here’s data from another machine of mine, with ~2 GB of physical RAM. I had to limit myself to 2 x 800 MB buffers here.
From these two graphs it seems that the maximum bandwidth is achieved with a block size of around half the buffer size: not exactly half, but definitely not less than half (see the huge drop in bandwidth to its left). I guess this behaviour is due to the implementation of memcpy in glibc (v2.11.1-1.x86_64). I haven’t looked at the sources, so I can’t explain it yet.
I’m including the program source here. Please do leave a comment!
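/* Dumb memory bandwidth measurer
   Best run in runlevel 1
*/
#include <stdio.h>
#include <stdlib.h>     /* malloc, free */
#include <string.h>     /* memcpy */
#include <sys/mman.h>   /* mlock */
#include <sys/time.h>   /* gettimeofday, timersub */

#define MB * (1024UL * 1024UL)
#define KB * (1024UL)

int main(void)
{
    size_t i, j;
    struct timeval t1, t2, t3;
    char *a, *b;
    size_t nsteps, max;
    size_t mem = 1500;          /* size of each buffer, in MB */

    a = malloc(mem MB);
    b = malloc(mem MB);
    if (a == NULL || b == NULL) {
        perror("malloc");
        return 1;
    }
    /* Lock both buffers in RAM; this may need root or a raised RLIMIT_MEMLOCK */
    if (mlock(a, mem MB) != 0 || mlock(b, mem MB) != 0)
        perror("mlock");

    nsteps = 160;
    max = mem;
    /* j is the block size in MB */
    for (j = 1; j <= max; j += max / nsteps) {
        size_t nblocks = mem / j;   /* whole blocks that fit in the buffer */
        double secs;

        gettimeofday(&t1, NULL);
        for (i = 0; i < nblocks; i++)
            memcpy(a + i * j MB, b + i * j MB, j MB);
        gettimeofday(&t2, NULL);
        timersub(&t2, &t1, &t3);
        secs = t3.tv_sec + t3.tv_usec / 1000000.0;
        /* Only nblocks * j MB were copied, which is less than mem
           whenever j does not divide mem, so report that amount */
        printf("%zu %f\n", j, (nblocks * j) / secs);
    }
    free(a);
    free(b);
    return 0;
}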
Credit: data visualization using gnuplot.