First, here is about the simplest matrix multiplication you can write in C++:
#include <stdio.h>
#include <sys/time.h>
int main(int argc,char** argv)
{
    int Width = 835;
    int Height = 835;
    struct timeval s;
    long _start;
    long _end;
    int i;
    int j;
    // Declare three float arrays (variable-length arrays, a compiler extension in C++).
    // Width == Height here, so the same index ranges are valid for all three.
    float inputA[Height][Width];
    float inputB[Width][Height];
    float output[Height][Height];
    //gettimeofday(&s, 0);
    //_start = (long )s.tv_sec * 1000 + (long)s.tv_usec / 1000;
    // Initialize
    for(i = 0;i < Width; i = i + 1)
        for(j = 0;j<Height; j = j+1)
        {
            inputA[i][j] = (float)(i*0.01f+j*0.03f);
            inputB[i][j] = (float)(i*0.2f+j*0.1f);
            output[i][j] = 0.0f;
        }
    // Start timing (milliseconds since the epoch)
    gettimeofday(&s, 0);
    _start = (long )s.tv_sec * 1000 + (long)s.tv_usec / 1000;
    // Matrix multiplication
    for(i = 0;i < Width; i = i + 1)
        for(j = 0;j<Height; j = j+1)
            for(int k=0;k<Height;++k)
                output[i][j] += inputA[i][k]*inputB[k][j];
    // Stop timing
    gettimeofday(&s, 0);
    _end = (long)s.tv_sec * 1000 + (long)s.tv_usec / 1000;
    // Print the elapsed time in seconds
    printf("The Program cost time:\n");
    printf("%f\n",(double)(_end - _start)/1000);
    return 0;
}
I compiled this completely trivial program with both icc and g++. The run results are as follows
(times are in seconds):
1. No optimization options:
icc:
The Program cost time:
7.859000
g++:
The Program cost time:
13.468000
2. With the -O2 option:
icc:
The Program cost time:
7.913000
g++:
The Program cost time:
4.579000
3. With the -O3 option:
icc:
The Program cost time:
7.661000
g++:
The Program cost time:
4.849000
These results are really puzzling! Is Intel's optimizer actually worse than g++'s? I'm hoping someone more knowledgeable can explain this.