gprof是一个用来统计可执行程序数据的工具.它能帮助用户确定程序在哪个地方耗时过多.
被编译器生成指令后的代码将会被gprof捕获并进行计算,在gcc编译过程中使用-pg选项编译就会产生指令化工具,当执行编译后的二进制程序时,就会生成一个profile信息的概要文件.
gprof就是用这个文件对程序进行分析的,没有用-pg生成指令化工具的代码是不可测量的.
下面是一个演示程序,如下:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
double slow(double x)
{
return pow(x,1.12345);//计算x的y次幂
}
double slower(double x)
{
return 1.0/x;
}
double slowest(double x)
{
return sqrt(x);//平方根
}
int main (int argc, char *argv[])
{
int i;
double x;
for (i = 0;i < 3000000; i++){
x = 100.0;
x = slow(x);
x = slower(x);
x = slowest(x);
}
}
我们要用-pg选项对它进行编译,同时为达到最好的效率,我们加上-O2.
编译:
gcc -lm -O2 -pg test.c -o test
执行test程序,生成gmon.out文件
查一下gmon.out的文件类型:
file gmon.out
gmon.out: GNU prof performance data - version 1
执行gprof对程序进行分析:
gprof ./test
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ns/call ns/call name
43.48 0.10 0.10 3000000 33.33 33.33 slowest
21.74 0.15 0.05 main
17.39 0.19 0.04 3000000 13.33 13.33 slow
17.39 0.23 0.04 3000000 13.33 13.33 slower
% the percentage of the total running time of the
time program used by this function.
cumulative a running sum of the number of seconds accounted
seconds for by this function and those listed above it.
self the number of seconds accounted for by this
seconds function alone. This is the major sort for this
listing.
calls the number of times this function was invoked, if
this function is profiled, else blank.
self the average number of milliseconds spent in this
ms/call function per call, if this function is profiled,
else blank.
total the average number of milliseconds spent in this
ms/call function and its descendents per call, if this
function is profiled, else blank.
name the name of the function. This is the minor sort
for this listing. The index shows the location of
the function in the gprof listing. If the index is
in parenthesis it shows where it would appear in
the gprof listing if it were to be printed.
Call graph (explanation follows)
granularity: each sample hit covers 4 byte(s) for 4.35% of 0.23 seconds
index % time self children called name
<spontaneous>
[1] 100.0 0.05 0.18 main [1]
0.10 0.00 3000000/3000000 slowest [2]
0.04 0.00 3000000/3000000 slow [3]
0.04 0.00 3000000/3000000 slower [4]
-----------------------------------------------
0.10 0.00 3000000/3000000 main [1]
[2] 43.5 0.10 0.00 3000000 slowest [2]
-----------------------------------------------
0.04 0.00 3000000/3000000 main [1]
[3] 17.4 0.04 0.00 3000000 slow [3]
-----------------------------------------------
0.04 0.00 3000000/3000000 main [1]
[4] 17.4 0.04 0.00 3000000 slower [4]
-----------------------------------------------
此后省略
注:
gprof产生的信息
% the percentage of the total running time of the
time program used by this function.
函数使用时间占所有时间的百分比。
cumulative a running sumof the number of seconds accounted
seconds for by this function and those listed above it.
函数和上列函数累计执行的时间。
self the number of seconds accounted for by this
seconds function alone. This is the major sort for this
listing.
函数本身所执行的时间。
calls the number of times this function was invoked, if
this function is profiled, else blank.
函数被调用的次数
self the average number of milliseconds spent in this
ms/call function per call, if this function is profiled,
else blank.
每一次调用花费在函数的时间microseconds。
total the average number of milliseconds spent in this
ms/call function and its descendents per call, if this
function is profiled, else blank.
每一次调用,花费在函数及其衍生函数的平均时间microseconds。
name the name of the function. This is the minor sort
for this listing. The index shows the location of
the function in the gprof listing. If the index is
in parenthesis it shows where it would appear in
the gprof listing if it were to be printed.
函数名
然后我们再来分析:
我们看到gprof默认会输出两组信息,即平台剖析数据(Flat profile)和调用图(Call graph).
我们只看平台剖析数据:
cumulative seconds:代表的是时间的累加,如上例,运行slowest函数(调用300万次)共用了0.01秒,此时cumulative seconds=0.01,
运行main函数用了0.05秒,此时cumulative seconds=0.15,运行slower函数(调用300万次)共用了0.04秒,此时cumulative seconds=0.19,
最后slow函数(调用300万次)共用了0.04秒,最后cumulative seconds=0.23.
self seconds:代表的是函数本身的运行时间,比如slowest它占用的时间最长,被调用300万次,共用了0.10秒.