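I recently had to process a 250,000-line file. Doing it in shell was unbearably slow: even after I removed every construct that starts a new process through a pipe, such as echo 'aaa' | grep xxx, it still took 7 minutes.
So I picked up Perl again, which I had last used two years ago, and rewrote the script. It ran in 0.6 seconds!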
real 0m0.647s
user 0m0.560s
sys 0m0.032s
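Being a C++ enthusiast, I then rewrote it in C++, only to find that it still took 0.5 to 0.6 seconds. So C++ is not much faster than Perl after all?
I changed the compiler flags and added -O3, but the speed barely changed.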
real 0m0.560s
user 0m0.236s
sys 0m0.276s
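I figured the problem had to be in my own code.
The sys time was 0.27 s, far more than Perl's 0.03 s, so I guessed that the I/O was written badly.
Then I noticed that the program contained statements like cout << "xxx" << endl; and suspected that this was the problem, since endl writes a newline and then flushes the stream.
So I changed them to cout << "xxx\n";, which writes the newline without forcing a flush.
The speed improved immediately.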
real 0m0.205s
user 0m0.160s
sys 0m0.024s
That's all. The Perl script comes first, followed by the C++ version.
my $PREV_TIME="";
my $SUM=0;
my $INIT=1;
my $line;
my $TIME;
my $TPS;
my $MSG;
while(<>)
{
    # Remove the line break
    chomp;
    $line=$_;
    # Skip blank line
    if (!$line){next;}
    $TIME=substr($line,0,8);
    $TPS=substr($line,10,99);
    # Handle TPS Log is Enabled/Disabled
    $MSG=substr($TPS,0,10);
    if ( $MSG eq "TPS Log is" )
    {
        print "$line\n";
        next;
    }
    # Strip the trailing " TPS" so only the number remains
    $TPS=substr($TPS,0,rindex($TPS, "TPS")-1);
    if ($INIT==1)
    {
        $PREV_TIME=$TIME;
        $INIT=0;
    }
    if ($PREV_TIME eq $TIME)
    {
        # Same second: keep accumulating
        $SUM=$SUM+$TPS;
    }else
    {
        # New second: print the previous total and start over
        print "$PREV_TIME $SUM TPS\n";
        $SUM=$TPS;
    }
    $PREV_TIME=$TIME;
}
# Print the total for the last second
print "$PREV_TIME $SUM TPS\n";
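#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;

int main()
{
    string PREV_TIME;
    int SUM = 0;
    int INIT = 1;
    string line;
    string TIME;
    string TPS;
    string MSG;
    int tps_number = 0;

    ifstream ifs("/var/tmp/sorted.tmp");
    if (!ifs)
    {
        cerr << "cannot open /var/tmp/sorted.tmp\n";
        return 1;
    }
    // Test the stream itself rather than eof(): the loop then stops on any
    // read failure and still handles a last line without a trailing newline.
    while (getline(ifs, line))
    {
        // Skip blank lines
        if (line.empty())
            continue;
        TIME = line.substr(0, 8);
        // Guard against short lines, where substr(10, ...) would throw
        TPS = line.size() > 10 ? line.substr(10, 99) : "";
        // Handle "TPS Log is Enabled/Disabled"
        MSG = TPS.substr(0, 10);
        if (MSG == "TPS Log is")
        {
            cout << line << '\n'; // '\n' instead of endl: no flush per line
            continue;
        }
        // Strip the trailing " TPS" so only the number remains
        string::size_type index = TPS.rfind("TPS");
        if (index != string::npos && index > 0)
            TPS = TPS.substr(0, index - 1);
        tps_number = atoi(TPS.c_str());
        if (INIT == 1)
        {
            PREV_TIME = TIME;
            INIT = 0;
        }
        if (PREV_TIME == TIME)
        {
            // Same second: keep accumulating
            SUM = SUM + tps_number;
        }
        else
        {
            // New second: print the previous total and start over
            cout << PREV_TIME << " " << SUM << " " << "TPS\n";
            SUM = tps_number;
        }
        PREV_TIME = TIME;
    }
    // Print the total for the last second
    cout << PREV_TIME << " " << SUM << " " << "TPS\n";
    return 0;
}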