MIPS instruction optimization: __inline__ has no effect

Example program (lock.h):

static __inline__ int
tas(volatile slock_t *lock)
{
	register volatile slock_t *_l = lock;
	register int _res;
	register int _tmp;

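	__asm__ __volatile__(
		"       .set push           \n"	/* save the assembler settings */
		"       .set mips2          \n"	/* ll/sc need at least MIPS II */
		"       .set noreorder      \n"	/* keep the instructions in this order */
		"       .set nomacro        \n"	/* no multi-instruction macro expansions */
		"       ll      %0, %2      \n"	/* _res = *lock (load-linked) */
		"       or      %1, %0, 1   \n"	/* _tmp = _res | 1 */
		"       sc      %1, %2      \n"	/* try to store _tmp; _tmp = 1 on success, 0 on failure */
		"       xori    %1, 1       \n"	/* _tmp = 1 if the store failed */
		"       or      %0, %0, %1  \n"	/* _res != 0 if the lock was held or the store failed */
		"       sync                \n"	/* memory barrier */
		"       .set pop              "	/* restore the assembler settings */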
:		"=&r" (_res), "=&r" (_tmp), "+R" (*_l)
:		/* no inputs */
:		"memory");
	return _res;
}

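Taken by itself, this code leaves no room for further optimization. The MIPS architecture implements atomic operations with the ll/sc (load-linked / store-conditional) instruction pair.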
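For readers without a MIPS toolchain at hand, the same test-and-set idea can be sketched with GCC's __atomic builtins. This is only an illustration, not the code from lock.h: the names tas_builtin, spin_lock and spin_unlock are hypothetical, and the slock_t typedef is an assumption. One difference worth noting is that the builtin retries internally until the exchange succeeds, whereas the assembly version above simply returns nonzero when sc fails.

```c
#include <assert.h>

typedef unsigned int slock_t;	/* assumption: the real slock_t in lock.h may differ */

/*
 * Hypothetical portable counterpart of tas(): returns 0 if the lock was
 * acquired, nonzero if it was already held.
 */
static inline int
tas_builtin(volatile slock_t *lock)
{
	/* atomically store 1 and return the previous value */
	return (int) __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE);
}

static inline void
spin_lock(volatile slock_t *lock)
{
	while (tas_builtin(lock) != 0)
		;	/* busy-wait until the previous value was 0 */
}

static inline void
spin_unlock(volatile slock_t *lock)
{
	assert(*lock != 0);	/* sanity check: only release a held lock */
	__atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}
```

On a MIPS target GCC is expected to expand the exchange into an ll/sc loop of its own, so the hand-written assembly mainly buys control over the exact instruction sequence.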

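However, GCC does not optimize by default (its default level is -O0, not -O1), and without optimization it does not expand functions declared __inline__. Every use of tas therefore remains a real function call, which costs extra instructions. Compiling with optimization enabled, for example with “-O3”, lets __inline__ take effect and reduces the number of instructions executed at run time. If tas is called from many places, though, “-O3” may bloat the code, because it also turns on -finline-functions and other optimizations that trade size for speed, and larger code tends to lower the instruction-cache hit rate while the program runs.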

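So we need to assess, across the whole software project, how often tas is called and how much code is involved before deciding whether to use the “-O3” option.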
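If raising the optimization level for the whole project is too blunt a tool, one option worth evaluating is GCC's always_inline function attribute, which requests inlining for a single function even when optimization is off. The sketch below is an assumption-laden illustration rather than a change to lock.h: FORCE_INLINE, tas_forced and try_lock are hypothetical names, the slock_t typedef is assumed, and the body is a portable stand-in for the ll/sc assembly.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int slock_t;	/* assumption: the real slock_t in lock.h may differ */

/* hypothetical helper macro: ask GCC to inline regardless of the -O level */
#define FORCE_INLINE static __inline__ __attribute__((always_inline))

FORCE_INLINE int
tas_forced(volatile slock_t *lock)
{
	/* stand-in body; on MIPS the ll/sc assembly from lock.h would go here */
	return (int) __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE);
}

/* example caller: returns 1 if the lock was acquired, 0 if it was already held */
int
try_lock(volatile slock_t *lock)
{
	assert(lock != NULL);
	return tas_forced(lock) == 0;
}
```

To see whether the call really disappeared, the generated assembly can be inspected (for example by compiling with gcc -S) and searched for a call to tas_forced inside try_lock. The code-size concern from above still applies: every call site now carries its own copy of the body.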

Appendix: how the GCC manual defines -O2, -O3 and -finline-functions:

-O2
  Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to ‘-O’, this option increases both compilation time and the performance of the generated code.
  ‘-O2’ turns on all optimization flags specified by ‘-O’. It also turns on the
  following optimization flags:
  -fthread-jumps
  -falign-functions -falign-jumps
  -falign-loops -falign-labels
  -fcaller-saves
  -fcrossjumping
  -fcse-follow-jumps -fcse-skip-blocks
  -fdelete-null-pointer-checks
  -fdevirtualize -fdevirtualize-speculatively
  -fexpensive-optimizations
  -fgcse -fgcse-lm
  -fhoist-adjacent-loads
  -finline-small-functions
  -findirect-inlining
  -fipa-sra
  -fisolate-erroneous-paths-dereference
  -foptimize-sibling-calls
  -fpartial-inlining
  -fpeephole2
  -freorder-blocks -freorder-functions
  -frerun-cse-after-loop
  -fsched-interblock -fsched-spec
  -fschedule-insns -fschedule-insns2
  -fstrict-aliasing -fstrict-overflow
  -ftree-switch-conversion -ftree-tail-merge
  -ftree-pre
  -ftree-vrp

-O3
  Optimize yet more. ‘-O3’ turns on all optimizations specified by ‘-O2’ and also turns on the
  ‘-finline-functions’, ‘-funswitch-loops’, ‘-fpredictive-commoning’, ‘-fgcse-after-reload’,
  ‘-ftree-loop-vectorize’, ‘-ftree-slp-vectorize’, ‘-fvect-cost-model’,
  ‘-ftree-partial-pre’ and ‘-fipa-cp-clone’ options.

-finline-functions
   Consider all functions for inlining, even if they are not declared inline. The compiler heuristically decides which functions are worth integrating in this way. If all calls to a given function are integrated, and the function is declared static, then the function is normally not output as assembler code in its own right.
