The story of TurboFan

In February 2017 Google announced that the Ignition+TurboFan pipeline would be enabled in Chrome 59, three and a half years after Google started the TurboFan project in 2013. Back in late 2013 the TurboFan team was convinced that we needed to address Crankshaft's code generation issues and throw more sophisticated peak performance optimizations at JavaScript code. We based most of these findings on JavaScript code we hit in certain benchmarks like Octane and investigations of asm.js based applications, but also on findings from important web pages like Google Maps. These were considered a reasonable proxy for real-world performance, because they put a lot of pressure on the optimizing compiler. Looking back now, many of the assumptions made at the time were wrong. While a smarter compiler does well on many of the tests in Octane, the reality is that for the vast majority of websites the optimizing compiler doesn't matter much, and can sometimes even hurt performance, since certain kinds of optimizations cost time, especially during page load on mobile devices.

But during the first year of TurboFan development the team wasn't looking at real-world scenarios. The initial goal was to build a whole-language compiler that would also do well on asm.js-like code, two things Crankshaft could not deliver. In Chrome 41 we shipped TurboFan for asm.js code. That initial version of TurboFan already contained a lot of smartness: we basically got to Firefox's level of asm.js performance with a more general approach, and most of the type-based optimizations for fast arithmetic would work equally well on general JavaScript. From a personal point of view, the TurboFan optimizing compiler of that era was probably its most beautiful version, and the only version (of a JavaScript compiler) where I could imagine that a "sea of nodes" approach might make sense (although it was already showing its weaknesses at that time). In the following months we tried to find incremental ways to turn TurboFan into a viable, general drop-in replacement for Crankshaft, but we struggled to find another subset of JavaScript that could be tackled independently, similar to how we had started with asm.js.

By mid-2015 we started to realize that TurboFan might be solving problems we don't actually have, and that we might need to go back to the drawing board to figure out why V8 was struggling in the wild. We weren't really engaging the community at that time, and my personal response when developers brought problems to my attention was often negative and along the lines of "your JavaScript does odd things", which over time turned into "if your code is slow in V8, you wrote slow code" in people's minds. Taking a step back and trying to understand the full picture, I slowly realized that a lot of the suffering arose from performance cliffs in V8. Phrased differently, we were over-focused on the peak performance case, and baseline performance was a blind spot.

This lack of balance leads to highly unpredictable performance. As long as your JavaScript code follows certain patterns (avoid all kinds of performance killers, keep everything monomorphic, limit the number of hot functions), you can squeeze awesome performance out of V8, easily beating Java on similar code. But as soon as you step off this fine line of awesome performance, you often immediately fall off a steep cliff.
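As a concrete illustration of what "keep everything monomorphic" means (a minimal sketch of my own, not an example from the V8 team), consider a property access that either always sees the same object shape or alternates between two shapes; V8's inline caches and optimizing compiler handle the first case much better than the second:

function getX(o) { return o.x; }

// Monomorphic: getX only ever sees objects with the single shape {x}.
for (var i = 0; i < 100000; ++i) getX({ x: i });

// Polymorphic: getX now alternates between the shapes {x} and {x, y},
// so the property access can no longer be specialized to one hidden class.
for (var j = 0; j < 100000; ++j) getX(j % 2 ? { x: j } : { x: j, y: j });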

V8 is full of such cliffs, and in the past performance differences of 100x were not uncommon. Of these cliffs, the arguments object handling in Crankshaft is probably the one that people hit most often, and the most frustrating one too. One of the fundamental assumptions in Crankshaft is that the arguments object does not escape, so Crankshaft never needs to materialize an actual JavaScript arguments object and can instead take the parameters straight from the activation record. In other words, there's no safety net. It's all or nothing. Let's consider this simple dispatching logic:

var callbacks = [
  function sloppy() {},
  function strict() {
    "use strict";
  }
];

function dispatch() {
  for (var l = callbacks.length, i = 0; i < l; ++i) {
    callbacks[i].apply(null, arguments);
  }
}

for (var i = 0; i < 100000; ++i) {
  dispatch(1, 2, 3, 4, 5);
}

Looking at it naively, it seems to follow the rules for the arguments object in Crankshaft: In dispatch we only use arguments together with Function.prototype.apply. Yet, running this simple example.js in node tells us that all optimizations are disabled for dispatch:

$ node --trace-opt example.js
...
[marking 0x353f56bcd659 <JS Function dispatch (SharedFunctionInfo 0x187ffee58fc9)> for optimized recompilation, reason: small function, ICs with typeinfo: 6/7 (85%), generic ICs: 0/7 (0%)]
[compiling method 0x353f56bcd659 <JS Function dispatch (SharedFunctionInfo 0x187ffee58fc9)> using Crankshaft]
[disabled optimization for 0x167a24a58fc9 <SharedFunctionInfo dispatch>, reason: Bad value context for arguments value]

The infamous Bad value context for arguments value reason. So, what is the problem here? Despite the code following the rules for the arguments object, it falls off the performance cliff. The real reason is pretty subtle: Crankshaft can only optimize fn.apply(receiver, arguments) if it knows that fn.apply is Function.prototype.apply, and it only knows that for monomorphic fn.apply property accesses. That is, fn has to have exactly the same hidden class (map in V8 terminology) all the time. But callbacks[0] and callbacks[1] have different maps, since callbacks[0] is a sloppy mode function, whereas callbacks[1] is a strict mode function:

$ cat example2.js
var callbacks = [
  function sloppy() {},
  function strict() { "use strict"; }
];
console.log(%HaveSameMap(callbacks[0], callbacks[1]));
$ node --allow-natives-syntax example2.js
false
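
One way around this particular cliff (a sketch of my own, not a fix proposed in the original post) is to keep the callbacks[i].apply site monomorphic, for example by giving every callback the same mode so that they all share one map. A hypothetical example3.js:

// example3.js: all callbacks are strict mode functions, so callbacks[0] and
// callbacks[1] share the same map and the callbacks[i].apply property access
// stays monomorphic, which is what Crankshaft needs in order to recognize
// Function.prototype.apply together with arguments.
var callbacks = [
  function first() { "use strict"; },
  function second() { "use strict"; }
];

function dispatch() {
  for (var l = callbacks.length, i = 0; i < l; ++i) {
    callbacks[i].apply(null, arguments);
  }
}

for (var i = 0; i < 100000; ++i) {
  dispatch(1, 2, 3, 4, 5);
}

With this variant, %HaveSameMap(callbacks[0], callbacks[1]) should print true and --trace-opt should no longer report the Bad value context for arguments value bailout, though it is worth verifying on your particular V8 version.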

TurboFan on the other hand happily optimizes dispatch (using the latest Node.js LKGR):

$ node --trace-opt --future example.js
[marking 0x20fa7d04cee9 <JS Function dispatch (SharedFunctionInfo 0x27431e85d299)> for optimized recompilation, reason: small function, ICs with typeinfo: 6/6 (100%), generic ICs: 0/6 (0%)]
[compiling method 0x20fa7d04cee9 <JS Function dispatch (SharedFunctionInfo 0x27431e85d299)> using TurboFan]
[optimizing 0x1c22925834d9 <JS Function dispatch (SharedFunctionInfo 0x27431e85d299)> - took 0.526, 0.513, 0.069 ms]
[completed optimizing 0x1c22925834d9 <JS Function dispatch (SharedFunctionInfo 0x27431e85d299)>]
...