I first learned about probing as a direction from Voita's NLP with Friends talk, where I vaguely understood it as a kind of "neural network interpretation". That understanding isn't wrong, but it is only half the story. This week I read several papers along this stream and found that probing has a second purpose: to serve as an evaluation metric for representation learning. That is even more of a black art than NLG evaluation: at least NLG outputs can be graded by hand like essays, whereas whether a representation is learned well, nobody knows. This leads straight to the most uncomfortable question in probing: when a probe achieves high accuracy on a linguistic task using a representation, can we conclude that the representation encodes linguistic structure, or has the probe just learned the task?
Anyway, I still want to quote the definition of probing from [Pimental 2020 Pareto Probing], so that the next time probing comes up I'm not stuck with only a rough idea I can't quite articulate:
We define probing in this work as training a supervised classifier (known as a probe) on top of pretrained models' frozen representations. By analyzing the classifier's performance, one can assess how much 'knowledge' the representations contain about language.
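In code this is about as simple as it sounds. Here is a minimal sketch of my own (a toy example, not anything from the paper): freeze `bert-base-uncased`, take its [CLS] vectors, and fit a logistic-regression probe on a made-up "is this sentence a question?" property.

```python
# Minimal probe sketch: a linear classifier trained on frozen BERT
# representations. The model choice, layer choice, and the toy task
# below are all illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()  # frozen: we never backprop into BERT

def embed(sentences):
    """Return one frozen [CLS] vector per sentence."""
    with torch.no_grad():
        batch = tokenizer(sentences, padding=True, truncation=True,
                          return_tensors="pt")
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, dim)
    return hidden[:, 0, :].numpy()                 # [CLS] token per sentence

# Toy "linguistic property": is the sentence a question? (made-up labels)
train_sents = ["Where is the station ?", "The cat sat on the mat .",
               "Did you see the movie ?", "She reads a book every night ."]
train_labels = [1, 0, 1, 0]

probe = LogisticRegression(max_iter=1000)
probe.fit(embed(train_sents), train_labels)

test_sents = ["Is this a probe ?", "This is a probe ."]
print(probe.predict(embed(test_sents)))
# The probe's accuracy on held-out data is read as how much 'knowledge'
# about the property the frozen representations contain.
```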
From an information-theoretic view, [Pimental 2020 Info-theoretic Probing] sees probing as:
estimating the mutual information between a representation-valued random variable and a linguistic property-valued random variable.
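Spelled out (my paraphrase of the paper's setup), with $T$ the linguistic-property random variable and $R$ the representation random variable:

$$
I(T; R) = H(T) - H(T \mid R),
$$

and since the true $H(T \mid R)$ is unknown, the probe $q_\theta(t \mid r)$ provides an upper bound through its cross-entropy:

$$
H(T \mid R) \;\le\; H_{q_\theta}(T \mid R) = -\mathbb{E}_{(t,r)}\big[\log q_\theta(t \mid r)\big],
$$

so a better probe gives a lower cross-entropy and thus a tighter estimate of the mutual information.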
I'm painstakingly putting this thread together; since I didn't know the area before, the reading was slow and so was the write-up. Hoping that at some future point all the effort spent now turns out to be useful in some magical way!
Finally, [Rogers 2020 A Primer in BERTology] sums up what we have learnt about BERT from the numerous probing works. It has pointers to all kinds of probing / BERTology papers and is quite comprehensive~