[Paper Reading] When Do Language Models Need Retrieval Augmentation

When Do LMs Need Retrieval Augmentation

A curated list of awesome papers about when language models (LMs) need retrieval augmentation. This repository will be continuously updated. If I missed any papers, feel free to open a PR to include them. Any feedback and contributions are welcome!

The idea is to trigger retrieval when the language model cannot provide a correct answer on its own. Accordingly, much of the work focuses on determining whether the model can answer a question correctly.

LMs’ Perception of Their Knowledge Boundaries

These methods focus on determining whether the model can provide a correct answer but do not perform adaptive Retrieval-Augmented Generation (RAG).

White-box Investigation

These methods require access to the full set of model parameters, e.g., to train the model or to use its internal signals.

Training The Language Model
  • [EMNLP 2020, Token-prob-based] Calibration of Pre-trained Transformers Shrey Desai et al. 17 Mar 2020

    Investigates calibration of pre-trained Transformer models in in-domain and out-of-domain (OOD) settings. Findings: 1) pre-trained models are calibrated in-domain; 2) label smoothing is better than temperature scaling in the OOD setting (an ECE sketch follows this list).

  • [TACL 2021, Token-prob-based] How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering Zhengbao Jiang et al. 2 Dec 2020

    1) Investigates calibration (answer: not good) of generative language models (e.g., T5) on QA tasks (OOD settings). 2) Examines the effectiveness of several methods (fine-tuning, post-hoc probability modification, and adjustment of the predicted outputs or inputs).

  • [TMLR 2022] Teaching Models to Express Their Uncertainty in Words Stephanie Lin et al. 28 May 2022

    The first time a model has been shown to express calibrated uncertainty about its own answers in natural language. Introduces the CalibratedMath suite of tasks for testing calibration.

  • [ACL 2023] A Close Look into the Calibration of Pre-trained Language Models Yangyi Chen et al. 31 Oct 2022

    Answers two questions: (1) Do PLMs learn to become calibrated during training? (No.) (2) How effective are existing calibration methods? (Learnable methods significantly reduce PLMs' confidence in wrong predictions.)

  • [NeurIPS 2024] Alignment for Honesty Yuqing Yang et al. 12 Dec 2023

    1) Establishes a precise problem definition and defines "honesty". 2) Introduces a flexible training framework that emphasizes honesty without sacrificing performance on other tasks.
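
Calibration in these papers is commonly measured with the expected calibration error (ECE), i.e., the average gap between a model's stated confidence and its empirical accuracy; a well-calibrated model has an ECE close to zero. A minimal NumPy sketch of the metric (the bin count and toy data are illustrative, not taken from any specific paper):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-weighted average gap between mean confidence and accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        bin_conf = confidences[mask].mean()   # mean predicted confidence in the bin
        bin_acc = correct[mask].mean()        # empirical accuracy in the bin
        ece += mask.mean() * abs(bin_acc - bin_conf)
    return ece

# Toy usage: 5 predictions with confidences and 0/1 correctness labels.
print(expected_calibration_error([0.9, 0.8, 0.6, 0.55, 0.3], [1, 1, 0, 1, 0]))
```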

Utilizing Internal States or Attention Weights

These papers determine whether a statement is true, or whether the model can provide a correct answer, by analyzing the model's internal states or attention weights. This usually involves extracting features with mathematical methods or training a lightweight MLP (multi-layer perceptron) probe, as sketched below.
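
A minimal sketch of the probing idea, not any specific paper's setup: hidden states extracted from a frozen LM (e.g., the last-token state of a chosen layer) are fed to a small MLP trained to predict whether the model's answer is correct. The hidden size, probe width, and data below are placeholders.

```python
import torch
import torch.nn as nn

class CorrectnessProbe(nn.Module):
    """Lightweight MLP mapping a hidden state to P(answer is correct)."""
    def __init__(self, hidden_dim, probe_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, probe_dim),
            nn.ReLU(),
            nn.Linear(probe_dim, 1),
        )

    def forward(self, hidden_state):          # hidden_state: (batch, hidden_dim)
        return torch.sigmoid(self.net(hidden_state)).squeeze(-1)

# Toy training loop on placeholder features: in practice, `features` would be
# hidden states from a frozen LM and `labels` would mark answer correctness.
probe = CorrectnessProbe(hidden_dim=4096)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-4)
loss_fn = nn.BCELoss()
features = torch.randn(32, 4096)              # placeholder hidden states
labels = torch.randint(0, 2, (32,)).float()   # placeholder correctness labels
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(probe(features), labels)
    loss.backward()
    optimizer.step()
```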

Grey-box Investigation

These methods need access to the probabilities of the generated tokens. Some methods in other categories also rely on token probabilities; however, because those papers involve training, they are not placed in this category. A confidence score based on token probabilities is sketched below.
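
A minimal sketch of such a grey-box confidence score, assuming the API exposes per-token log-probabilities of the generated answer (the threshold below is an illustrative placeholder):

```python
import math

def sequence_confidence(token_logprobs):
    """Length-normalized probability of the generated answer."""
    return math.exp(sum(token_logprobs) / max(len(token_logprobs), 1))

def is_confident(token_logprobs, threshold=0.5):
    # If the normalized probability falls below the threshold, treat the
    # answer as unreliable (e.g., as a signal to trigger retrieval).
    return sequence_confidence(token_logprobs) >= threshold

# Toy usage with made-up log-probabilities for a 4-token answer.
print(is_confident([-0.1, -0.2, -0.05, -0.3]))
```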

Black-box Investigation

These methods only require access to the model’s text output.
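
One widely used black-box signal is consistency across sampled answers: if repeated samples agree, the model is more likely to know the answer. A minimal sketch, assuming a hypothetical generate(prompt) function that returns one sampled answer string:

```python
from collections import Counter

def self_consistency_confidence(generate, prompt, n_samples=10):
    """Sample several answers; confidence = share of the most frequent one."""
    answers = [generate(prompt).strip().lower() for _ in range(n_samples)]
    best_answer, count = Counter(answers).most_common(1)[0]
    return best_answer, count / n_samples

# Toy usage with a stub generator standing in for an LM API call.
def fake_generate(prompt):
    return "Paris"

answer, confidence = self_consistency_confidence(fake_generate, "Capital of France?")
print(answer, confidence)   # a low confidence would suggest triggering retrieval
```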

Adaptive RAG

These methods focus directly on the question of when to retrieve, designing strategies and evaluating their effectiveness in Retrieval-Augmented Generation (RAG).
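
A minimal sketch of the overall adaptive-RAG control flow, using hypothetical answer_directly, estimate_confidence, retrieve, and answer_with_context helpers (any of the confidence signals sketched above could back estimate_confidence):

```python
def adaptive_rag(question, answer_directly, estimate_confidence,
                 retrieve, answer_with_context, threshold=0.7):
    """Answer from parametric knowledge when confident; otherwise retrieve first."""
    draft = answer_directly(question)
    if estimate_confidence(question, draft) >= threshold:
        return draft                      # the LM is judged to know the answer
    passages = retrieve(question)         # retrieval triggered only when needed
    return answer_with_context(question, passages)
```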

Future updates will be posted on GitHub: https://github.com/ShiyuNee/Awesome-When-To-Retrieve-Papers
