Into the Unknown: Self-Learning Large Language Models


This article introduces a framework for self-learning large language models (LLMs) that enables an LLM to acquire previously unknown knowledge through self-assessment. It proposes the concept of Points in the Unknown (PiUs) and methods for identifying them, and builds a self-learning loop to fill knowledge gaps. Experiments show that fine-tuned models can self-learn effectively, and this approach may increase trust in AI.

This post is part of the LLM paper series and is a translation of *Into the Unknown: Self-Learning Large Language Models*.

Abstract

We address the main problem of self-learning LLMs: the question of what to learn. We propose a self-learning LLM framework that enables an LLM to independently learn previously unknown knowledge through self-assessment of its own hallucinations. Using hallucination scores, we introduce a new concept of Points in the Unknown (PiUs), along with one extrinsic and three intrinsic methods for automatically identifying PiUs. This facilitates the creation of a self-learning loop that focuses exclusively on the knowledge gaps at PiUs, thereby reducing the hallucination score. We also developed evaluation metrics for gauging an LLM's self-learning capability. Our experiments show that 7B Mistral models that have been fine-tuned or aligned are capable of self-learning well. Our self-learning concept allows more efficient LLM updates and opens new perspectives for knowledge exchange. It may also increase public trust in AI.
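The self-learning loop the abstract describes can be sketched roughly as follows. This is an illustrative toy only: the `ToyModel` class, the `hallucination_score`, `acquire_knowledge`, and `fine_tune` methods, and the threshold value are hypothetical stand-ins for the paper's actual components, not the authors' implementation.

```python
class ToyModel:
    """Toy stand-in for an LLM with a hallucination scorer (illustrative only)."""

    def __init__(self):
        self.known = set()

    def hallucination_score(self, question):
        # 1.0 (fully hallucinating) for unknown questions, 0.0 for known ones.
        return 0.0 if question in self.known else 1.0

    def acquire_knowledge(self, question):
        # Stand-in for knowledge acquisition, e.g. retrieval or a teacher model.
        return f"answer to: {question}"

    def fine_tune(self, pairs):
        # Stand-in for a fine-tuning step on (question, answer) pairs.
        self.known.update(q for q, _ in pairs)


def self_learning_loop(model, questions, threshold=0.5, max_rounds=3):
    """Repeatedly identify Points in the Unknown (PiUs) and patch them."""
    for _ in range(max_rounds):
        # 1. Identify PiUs: questions whose hallucination score is too high.
        pius = [q for q in questions if model.hallucination_score(q) > threshold]
        if not pius:
            break  # no remaining knowledge gaps above the threshold
        # 2. Acquire knowledge for each PiU.
        new_knowledge = [(q, model.acquire_knowledge(q)) for q in pius]
        # 3. Fine-tune on the acquired knowledge to close the gaps.
        model.fine_tune(new_knowledge)
    return model
```

The loop terminates either when no PiUs remain above the threshold or after a fixed number of rounds, mirroring the abstract's goal of driving hallucination scores down only where knowledge gaps exist.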

1 Introduction

2 Related Work

3 Why Self-Learning Is Needed

4 The Concepts of Known and Unknown, and Points in the Unknown (PiUs)

5 Methods for Identifying PiUs

6 Self-Learning LLM

7 Metrics for Measuring Self-Learning Capability

8 Experiments

9 Discussion

10 Conclusion

In this work, we


### Chain-of-Thought Prompting Mechanism in Large Language Models

In large language models, chain-of-thought prompting is a method for enhancing reasoning capabilities by guiding the model through structured thought processes. This approach breaks complex problems into simpler components and provides step-by-step guidance that mirrors human cognitive processing. Constructing these prompts typically involves selecting examples from training datasets, where each example represents part of an overall problem-solving process[^2].

By decomposing tasks into multiple steps, this technique encourages deeper understanding and more accurate predictions than traditional prompting. For instance, on multi-hop question answering or logical deduction challenges, such chains allow models not only to generate correct answers but also to articulate the intermediate thoughts leading to those conclusions. This transparency facilitates better interpretability while improving performance on various NLP benchmarks.

```python
def create_chain_of_thought_prompt(task_description, examples):
    """
    Creates a chain-of-thought prompt from a task description and examples.

    Args:
        task_description (str): Description of the task at hand.
        examples (list): List of (input, output) tuples used for demonstration.

    Returns:
        str: Formatted prompt string with both instructions and sample cases.
    """
    formatted_examples = "\n".join(
        f"Input: {ex[0]}, Output: {ex[1]}" for ex in examples
    )
    return f"""
Task: {task_description}

Examples:
{formatted_examples}

Now try solving similar questions following the above pattern.
"""


# Example usage
examples = [
    ("What color do you get mixing red and blue?", "Purple"),
    ("If it rains tomorrow, will we have our picnic?", "No"),
]
print(create_chain_of_thought_prompt("Solve logic puzzles", examples))
```

UnknownBody
