Instructing Large Language Models to Identify and Ignore Irrelevant Conditions

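This post covers I3C, a method that instructs large language models to identify and ignore irrelevant conditions when solving math word problems, which improves reasoning accuracy. It verifies irrelevant conditions and combines that with the few-shot prompting strategy I3C-Select to improve performance on multiple datasets.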

This post is part of the LLM series and is a translation of "Instructing Large Language Models to Identify and Ignore Irrelevant Conditions".


Abstract

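Math word problem (MWP) solving requires generating a reasoning path from a given problem description, which often contains irrelevant conditions. Existing chain-of-thought (CoT) prompting methods elicit the multi-step reasoning abilities of large language models (LLMs) to solve MWPs. However, they are seriously confused by irrelevant conditions, which results in low accuracy. In this paper, we propose a novel approach named I3C that instructs LLMs to identify and ignore irrelevant conditions. It first identifies a set of irrelevant condition candidates that have weak semantic relevance to the question. Then it prompts LLMs to verify the irrelevant conditions. Lastly, it instructs the LLMs with the verification of relevant and irrelevant conditions to avoid confusion and improve the reasoning paths. Moreover, we propose selecting (problem, reasoning path) pairs as demonstrations to enhance I3C with few-shot reasoning. We develop I3C-Select, which selects the most confusing problems based on the semantic relevance measurement. We conduct extensive experiments on eight MWP datasets. I3C can be combined with any CoT prompting method to improve the performance of solving MWPs. Notably, with GPT-3.5-Turbo and I3C-Select, we achieve an accuracy of 96.0 and 94.1 on GSM-IC2-1K and GSM-ICM-1K, respectively, significantly outperforming the state-of-the-art few-shot prompting method Complex-CoT by +11.7 and +11.1. Our implementation is publicly available at https://wzy6642.github.io/I3C.github.io/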
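The three steps in the abstract (find weakly relevant candidates, verify them with the LLM, then instruct the LLM with the verification results) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The bag-of-words cosine similarity is a toy stand-in for a real sentence encoder, and the threshold value, the prompt wording, and the names `irrelevant_candidates`, `verification_prompt`, `i3c_prompt`, and `ask_llm` are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (toy stand-in for a sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def irrelevant_candidates(conditions, question, threshold=0.2):
    """Step 1: flag conditions whose relevance to the question is below a threshold.

    The threshold is an assumed value chosen for this toy similarity measure.
    """
    q = bow(question)
    return [c for c in conditions if cosine(bow(c), q) < threshold]


def verification_prompt(condition, question):
    """Step 2: build a prompt asking the LLM to verify one candidate (hypothetical wording)."""
    return (f'Question: "{question}"\n'
            f'Condition: "{condition}"\n'
            "Is this condition relevant to answering the question? Explain briefly.")


def i3c_prompt(conditions, question, ask_llm, threshold=0.2):
    """Step 3: attach the verification results to the problem as an instruction.

    `ask_llm` is any callable mapping a prompt string to a response string.
    """
    candidates = irrelevant_candidates(conditions, question, threshold)
    notes = [f"- {c} -> {ask_llm(verification_prompt(c, question))}" for c in candidates]
    problem = " ".join(conditions) + " " + question
    return (problem + "\nVerification of possibly irrelevant conditions:\n" + "\n".join(notes)
            + "\nIgnore conditions judged irrelevant. Let's think step by step.")


# Toy example with a stubbed LLM that always answers "irrelevant".
conditions = ["Tom has 5 apples.", "Tom's sister is 12 years old.", "Tom buys 3 more apples."]
question = "How many apples does Tom have now?"
candidates = irrelevant_candidates(conditions, question)
prompt = i3c_prompt(conditions, question, ask_llm=lambda p: "irrelevant")
print(candidates)
print(prompt)
```

In practice the stub would be replaced by a call to an actual model, and the resulting prompt could be appended to whichever CoT prompting method is in use, which matches the abstract's claim that I3C combines with any CoT prompting method.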

1 Introduction

2 Related Work

3 Proposed Method

4 Experiments
