A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models

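This article introduces CIF-Bench, a benchmark for evaluating the zero-shot generalization of large language models (LLMs) on Chinese tasks. It comprises 150 tasks and 15,000 input-output pairs spanning 20 categories, and aims to expose the limitations of LLMs on Chinese and complex reasoning tasks. In the experiments, the best model scores only 52.9%, which shows how much difficulty LLMs face in unfamiliar language settings. CIF-Bench is intended to drive the development of more adaptable, culturally sensitive, and linguistically diverse LLMs.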

This article is part of a series on LLMs and is a translation of the paper "CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models".

CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models

Abstract

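The advancement of large language models (LLMs) has enhanced their ability to generalize across a wide range of unseen natural language processing (NLP) tasks through instruction following. Yet their effectiveness often diminishes in low-resource languages such as Chinese, a problem exacerbated by biased evaluations resulting from data leakage, which casts doubt on their true generalizability to new linguistic territories. In response, we introduce the Chinese Instruction-Following Benchmark (CIF-Bench), designed to evaluate the zero-shot generalizability of LLMs to the Chinese language. CIF-Bench comprises 150 tasks and 15,000 input-output pairs, developed by native speakers to test complex reasoning and Chinese cultural nuances across 20 categories. To mitigate evaluation bias, we release only half of the dataset publicly and keep the remainder private, and we introduce diversified instructions to minimize score variance, for a total of 45,000 data instances. Our evaluation of 28 selected LLMs reveals a noticeable performance gap, with the best model scoring only 52.9%, which highlights the limitations of LLMs in less familiar language and task contexts. This work aims to uncover the current limitations of LLMs in handling Chinese tasks and, with the released data and benchmark, to push toward the development of more culturally informed and linguistically diverse models.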
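The counts in the abstract imply some simple arithmetic: 15,000 pairs over 150 tasks averages 100 pairs per task, and 45,000 instances over 15,000 pairs averages three instruction variants per pair. The Python sketch below illustrates, under those assumptions, how scoring each pair under several instruction phrasings and averaging the results could smooth out phrasing-dependent score differences. The function names, the toy scores, and the even split across tasks and pairs are hypothetical and chosen for illustration; this is not the paper's evaluation code.

```python
from statistics import mean, pstdev

# Counts quoted in the abstract.
NUM_TASKS = 150
NUM_PAIRS = 15_000
NUM_INSTANCES = 45_000

# Averages implied by those counts (an even split is an assumption).
pairs_per_task = NUM_PAIRS // NUM_TASKS
variants_per_pair = NUM_INSTANCES // NUM_PAIRS


def averaged_score(scores_by_variant):
    """Average one pair's scores across its instruction variants."""
    return mean(scores_by_variant)


def benchmark_score(per_pair_variant_scores):
    """Mean over all pairs of the per-pair averaged score."""
    return mean(averaged_score(s) for s in per_pair_variant_scores)


# Hypothetical scores in [0, 1]: each inner list holds one pair's score
# under three different phrasings of the same task instruction.
toy_scores = [
    [0.50, 0.75, 0.25],
    [1.00, 0.50, 0.75],
    [0.25, 0.25, 0.50],
]

# Scoring with a single phrasing yields a different total per phrasing.
single_variant_totals = [mean(column) for column in zip(*toy_scores)]

# Averaging over phrasings yields one number that is less phrasing-dependent.
overall = benchmark_score(toy_scores)
spread = pstdev(single_variant_totals)

print(f"pairs per task: {pairs_per_task}, variants per pair: {variants_per_pair}")
print(f"single-phrasing totals: {single_variant_totals}")
print(f"averaged score: {overall:.4f}, spread across phrasings: {spread:.4f}")
```

In this toy data the single-phrasing totals differ from one another, which is the kind of variance that averaging over diversified instructions is meant to reduce.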
