Facilitating Pornographic Text Detection for Open Domain Dialogue Systems via Knowledge Distillation

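This paper introduces CENSORCHAT, a dataset for detecting pornographic content in open-domain dialogue. The data is annotated through knowledge distillation of large language models: ChatGPT fills in labels left empty by the first annotation pass, and GPT-4 calibrates the labels of the validation and test sets, which yields reliable text classifiers. The approach is cost-efficient and improves the accuracy and reliability of the detectors.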

This post is part of a series on LLMs and is a translation of "Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models".

Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models

Abstract

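In open-domain dialogue systems, pornographic content appearing in human-machine interaction dialogues can cause severe side effects for users. Yet detecting pornographic language in such dialogues is an important topic that has received little study. To advance in this direction, we introduce CENSORCHAT, a dialogue monitoring dataset aimed at detecting whether a dialogue session contains pornographic content. To this end, we collect real-life human-machine interaction dialogues in the wild and break them down into single utterances and single-turn dialogues, with the last utterance spoken by the chatbot. We propose using knowledge distillation of large language models to annotate the dataset. Specifically, first, the raw dataset is annotated by four open-source large language models, with the majority vote determining the label. Second, we use ChatGPT to update the empty labels from the first step. Third, to ensure the quality of the validation and test sets, we use GPT-4 for label calibration. If the current label does not match the one generated by GPT-4, we apply a self-criticism strategy to verify its correctness. Finally, to facilitate the detection of pornographic text, we develop a series of text classifiers using the pseudo-labeled dataset. Detailed data analysis shows that combining knowledge distillation techniques with large language models provides a practical and cost-efficient way to develop pornographic text detectors.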
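The three annotation steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `majority_vote`, `annotate`, and `calibrate` are hypothetical, the annotators are plain callables standing in for real model calls, and it assumes that an "empty label" means the four open-source models produced no strict majority (a tie, or no usable answers). The abstract does not spell out that definition or how the self-criticism prompt works.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Assumed label encoding: 1 = pornographic, 0 = not pornographic,
# None = the annotator gave no usable answer.
Label = Optional[int]


def majority_vote(votes: Sequence[Label]) -> Label:
    """Return the strict-majority label, or None when the vote is empty or tied."""
    valid = [v for v in votes if v is not None]
    if not valid:
        return None
    ranked = Counter(valid).most_common()
    # A tie between the two most frequent labels counts as an empty label
    # (an assumption made for this sketch).
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def annotate(
    text: str,
    annotators: Sequence[Callable[[str], Label]],
    fallback: Callable[[str], Label],
) -> Label:
    """Steps 1 and 2: majority vote, then a fallback model fills empty labels."""
    label = majority_vote([annotator(text) for annotator in annotators])
    if label is None:
        label = fallback(text)  # plays the role ChatGPT has in the abstract
    return label


def calibrate(
    text: str,
    label: Label,
    strong_annotator: Callable[[str], Label],
    critic: Callable[[str, Label, Label], Label],
) -> Label:
    """Step 3: keep the label if the stronger model agrees, else ask a critic."""
    strong_label = strong_annotator(text)  # plays the role of GPT-4
    if strong_label == label:
        return label
    # Hypothetical stand-in for the self-criticism step on a mismatch.
    return critic(text, label, strong_label)
```

In this sketch, calibration would be applied only to the validation and test splits, as the abstract describes, while the training split would keep the labels produced by `annotate`.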

1 Introduction

2 Related Work

3 Data Collection

4 Method

5 Experiments
