[Paper Walkthrough] Mining Dual Emotion for Fake News Detection

o(*￣︶￣*)o__小肉松

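License: CC 4.0 BY-SA

Categories: Natural Language Processing, Deep Learning. Tags: Artificial Intelligence, Natural Language Processing

Article link: https://blog.youkuaiyun.com/made_in_china_too/article/details/117897454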

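This post discusses a model that combines Publisher Emotion and Social Emotion for fake news detection. Emotion features are extracted from the news text and from its comments, with BiGRU or BERT as the base classifier. The reported experiments show that adding these dual emotion features significantly improves classification performance. The features include emotion lexicon scores, emotional intensity, and sentiment polarity. Comment emotions are aggregated with mean pooling and max pooling, and an Emotion Gap is computed between the publisher's emotion and the social emotion. Finally, these features are used to augment the classifier's features and improve prediction accuracy.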

Main content of this post:

This post walks through the following paper:
Xueyao Zhang, Juan Cao, Xirong Li, Qiang Sheng, Lei Zhong, and Kai Shu. 2021. Mining Dual Emotion for Fake News Detection. In Proceedings of the Web Conference 2021 (WWW '21). Association for Computing Machinery, New York, NY, USA, 3465–3476. DOI:https://doi.org/10.1145/3442381.3450004
Code released by the paper's authors:
https://github.com/RMSnow/WWW2021
Main task of the model
Given a news item $T$, a label $y$ stating whether the news is real or fake, and the comments on that news $M=[M_1,M_2,...,M_i,...,M_{L_M}]$, the goal is to design a model that predicts $\hat{y}$ from $T$ and $M$ such that $\hat{y}$ agrees with $y$.

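Overall Idea of the Model

Most existing models attend only to the Publisher Emotion expressed in the news text and ignore the Social Emotion expressed in the comments on that news. This paper extracts emotion features separately from the news text and from the comments, and adds both kinds of features to existing fake news detectors (BiGRU, BERT). The experiments reported in the paper show that adding these two kinds of features significantly improves classification performance.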

Publisher Emotion

Given a text $T=[ t_1, t_2,...,t_i,...,t_L ]$ of $L$ words and an emotion classifier $f(\cdot)$,

the predicted class probabilities are $emo_T^{cate} \in R^{d_f}$, where $d_f$ is the number of emotion categories:

$emo_T^{cate}=f(T) \tag{1}$

Suppose the emotion lexicon covers $d_e$ emotions, written $E=\{ e_1,e_2,...,e_{d_e} \}$. Each emotion $e$ has its own dictionary $\epsilon_e=\{ w_{e,1},w_{e,2},...,w_{e,L_e} \}$.

Matching each word $t_i$ of the text $T$ against the dictionary $\epsilon_e$ of each emotion gives the score $s(t_i,e)$ of every word under every emotion:

$\begin{cases} e_1:[s(t_1,e_1),s(t_2,e_1),...,s(t_L,e_1)] \\ e_2:[s(t_1,e_2),s(t_2,e_2),...,s(t_L,e_2)] \\ \cdot \\ \cdot \\ \cdot \\ e_{d_e}:[s(t_1,e_{d_e}),s(t_2,e_{d_e}),...,s(t_L,e_{d_e})] \end{cases} \tag{2}$

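For a given emotion $e_j$, summing the scores of all words $t_i$ in the text $T$ gives the text-level score $s(T,e_j)$ of $T$ under that emotion:
$\begin{cases} e_1:s(T,e_1)\\ e_2:s(T,e_2)\\ \cdot \\ \cdot \\ \cdot \\ e_{d_e}:s(T,e_{d_e}) \end{cases} \tag{3}$

Concatenating the scores $s(T,e_j)$ over all emotions gives a $d_e$-dimensional vector that captures how strongly the text $T$ leans toward each emotion: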

$emo_T^{lex}=s(T,e_1) \oplus s(T,e_2) \oplus \cdots \oplus s(T,e_{d_e})$
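The lexicon scoring in equations (2) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy `LEXICON` and the function names are hypothetical, and the word-level score $s(t_i,e)$ is assumed to be 1 when the word appears in the emotion's dictionary and 0 otherwise. The post does not spell out how the paper defines $s(t_i,e)$, and the real definition may be more elaborate.

```python
# Hypothetical toy lexicon: emotion -> set of words (not taken from the paper).
LEXICON = {
    "happy": {"joyful", "ecstatic", "glad"},
    "angry": {"furious", "outraged"},
    "sad": {"gloomy", "heartbroken"},
}


def word_score(word, emotion, lexicon=LEXICON):
    """s(t_i, e): assumed to be 1.0 if the word is in the emotion's dictionary, else 0.0."""
    return 1.0 if word.lower() in lexicon[emotion] else 0.0


def emo_lex(tokens, lexicon=LEXICON):
    """Return [s(T, e_1), ..., s(T, e_{d_e})]; each entry sums the word-level scores."""
    return [sum(word_score(t, e, lexicon) for t in tokens) for e in lexicon]


tokens = "I am joyful and ecstatic but the crowd is furious".split()
print(emo_lex(tokens))  # [2.0, 1.0, 0.0] for (happy, angry, sad)
```

The order of the entries follows the insertion order of the dictionary keys, so the same lexicon must be used for every text to keep the feature positions aligned.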

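For example, for the emotion happy, the word ecstatic is stronger than the word joyful, so the emotional intensity $int(t_i)$ of each word, which can be obtained from the emotion lexicon, also needs to be taken into account. The intensity-aware text-level scores are: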
$s'(T,e)=\sum\limits_{i=1}^{L} s'(t_i,e) = \sum\limits_{i=1}^{L} int(t_i)s(t_i,e), \forall e \in E$

Concatenating the intensity-aware text-level scores of the text $T$ over the $d_e$ emotions gives the emotional intensity feature $emo_T^{int} \in R^{d_e}$:
$emo_T^{int}=s'(T,e_1) \oplus s'(T,e_2) \oplus \cdots \oplus s'(T,e_{d_e})$
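A minimal sketch of the intensity-aware score $s'(T,e)=\sum_i int(t_i)\,s(t_i,e)$ follows. The lexicon, its intensity values, and the function name are hypothetical. It also assumes that the intensity of a word is stored alongside the word in each emotion's dictionary and that $s(t_i,e)$ is a 0/1 match; neither detail is confirmed by the post.

```python
# Hypothetical lexicon with intensities: emotion -> {word: intensity}.
INTENSITY_LEXICON = {
    "happy": {"joyful": 0.5, "ecstatic": 1.0},
    "angry": {"annoyed": 0.25, "furious": 1.0},
}


def emo_int(tokens, lexicon=INTENSITY_LEXICON):
    """Return [s'(T, e_1), ..., s'(T, e_{d_e})]."""
    features = []
    for emotion, words in lexicon.items():
        total = 0.0
        for token in tokens:
            word = token.lower()
            score = 1.0 if word in words else 0.0   # s(t_i, e), assumed 0/1
            intensity = words.get(word, 0.0)        # int(t_i), read from the lexicon
            total += intensity * score
        features.append(total)
    return features


print(emo_int(["joyful", "ecstatic", "furious"]))  # [1.5, 1.0]
```

With these toy values, "ecstatic" contributes twice as much to happy as "joyful", which is the effect the intensity weighting is meant to capture.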

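Besides the emotion-level features above, a sentiment dictionary or public toolkits can provide a coarse-grained sentiment score, i.e. a positive-versus-negative polarity $emo_T^{senti} \in R^{d_s}$, where typically $d_s=1$.

Emoticons, punctuation marks, uppercase letters and similar cues can also express emotion. Assuming there are $d_a$ such features, the paper derives the auxiliary features $emo_T^{aux} \in R^{d_a}$ from the following table: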

(Table of auxiliary features; the image is not available here.)
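Since the table itself is not reproduced here, the following sketch only illustrates the kind of surface cues the text mentions (emoticons, punctuation, uppercase letters). The four features, the emoticon pattern, and the function name are all assumptions made for illustration; they are not the paper's actual feature list.

```python
import re

# Hypothetical, deliberately tiny emoticon pattern such as :) ;-( :D
EMOTICON_PATTERN = re.compile(r"[:;]-?[)(DP]")


def emo_aux(text):
    """Return a small vector of assumed auxiliary features (d_a = 4 here)."""
    n_chars = max(len(text), 1)
    letters = [c for c in text if c.isalpha()]
    return [
        float(len(EMOTICON_PATTERN.findall(text))),                 # emoticon count
        text.count("!") / n_chars,                                  # exclamation mark frequency
        text.count("?") / n_chars,                                  # question mark frequency
        sum(c.isupper() for c in letters) / max(len(letters), 1),   # uppercase letter ratio
    ]


print(emo_aux("WOW!! :)"))  # [1.0, 0.25, 0.0, 1.0]
```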

Finally, the Publisher Emotion of the text $T$ is $emo_T \in R^{d_f+2d_e+d_s+d_a}$:

$emo_T=emo_T^{cate} \oplus emo_T^{lex} \oplus emo_T^{int} \oplus emo_T^{senti} \oplus emo_T^{aux}$
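The concatenation and its dimension $d_f+2d_e+d_s+d_a$ can be checked with a short sketch. The function name and all numeric values below are placeholders chosen for illustration, not outputs of any real classifier or lexicon.

```python
def publisher_emotion(cate, lex, intensity, senti, aux):
    """Concatenate the five feature groups into emo_T."""
    if len(lex) != len(intensity):
        raise ValueError("emo_lex and emo_int must both have d_e entries")
    return list(cate) + list(lex) + list(intensity) + list(senti) + list(aux)


emo_T = publisher_emotion(
    cate=[0.7, 0.2, 0.1],   # d_f = 3 class probabilities (placeholder values)
    lex=[2.0, 1.0],         # d_e = 2 lexicon scores
    intensity=[1.5, 1.0],   # d_e = 2 intensity-aware scores
    senti=[0.4],            # d_s = 1 polarity score
    aux=[1.0, 0.25],        # d_a = 2 auxiliary features
)
print(len(emo_T))  # 3 + 2*2 + 1 + 2 = 10
```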

Social Emotion

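Suppose a news item has $L_M$ comments, $M=[M_1,M_2,...,M_i,...,M_{L_M}]$. Applying the procedure above to each comment gives one $d$-dimensional emotion feature per comment, where $d=d_f+2d_e+d_s+d_a$. Stacking the features of the $L_M$ comments yields $\overline{emo_M} \in R^{L_M \times d}$: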

$\overline{emo_M} = emo^T_{M_1} \oplus emo^T_{M_2} \oplus \cdots \oplus emo^T_{M_{L_M}}$

Mean pooling over the comments gives the average emotional signal $emo_M^{mean} \in R^d$, and max pooling gives the extreme emotional signal $emo_M^{max} \in R^d$:

$emo_M^{mean} = mean(\overline{emo_M}), \qquad emo_M^{max} = max(\overline{emo_M})$

Finally, the Social Emotion is expressed as $emo_M \in R^{2d}$:
$emo_M = emo_M^{mean} \oplus emo_M^{max}$
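The two pooling steps can be sketched as column-wise operations over the per-comment feature vectors. The function name and the example values are illustrative. How a news item without any comments should be handled is not covered in this post, so this sketch simply rejects that case.

```python
def social_emotion(comment_features):
    """comment_features: list of L_M vectors, each of length d. Returns emo_M (length 2d)."""
    if not comment_features:
        raise ValueError("at least one comment is required in this sketch")
    columns = list(zip(*comment_features))                 # d columns of L_M values each
    emo_mean = [sum(col) / len(col) for col in columns]    # average signal
    emo_max = [max(col) for col in columns]                # extreme signal
    return emo_mean + emo_max


comments = [
    [1.0, 0.0, 2.0],   # emotion feature of comment M_1 (d = 3)
    [3.0, 0.0, 0.0],   # emotion feature of comment M_2
]
print(social_emotion(comments))  # [2.0, 0.0, 1.0, 3.0, 0.0, 2.0]
```

Pooling over the comment axis makes the output size independent of $L_M$, so news items with different numbers of comments produce features of the same length.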

Emotion Gap

To highlight the difference between Publisher Emotion and Social Emotion, compute the feature $emo^{gap} \in R^{2d}$:
$emo^{gap} = (emo_T-emo_M^{mean}) \oplus (emo_T-emo_M^{max})$

Dual Emotion Features

$emo^{dual} = emo_T \oplus emo_M \oplus emo^{gap}$
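Putting the pieces together, $emo_T$ has $d$ entries, $emo_M$ has $2d$, and $emo^{gap}$ has $2d$, so $emo^{dual}$ has $5d$ entries. A minimal sketch, with a hypothetical function name and toy inputs for $d=2$:

```python
def dual_emotion(emo_T, emo_M_mean, emo_M_max):
    """Build emo^dual = emo_T (+) emo_M (+) emo^gap from three d-dimensional inputs."""
    d = len(emo_T)
    if len(emo_M_mean) != d or len(emo_M_max) != d:
        raise ValueError("all three inputs must have the same dimension d")
    emo_M = list(emo_M_mean) + list(emo_M_max)                 # 2d entries
    gap_mean = [t - m for t, m in zip(emo_T, emo_M_mean)]      # emo_T - emo_M^mean
    gap_max = [t - m for t, m in zip(emo_T, emo_M_max)]        # emo_T - emo_M^max
    emo_gap = gap_mean + gap_max                               # 2d entries
    return list(emo_T) + emo_M + emo_gap                       # 5d entries


emo_dual = dual_emotion([2.0, 1.0], [1.0, 1.0], [3.0, 1.5])
print(emo_dual)
# [2.0, 1.0, 1.0, 1.0, 3.0, 1.5, 1.0, 0.0, -1.0, -0.5]
```

A positive gap entry means the publisher expresses more of that feature than the comments do, and a negative entry means the opposite.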

Fake News Detection

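Suppose BiGRU is the fake news classifier and $BiGRU_T$ is the feature it outputs for the text $T$. $BiGRU_T$ is concatenated with $emo^{dual}$ and passed through an MLP and Softmax for the final classification, giving the prediction $\hat{y}$ of whether the news is real or fake:
$\hat{y} = Softmax(MLP([BiGRU_T,emo^{dual}]))$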
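The classification head can be illustrated with a small pure-Python forward pass. Everything below is a stand-in: the `bigru_T` vector is a placeholder for the real BiGRU output, the weights are random and untrained, and the single hidden layer of size 8 is an arbitrary choice. The authors' repository linked above is the reference for the actual architecture.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    peak = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random, untrained weights; for shape illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(text_features, emo_dual, layers):
    """Softmax(MLP([BiGRU_T, emo^dual])) with one hidden layer."""
    x = list(text_features) + list(emo_dual)  # concatenation [BiGRU_T, emo^dual]
    (w1, b1), (w2, b2) = layers
    hidden = relu(linear(x, w1, b1))
    return softmax(linear(hidden, w2, b2))


rng = random.Random(0)
bigru_T = [0.3, -0.1, 0.8, 0.5]                                    # placeholder text feature
emo_dual = [2.0, 1.0, 1.0, 1.0, 3.0, 1.5, 1.0, 0.0, -1.0, -0.5]   # 5d with d = 2
layers = [
    init_layer(len(bigru_T) + len(emo_dual), 8, rng),
    init_layer(8, 2, rng),
]
y_hat = predict(bigru_T, emo_dual, layers)
print(y_hat)  # two probabilities (real, fake) that sum to 1
```

Because the emotion features enter only through concatenation, the same head works with any text encoder whose output is a fixed-length vector, which matches the post's point that the features can be added to either BiGRU or BERT.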