Sentiment Analysis with SnowNLP (Using Toutiao User Comments as the Data Source)

Latest recommended article published on 2025-03-10 23:03:10

JayDen2001


Article tags: python, natural language processing, sentiment analysis, data visualization

Article link: https://blog.youkuaiyun.com/qq_66183206/article/details/123058233


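SnowNLP is a Python library for processing Chinese text. It was inspired by TextBlob: because most natural language processing libraries target English, its author wrote one that makes Chinese easy to handle. Unlike TextBlob, it does not use NLTK; all of its algorithms are implemented from scratch, and it ships with some pre-trained dictionaries. Note that the library works on Unicode text, so decode your input to Unicode before using it. In Python 3 a `str` is already Unicode, so only raw `bytes` need decoding.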
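
The decoding note above can be sketched as follows. This is a minimal illustration that assumes the raw comments are UTF-8 encoded; `ensure_text` and `positive_probability` are hypothetical helper names introduced here, not part of SnowNLP. The only SnowNLP call used is `SnowNLP(text).sentiments`, the same one the script in this post relies on.

```python
def ensure_text(raw, encoding="utf-8"):
    """Return raw as a str, decoding it first if it arrived as bytes."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)  # assumption: the bytes are UTF-8
    return raw


def positive_probability(raw):
    """Return the probability (0 to 1) that one comment is positive.

    The import is deferred so that ensure_text can be used on its own
    even where SnowNLP is not installed.
    """
    from snownlp import SnowNLP
    return SnowNLP(ensure_text(raw)).sentiments
```

Calling `positive_probability` on each comment gives the same kind of score that the script below collects into `comments_score`.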

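from snownlp import SnowNLP
import pandas as pd
import os

print('Running sentiment analysis on comments...')
filePath = 'C:/Users/JayDen/Desktop/预处理评论数据/'
outPath = 'C:/Users/JayDen/Desktop/情感分析数据/'
for t in os.listdir(filePath):
    # Load the preprocessed comment file; the with-block closes it automatically
    with open(os.path.join(filePath, t), 'r', encoding='utf-8') as txt:
        text = txt.readlines()
    if not text:
        continue  # skip empty files
    # The first line is expected to hold all comments, separated by whitespace.
    # For each comment, compute the probability (0 to 1) that it is positive.
    comments = []
    comments_score = []
    for i in text[0].split():
        comments.append(i)
        comments_score.append(SnowNLP(i).sentiments)
    # Save the results as an .xls sheet: scores in the first column, comments in the second.
    # Note: writing .xls relies on the xlwt engine, which pandas dropped in version 2.0.
    table = pd.DataFrame({'score': comments_score, 'comment': comments})
    table.to_excel(outPath + t[0:-4] + '评论情感分析.xls', sheet_name='result', index=False)
    print('Sentiment analysis finished: ' + t)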

These are the comments about a particular typhoon that I had previously scraped from Toutiao.

import os

import xlrd
import matplotlib.pyplot as plt
from palettable.colorbrewer.qualitative import Pastel1_7

# Use the SimHei font so the Chinese labels render, and keep minus signs displayable
plt.rcParams['font.sans-serif'] = 'simhei'
plt.rcParams['axes.unicode_minus'] = False

filePath = 'C:/Users/JayDen/Desktop/情感分析数据/'
for t in os.listdir(filePath):
    data = xlrd.open_workbook(filePath + t)  # open the Excel file
    table = data.sheets()[0]
    scores = table.col_values(0, 1)  # first column (scores), skipping the header row
    正向评论_positive = 0
    负向评论_negative = 0
    中性评论_neutral = 0
    # Bucket each score: > 0.55 positive, < 0.45 negative, otherwise neutral
    for s in scores:
        if s > 0.55:
            正向评论_positive += 1
        elif s < 0.45:
            负向评论_negative += 1
        else:
            中性评论_neutral += 1
    labels = ['正向评论_positive', '负向评论_negative', '中性评论_neutral']
    counts = [正向评论_positive, 负向评论_negative, 中性评论_neutral]
    # wedgeprops width=0.3 hollows the pie out into a donut
    plt.pie(counts, pctdistance=0.85, autopct='%.1f%%', labels=labels,
            colors=Pastel1_7.hex_colors, wedgeprops=dict(width=0.3, edgecolor='w'))
    plt.legend(labels, loc='upper left')
    name = t[0:-10]  # strip the '评论情感分析.xls' suffix from the file name
    plt.title(name + '--情感分类环形图')
    plt.savefig('C:/Users/JayDen/Desktop/数据可视化图集/评论情感分析环形图/' + name + '--情感分类环形图')
    plt.show()

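The scores computed in the earlier sentiment analysis are sorted into three classes by interval: above 0.55 is positive, below 0.45 is negative, and 0.45 to 0.55 inclusive is neutral. The class counts are then visualized as a donut chart.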
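
The bucketing step can also be pulled out into a small function that is testable without any Excel files or plotting. The following is a sketch using the same 0.45 and 0.55 cut-offs as the script above; the names `classify`, `count_labels`, `NEUTRAL_LOW`, and `NEUTRAL_HIGH` are hypothetical and introduced only for illustration.

```python
from collections import Counter

# Cut-offs taken from the plotting script above
NEUTRAL_LOW = 0.45
NEUTRAL_HIGH = 0.55


def classify(score):
    """Map one sentiment score in [0, 1] to a class label."""
    if score > NEUTRAL_HIGH:
        return "positive"
    if score < NEUTRAL_LOW:
        return "negative"
    return "neutral"  # 0.45 <= score <= 0.55


def count_labels(scores):
    """Count how many scores fall into each class, in a fixed label order."""
    counts = Counter(classify(s) for s in scores)
    return {label: counts[label] for label in ("positive", "negative", "neutral")}
```

The values of the returned dictionary can be passed directly to `plt.pie` in place of the three hand-maintained counters.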