Turning Text into a Word Cloud

License: CC 4.0 BY-SA

Tags: #python #programming-language

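This article walks through using Python's jieba library to segment text and, with the help of cv2, trying to generate a heart-shaped word cloud. The final result did not take the expected heart shape, however, possibly because there were too many words. The article also notes that English words may be given higher priority when word frequencies are counted.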
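To make the word-frequency remark concrete, here is a minimal sketch of counting tagged words while skipping function words and punctuation. It assumes input in the form of (word, flag) pairs like those produced by `jieba.posseg.cut`; the helper name `count_content_words` and the sample pairs are hypothetical and only illustrate the idea, and the skipped flags are the ones filtered out by the code in this post.

```python
from collections import Counter

# Part-of-speech flags to skip (particles, pronouns, punctuation),
# matching the flags filtered out by the code in this post.
SKIP_FLAGS = {"ud", "ug", "uj", "ul", "uv", "uz", "r", "rr", "x"}


def count_content_words(pairs, skip_flags=SKIP_FLAGS):
    """Count how often each word occurs, ignoring words whose flag is skipped."""
    counts = Counter()
    for word, flag in pairs:
        word = word.strip()
        if word and flag not in skip_flags:
            counts[word] += 1
    return counts


# Hypothetical sample; real input would come from jieba.posseg.cut(text).
sample_pairs = [
    ("电影", "n"), ("的", "uj"), ("电影", "n"),
    ("love", "eng"), ("，", "x"), ("喜欢", "v"),
]
freqs = count_content_words(sample_pairs)
print(freqs.most_common(2))
```

A mapping like `freqs` can be handed to `WordCloud.generate_from_frequencies`, which sizes each word by its count instead of re-tokenizing a joined string.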

1. Turn the comments collected last time into a word cloud, using cv2 to try to give it a heart shape.

import numpy as np
import pandas as pd
import jieba
import jieba.posseg
import cv2
from wordcloud import WordCloud

# Load the comments sheet (pd.read_excel already returns a DataFrame)
data_source = pd.read_excel(r'D:\tem\001\comments_lots.xlsx', sheet_name='comments_doubian_真心半解')
text_half_front = data_source.iloc[0:40, 1]  # rows 0-39 of the second column (column index 1)
text_half_next = data_source.iloc[40:80, 1]  # rows 40-79 of the same column
comments_half_front = " ".join(text_half_front)  # join the comments into a single string
comments_half_next = " ".join(text_half_next)

word_half_front = jieba.posseg.cut(comments_half_front)  # segment the text; yields (word, part-of-speech flag) pairs
word_half_next = jieba.posseg.cut(comments_half_next)
word_front = list(word_half_front)  # materialize the generators as lists of pairs
word_next = list(word_half_next)

# Convert the stored pairs to {word: flag} dicts; note that this keeps each distinct word only once
word_front_dict = dict(word_front)
word_next_dict = dict(word_next)
word1_front_lit = list()
word2_next_lit = list()
word_remove = ['ud','ug','uj','ul','uv','uz','r','rr','x',' ']
# Drop words tagged as punctuation, particles such as 的 and 得, and so on, then collect the rest in lists
for key, value in word_front_dict.items():
    if value not in word_remove:
        word1_front_lit.append(key)

for key, value in word_next_dict.items():
    if value not in word_remove:
        word2_next_lit.append(key)
ciyun_front = " ".join(word1_front_lit)  # space-separated words for WordCloud
ciyun_next = " ".join(word2_next_lit)

mask = cv2.imread(r'D:\Builder_tool\Python\heart1.jpeg')  # cv2.imread returns None if the file cannot be read
wc = WordCloud(background_color="white",
               width=2000,   # width and height are ignored when a mask is given
               height=1400,
               max_words=1000,
               max_font_size=100,
               mask=mask,
               font_path='simfang.ttf',  # a font with Chinese glyphs
               ).generate(ciyun_next)
wc.to_file(r'D:\Builder_tool\Python\ciyun_next_digital.jpg')
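
Besides the number of words, the mask itself is another possible reason the heart did not appear, though this has not been checked against the image used in this post. WordCloud only treats pure white (255) pixels as blocked, and a JPEG background is often slightly off-white, so the whole image may count as drawable area. The sketch below shows one way to clean a mask before passing it in. The helper name `prepare_mask`, the threshold of 240, and the tiny demo array are assumptions made for illustration.

```python
import numpy as np


def prepare_mask(image, threshold=240):
    """Return a uint8 mask: 255 where the image is near-white (blocked), 0 elsewhere."""
    image = np.asarray(image)
    if image.ndim == 3:
        # A pixel counts as background only if every colour channel is bright.
        image = image.min(axis=2)
    return np.where(image >= threshold, 255, 0).astype(np.uint8)


# Hypothetical 4x4 "image": a dark 2x2 shape on an almost-white background.
demo = np.full((4, 4), 250, dtype=np.uint8)
demo[1:3, 1:3] = 10
mask = prepare_mask(demo)
print(mask)
```

With a real image, the array returned by `cv2.imread` would be checked for `None` first and then passed through `prepare_mask` before being given to `WordCloud(mask=...)`.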