[Python: compare two lists to find the most frequent words and their frequencies]

Latest recommended article published on 2025-05-20 18:44:14

码上有前

Category: Python. Tags: python, linux, programming languages

Article link: https://blog.youkuaiyun.com/qq_45832651/article/details/136513175

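This article shows how to write a Python function, calculate_probability, that takes a list of given keywords and a second list, computes for each keyword the fraction of entries in the second list that contain it (rounded to two decimals), and returns the three keywords with the highest fractions. The function is then applied to two keyword lists.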

🚀 Author: "码上有前"
🚀 About this article: Python
🚀 Likes 👍, bookmarks ⭐, and comments 💬 are welcome

Python exercise

Full code

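from collections import Counter  # imported in the original post; not used below
# data_keywords is a local helper module that is not shown in this post.
from data_keywords import extract_keywords, extract_keywords_from_json


def calculate_probability(list1, list2):
    """Return the three keywords from list1 with the highest share in list2.

    A keyword's "probability" is the fraction of items in list2 that contain
    it, rounded to two decimals. `in` is a substring test when the items are
    strings and a membership test when they are lists.
    """
    if not list2:
        # Nothing to compare against; this also avoids a ZeroDivisionError.
        return []
    count_dict = {
        value1: round(sum(value1 in value2 for value2 in list2) / len(list2), 2)
        for value1 in list1
    }
    # Sort by probability, highest first, and keep the top three pairs.
    return sorted(count_dict.items(), key=lambda x: x[1], reverse=True)[:3]


# The given keyword lists (promotion labels as they appear in the data)
given_keywords = ['自营', '赠', '满赠', '京东物流', '免邮', '2免1', '2件7.5折', '跨店每满', '券']
category_given_keywords = ['自营', '赠', '满赠', '京东物流', '免邮', '2免1', '2件7.5折', '跨店每满', '券', '包税', '官方立减15%']
folder_path = './Cosmetic_data/Brand_Classification/brand&details_analysis'
categories_path = './Cosmetic_data/Makeup_Classification/pcommit&details_analysis'
keyword_column = '关键词'  # name of the keyword column in the data files

new_keyword_list = extract_keywords(folder_path, keyword_column)
categories_keywords_list = extract_keywords_from_json(categories_path, keyword_column)

result = calculate_probability(given_keywords, new_keyword_list)
# print("Top three keywords and their probabilities:", result)
calculate_result = calculate_probability(category_given_keywords, categories_keywords_list)
# print("Top three keywords and their probabilities:", calculate_result)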
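The code above depends on the data_keywords module and on data files that are not included in the post, so it cannot be run as-is. Below is a minimal, self-contained sketch that repeats the same scoring logic on a small in-memory list. The sample data and the helper top_exact_matches are hypothetical and only for illustration. The helper shows one way Counter could be used if you want exact matches rather than substring matches.

```python
from collections import Counter


def calculate_probability(list1, list2):
    """Same scoring logic as in the post, repeated so this sketch runs alone."""
    if not list2:
        return []
    scores = {
        kw: round(sum(kw in item for item in list2) / len(list2), 2)
        for kw in list1
    }
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)[:3]


def top_exact_matches(list1, list2, n=3):
    """Hypothetical variant: count only items of list2 that equal a keyword."""
    wanted = set(list1)
    counts = Counter(item for item in list2 if item in wanted)
    return counts.most_common(n)


# Hypothetical sample data (not taken from the post's data files).
given = ["gift", "coupon", "free"]
sample = ["gift", "free gift", "coupon", "free shipping", "official store"]

substring_result = calculate_probability(given, sample)
exact_result = top_exact_matches(given, sample)

print(substring_result)  # "gift" and "free" each match 2 of 5 items
print(exact_result)      # only "gift" and "coupon" appear as whole items
```

Note the difference between the two: with string items, `in` matches substrings, so "free" is counted for both "free gift" and "free shipping", while the Counter variant only counts items that are exactly equal to a keyword. Which behavior you want depends on what extract_keywords returns.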