Difference between the word-break values "break-all" and "break-word" in CSS

This article takes a close look at the CSS word-break property, comparing its break-all and break-word values and showing through examples how to prevent text overflow.



Definition:

What is the most fundamental element that comes to mind when you think about building a web page? Words! If that was your answer, give yourself a pat on the back, because you already know what this article is about. The discussion here revolves around one of the most basic aspects of web development: dealing with words. But what exactly about words? CSS controls line breaking inside words with the word-break property, and rather than the property as a whole, the focus here is on two of its values: break-all and break-word.

The goal of this article is to give web developers a better understanding of the word-break property through its values. Learning a property and learning the values it accepts are equally important; as they say, incomplete knowledge is worse than no knowledge at all. So let's look at the difference between two values of the word-break property in CSS: break-all and break-word.

break-all

The first value of the word-break property is break-all. As the name suggests, it allows a line break between any two characters, so a word is simply cut wherever the line runs out of space and the text never overflows its container — even when moving the whole word to the next line would have been enough. Not that tough to understand, right? The implementation is just as easy; have a look at the syntax and the sketch below.

Syntax:

    element {
        word-break: break-all; /* allow a line break between any two characters to prevent overflow */
    }
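
For a clearer picture, here is a minimal sketch (the 180px width, class names, and sample token are illustrative assumptions, not part of the original example): both boxes receive the same long, unbroken token, but only the one with break-all is allowed to split it between arbitrary characters; with the default value the token overflows the box.

    <style>
        /* word-break: normal (the default): the long token overflows the 180px box */
        .normal-box    { width: 180px; border: 1px solid #006969; }
        /* word-break: break-all: the token is cut wherever the line ends */
        .break-all-box { width: 180px; border: 1px solid #006969; word-break: break-all; }
    </style>
    <p class="normal-box">AVeryLongUnbrokenTokenThatCannotFitInTheBox</p>
    <p class="break-all-box">AVeryLongUnbrokenTokenThatCannotFitInTheBox</p>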

break-word

Last but not least, the break-word value also prevents overflow, but it behaves more conservatively. The browser first wraps whole words onto the next line as usual, and only breaks inside a word when that single word is too long to fit in the container on a line of its own. This value is not hard to understand either, so let's move on and look at an equally easy example, plus a note on its modern equivalent right after the syntax.

Syntax:

    element {
        word-break: break-word; /* wrap whole words normally; break inside a word only when it alone cannot fit */
    }
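
A note on modern usage: break-word is a legacy value kept mainly for compatibility; the CSS Text specification describes it as behaving like word-break: normal combined with overflow-wrap: anywhere. The standards-preferred way to get the same effect therefore looks roughly like the sketch below (an equivalent form, not taken from the original article):

    element {
        word-break: normal;      /* default wrapping between words */
        overflow-wrap: anywhere; /* break inside a word only when it would otherwise overflow; 'word-wrap' is the legacy alias of overflow-wrap */
    }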

In the example below, you can see the difference between the two values of the property.

Example:

<!DOCTYPE html>
<html>
<head>
    <style>
        .break-word {
            width: 180px;
            border: 1px solid #006969;
            word-break: break-word;
        }
        
        .break-all {
            width: 180px;
            border: 1px solid #006969;
            word-break: break-all;
        }
    </style>
</head>
<body>
    <h1>The word-break Property</h1>
    <h2> break-word</h2>
    <p class="break-word">This is IncludeHelp.ThisisIncludeHelp.This is IncludeHelp.This is IncludeHelp.</p>
    <h2> break-all</h2>
    <p class="break-all">This is IncludeHelp.ThisisIncludeHelp.This is IncludeHelp.This is IncludeHelp.</p>
</body>
</html>

Output (screenshot): word-break break-word vs. break-all in CSS example

Conclusion:

Looking at the outputs of these two values, you should now have the gist of how they differ. They may sound the same, but they behave quite differently when rendered: with break-all, lines are cut between any two characters, so even words that would fit on the next line can end up split, while with break-word, whole words wrap normally and only a token too wide for the box (such as the long "IncludeHelp.ThisisIncludeHelp..." run) is broken. So go ahead and try it out yourself, and if you face any difficulties, we are always available to help you at https://ask.includehelp.com/.

Translated from: https://www.includehelp.com/code-snippets/difference-between-values-of-word-break-break-all-versus-break-word-in-css.aspx
