
Friday, December 7, 2012

New Loop Vectorizer

I would like to give a brief update regarding the development of the Loop Vectorizer. LLVM now has two vectorizers: the Loop Vectorizer, which operates on loops, and the Basic Block Vectorizer, which optimizes straight-line code. These vectorizers focus on different optimization opportunities and use different techniques. The BB vectorizer merges multiple scalars that are found in the code into vectors, while the Loop Vectorizer widens instructions in the original loop to operate on multiple consecutive loop iterations.
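As a rough illustration, the first function below shows the kind of straight-line code the BB vectorizer targets, while the second shows the kind of loop the Loop Vectorizer widens. Both are minimal hand-written sketches, not code from any benchmark, and the function names are invented for illustration.

// Straight-line code: four independent scalar adds that the
// BB vectorizer can merge into a single vector add.
void bb_example(float *A, float *B) {
  A[0] = B[0] + 1.0f;
  A[1] = B[1] + 1.0f;
  A[2] = B[2] + 1.0f;
  A[3] = B[3] + 1.0f;
}

// A loop whose single scalar add can be widened by the Loop
// Vectorizer to operate on several consecutive iterations at once.
void loop_example(float *A, float *B, int n) {
  for (int i = 0; i < n; ++i)
    A[i] = B[i] + 1.0f;
}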

LLVM’s Loop Vectorizer is now available and will be useful for many people. It is not enabled by default, but can be enabled through clang using the command line flag “-mllvm -vectorize-loops”. We plan to enable the Loop Vectorizer by default as part of the LLVM 3.3 release.
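For example, an invocation along these lines enables it when compiling a single file (the file name and the -O3 level here are only illustrative; the relevant flag is the one quoted above):

    clang -O3 -mllvm -vectorize-loops -S foo.c -o foo.s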

The Loop Vectorizer can boost the performance of many loops, including some loops that are not vectorizable by GCC. In one benchmark, Linpack-pc, the Loop Vectorizer boosts the performance of Gaussian elimination of single precision matrices from 984 MFlops to 2539 MFlops - a 2.6X boost in performance. The vectorizer also boosts the “GCC vectorization examples” benchmark by a geomean of 2.15X.

The LLVM Loop Vectorizer has a number of features that allow it to vectorize complex loops. Most of the features described in this post are available as part of the LLVM 3.2 release, but some features were added after the cutoff date. Here is one small example of a loop that the LLVM Loop Vectorizer can vectorize.

int foo(int *A, int *B, int n) {
  unsigned sum = 0;
  for (int i = 0; i < n; ++i)
    if (A[i] > B[i])
      sum += A[i] + 5;
  return sum;
}

In this example, the Loop Vectorizer uses a number of non-trivial features to vectorize the loop. The ‘sum’ variable is used by consecutive iterations of the loop. Normally, this would prevent vectorization, but the vectorizer can detect that ‘sum’ is a reduction variable. The variable ‘sum’ becomes a vector of integers, and at the end of the loop the elements of the vector are added together to create the correct result. We support a number of different reduction operations, such as multiplication.
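For example, a loop like the following hand-written sketch contains a multiplication reduction that the vectorizer can also handle:

int product(int *A, int n) {
  int prod = 1;
  for (int i = 0; i < n; ++i)
    prod *= A[i];   // 'prod' is a multiplication reduction variable
  return prod;
}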

Another challenge that the Loop Vectorizer needs to overcome is the presence of control flow in the loop. The Loop Vectorizer is able to “flatten” the IF statement in the code and generate a single stream of instructions. Another important feature is the vectorization of loops with an unknown trip count. In this example, ‘n’ may not be a multiple of the vector width, and the vectorizer has to execute the last few iterations as scalar code. Keeping a scalar copy of the loop increases the code size.
The loop above is compiled into the ARMv7s assembly sequence below. Notice that the IF structure is replaced by the “vcgt” and “vbsl” instructions.

LBB0_3:
    vld1.32      {d26, d27}, [r3]
    vadd.i32     q12, q8, q9
    subs         r2, #4
    add.w        r3, r3, #16
    vcgt.s32     q0, q13, q10
    vmla.i32     q12, q13, q11
    vbsl         q0, q12, q8
    vorr         q8, q0, q0
    bne    LBB0_3
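
Conceptually, the combination of the flattened IF, the reduction, and the scalar epilogue gives the loop a structure roughly like the following hand-written C sketch. The real transformation happens on LLVM IR, and the vector width of four shown here is just an example.

int foo_sketch(int *A, int *B, int n) {
  unsigned vsum[4] = {0, 0, 0, 0};   // one partial sum per vector lane
  int i = 0;
  // Vector body: four iterations per trip, with the IF flattened
  // into a compare-and-select that leaves non-matching lanes unchanged.
  for (; i + 4 <= n; i += 4)
    for (int lane = 0; lane < 4; ++lane)
      vsum[lane] = (A[i + lane] > B[i + lane])
                       ? vsum[lane] + A[i + lane] + 5  // selected lane
                       : vsum[lane];                   // lane unchanged
  // Horizontal add of the partial sums to form the reduction result.
  unsigned sum = vsum[0] + vsum[1] + vsum[2] + vsum[3];
  // Scalar epilogue: the last n % 4 iterations run as ordinary code.
  for (; i < n; ++i)
    if (A[i] > B[i])
      sum += A[i] + 5;
  return sum;
}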

In the second example below, the Loop Vectorizer must use two more features in order to vectorize the loop. In the loop below, the iteration start and finish points are unknown, and the Loop Vectorizer has a mechanism to vectorize loops that do not start at zero. This feature is important for loops that are converted from Fortran, because Fortran loops start at 1.
Another major challenge in this loop is memory safety. In our example, if the pointers A and B point to consecutive addresses, then it is illegal to vectorize the code because some elements of A will be written before they are read from array B.

Some programmers use the ‘restrict’ keyword to notify the compiler that the pointers are disjoint, but in our example, the Loop Vectorizer has no way of knowing that the pointers A and B do not alias. The Loop Vectorizer handles this loop by inserting code that checks, at runtime, whether the arrays A and B point to disjoint memory locations. If the arrays A and B overlap, then the scalar version of the loop is executed.

void bar(float *A, float *B, float K, int start, int end) {
  for (int i = start; i < end; ++i)
    A[i] *= B[i] + K;
}

The loop above is compiled into this X86 assembly sequence. Notice the use of the 8-wide YMM registers on systems that support AVX.

LBB1_4:
    vmovups (%rdx), %ymm2
    vaddps  %ymm1, %ymm2, %ymm2
    vmovups (%rax), %ymm3
    vmulps  %ymm2, %ymm3, %ymm2
    vmovups %ymm2, (%rax)
    addq    $32, %rax
    addq    $32, %rdx
    addq    $-8, %r11
    jne LBB1_4
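
Conceptually, the runtime memory check behaves roughly like the following hand-written sketch; the actual check is emitted as LLVM IR rather than C source, and the helper name is invented for illustration.

#include <stdint.h>

// Compare the address ranges touched by A and B at runtime and pick
// either the vectorized loop or the original scalar loop.
void bar_sketch(float *A, float *B, float K, int start, int end) {
  uintptr_t a_lo = (uintptr_t)(A + start), a_hi = (uintptr_t)(A + end);
  uintptr_t b_lo = (uintptr_t)(B + start), b_hi = (uintptr_t)(B + end);
  if (a_hi <= b_lo || b_hi <= a_lo) {
    // Ranges are disjoint: this branch is the one the compiler
    // vectorizes (8 floats per iteration with AVX).
    for (int i = start; i < end; ++i)
      A[i] *= B[i] + K;
  } else {
    // Ranges may overlap: fall back to the original scalar loop.
    for (int i = start; i < end; ++i)
      A[i] *= B[i] + K;
  }
}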

In the last example, we don’t see a loop because it is hidden inside the “accumulate” function of the standard C++ library. This loop uses C++ iterators, which are pointers rather than the integer indices we saw in the previous examples. The Loop Vectorizer detects pointer induction variables and can vectorize this loop. This feature is important because many C++ programs use iterators.

#include <numeric>

int baz(int *A, int n) {
  return std::accumulate(A, A + n, 0);
}

The loop above is compiled into this x86 assembly sequence.

LBB2_8:
    vmovdqu (%rcx,%rdx,4), %xmm1
    vpaddd  %xmm0, %xmm1, %xmm0
    addq    $4, %rdx
    cmpq    %rdx, %rsi
    jne LBB2_8
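
For reference, the accumulate call in ‘baz’ boils down to a loop whose induction variable is a pointer, roughly equivalent to this hand-written version:

int baz_equivalent(int *A, int n) {
  int sum = 0;
  // 'p' is the pointer induction variable the vectorizer must recognize.
  for (int *p = A; p != A + n; ++p)
    sum += *p;
  return sum;
}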

The Loop Vectorizer is a target independent IR-level optimization that depends on target-specific information from the different backends. It needs to select the optimal vector width and to decide if vectorization is worthwhile. Users can force a certain vector width using the command line flag “-mllvm -force-vector-width=X”, where X is the number of vector elements. At the moment, only the X86 backend provides detailed cost information, while other targets use a less accurate method.
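For example, a hypothetical invocation that forces a vector width of four elements could look like this (flags other than the one quoted above are only illustrative):

    clang -O3 -mllvm -vectorize-loops -mllvm -force-vector-width=4 -S bar.c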

The work on the Loop Vectorizer is not complete and the vectorizer has a long way to go. We plan to add additional vectorization features such as automatic alignment of buffers, vectorization of function calls and support for user pragmas. We also plan to improve the quality of the generated code.

Posted by Nadav Rotem at 10:12 AM

Labels: codegen, new-in-llvm-3.3, optimization
