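CodiumAI PR-Agent Extension Development: Creating Custom Tools and Features
Introduction: Addressing the PR Review Efficiency Bottleneck
Are you still losing hours to pull request (PR) reviews? Manually checking code quality, spotting potential problems, and suggesting improvements are repetitive, time-consuming tasks that eat into developers' creative time. According to CodiumAI's 2024 developer survey, engineers spend an average of 15 hours per week on PR review, and 65% of that time goes to mechanical checks rather than deep thinking.
After reading this article, you will have:
- A complete technical roadmap for building custom PR-Agent tools
- A grasp of the core patterns of prompt engineering and AI interaction
- A practical approach to integrating tools with Git services
- A systematic method for deploying and testing extensions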
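PR-Agent Tool Architecture
Core Component Diagram
Tool Implementation Conventions
All PR-Agent tools follow the same implementation conventions, which keeps them extensible and consistent:
- Initialization: the constructor receives the PR URL, the CLI arguments, and an AI handler
- Core execution flow: the run() method coordinates the whole process
- AI interaction: _prepare_prediction() and _get_prediction() talk to the AI model
- Result handling: the AI-generated content is formatted and published
Taking the PRCodeSuggestions tool as an example, its core lifecycle is:
- Initialize the Git provider and the AI handler
- Fetch and preprocess the PR diff
- Call the AI model to generate code suggestions
- Self-reflect to improve suggestion quality
- Publish inline or summarized suggestions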
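The lifecycle above can be sketched as a small asynchronous skeleton. This is only an illustration of the ordering of the steps, not PR-Agent code: SketchTool, fake_model, and the step names are hypothetical, and the model call is replaced by an offline stub so the example runs on its own.

```python
import asyncio


class SketchTool:
    """Hypothetical tool skeleton; not a PR-Agent class."""

    def __init__(self, pr_url: str, ai_call):
        self.pr_url = pr_url
        self.ai_call = ai_call  # async callable: (system, user) -> str
        self.steps = []         # records the order in which steps ran

    async def run(self):
        diff = self._fetch_diff()
        draft = await self._predict(diff)
        final = await self._reflect(draft)
        return self._publish(final)

    def _fetch_diff(self) -> str:
        self.steps.append("fetch")
        return "+ added line"   # placeholder diff

    async def _predict(self, diff: str) -> str:
        self.steps.append("predict")
        return await self.ai_call("Suggest improvements.", diff)

    async def _reflect(self, draft: str) -> str:
        self.steps.append("reflect")
        return await self.ai_call("Critique and refine.", draft)

    def _publish(self, text: str) -> str:
        self.steps.append("publish")
        return text


async def fake_model(system: str, user: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return f"[{system}] {user}"


tool = SketchTool("https://example.com/pr/1", fake_model)
result = asyncio.run(tool.run())
print(tool.steps)
```

Keeping each step in its own method makes it possible to test the ordering and to swap the model call for a stub, which is the same idea the unit tests later in this article rely on.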
Building a Custom Tool from Scratch
Step 1: Create the Tool Class Structure
Create the file pr_agent/tools/pr_custom_analyzer.py and implement the basic tool structure:
from functools import partial
from typing import Dict

from pr_agent.algo.ai_handlers.base_ai_handler import BaseAiHandler
from pr_agent.algo.ai_handlers.litellm_ai_handler import LiteLLMAIHandler
from pr_agent.git_providers import get_git_provider_with_context
# BaseTool is assumed by this article; confirm it exists in your PR-Agent version.
from pr_agent.tools.base_tool import BaseTool


class PRCustomAnalyzer(BaseTool):
    def __init__(self, pr_url: str, cli_mode=False, args=None,
                 ai_handler: partial[BaseAiHandler,] = LiteLLMAIHandler):
        super().__init__(pr_url, args, ai_handler)
        self.git_provider = get_git_provider_with_context(pr_url)
        self.analysis_results = {}
        # Custom parameter: args is the parsed CLI namespace, if one was given
        self.analysis_depth = getattr(args, "depth", None) or "medium"

    async def run(self):
        """Run the custom analysis flow."""
        try:
            # 1. Gather the data to analyze
            self._prepare_analysis_data()
            # 2. Ask the AI model for a prediction
            prediction = await self._get_analysis_prediction()
            # 3. Process and format the result
            self.analysis_results = self._process_prediction(prediction)
            # 4. Publish the analysis
            self._publish_results()
            return self.analysis_results
        except Exception as e:
            self._handle_error(e)
            return None

    def _prepare_analysis_data(self):
        """Collect the PR data needed for the analysis."""
        self.pr_diff = self.git_provider.get_pr_diff()
        self.pr_files = self.git_provider.get_files()
        self.pr_description = self.git_provider.get_pr_description()

    async def _get_analysis_prediction(self) -> Dict:
        """Get the analysis result from the AI model."""
        variables = {
            # The user prompt template references {{title}}; with StrictUndefined
            # a missing variable raises at render time.
            # Assumes the provider exposes pr.title.
            "title": self.git_provider.pr.title,
            "diff": self.pr_diff,
            "description": self.pr_description,
            "analysis_depth": self.analysis_depth
        }
        system_prompt = self._load_system_prompt()
        user_prompt = self._load_user_prompt(variables)
        response, _ = await self.ai_handler.chat_completion(
            model="gpt-4",  # pick the model that fits your needs
            system=system_prompt,
            user=user_prompt
        )
        return self._parse_response(response)

    # Other helper methods...
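Step 2: Design the AI Prompt Templates
Create the prompt files in the pr_agent/settings/custom_analyzer/ directory. For get_settings() to see them, they presumably also have to be added to the settings files loaded in pr_agent/config_loader.py.
system_prompt.toml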
[custom_analyzer_prompt]
system="""You are a specialized code analyzer focused on identifying architectural patterns in PRs.
Your task is to analyze the provided code diff and identify:
1. Design patterns used or misused
2. Architectural inconsistencies
3. Potential scalability issues
Provide your analysis in a structured YAML format with sections for each pattern identified.
"""
user_prompt.toml
[custom_analyzer_prompt]
user="""Analyze the following PR for architectural patterns:
Title: {{title}}
Description: {{description}}
Diff:
```diff
{{diff}}
```
Analysis depth: {{analysis_depth}}
Respond with YAML containing:
- patterns_identified: list of patterns, each with name, description, file, and line
- issues_found: list of architectural issues, each with title, description, and severity (1-10)
"""
Step 3: Implement the Core Logic
Complete the core methods of the PRCustomAnalyzer class:
```python
# Add these imports at the top of pr_custom_analyzer.py
import yaml
from jinja2 import Environment, StrictUndefined

from pr_agent.config_loader import get_settings
from pr_agent.log import get_logger

# Methods of PRCustomAnalyzer (indent them inside the class body)
def _load_system_prompt(self):
    """Load the system prompt template."""
    return get_settings().custom_analyzer_prompt.system

def _load_user_prompt(self, variables):
    """Load and render the user prompt."""
    environment = Environment(undefined=StrictUndefined)
    user_prompt_template = get_settings().custom_analyzer_prompt.user
    return environment.from_string(user_prompt_template).render(variables)

def _parse_response(self, response: str) -> Dict:
    """Parse the AI response into structured data."""
    try:
        # Parse the response with PyYAML
        return yaml.safe_load(response)
    except yaml.YAMLError as e:
        get_logger().error(f"Failed to parse AI response: {e}")
        return {"error": "Invalid response format"}

def _publish_results(self):
    """Publish the analysis as a PR comment."""
    if self.analysis_results and "patterns_identified" in self.analysis_results:
        comment = self._format_comment()
        self.git_provider.publish_comment(comment)

def _format_comment(self) -> str:
    """Format the analysis results as a Markdown comment."""
    comment = "## 📐 Architectural Analysis\n\n"
    # Identified patterns
    comment += "### Identified Patterns\n"
    for pattern in self.analysis_results.get("patterns_identified", []):
        comment += f"- **{pattern['name']}**: {pattern['description']}\n"
        comment += f"  - Location: `{pattern['file']}:{pattern['line']}`\n"
    # Issues found
    comment += "\n### Architectural Issues\n"
    for issue in self.analysis_results.get("issues_found", []):
        # One red marker per 3 severity points, plus a yellow one if the remainder is 2
        severity = "🔴" * (issue['severity'] // 3) + "🟡" * ((issue['severity'] % 3) // 2)
        comment += f"- {severity} **{issue['title']}**\n"
        comment += f"  - {issue['description']}\n"
    return comment
```
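Models often wrap structured output in a Markdown code fence, which a YAML parser typically cannot read as the intended mapping. A minimal sketch of a pre-processing step that could run before _parse_response is shown below; strip_code_fence is a hypothetical helper, not part of PR-Agent, and it only handles a response that consists of a single fenced block.

```python
import re

FENCE = "`" * 3  # a Markdown code fence marker, built without typing it literally
_FENCE_RE = re.compile(
    r"^" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(response: str) -> str:
    """Return the body of a single fenced block, or the stripped text unchanged."""
    text = response.strip()
    match = _FENCE_RE.match(text)
    return match.group(1) if match else text


fenced = FENCE + "yaml\npatterns_identified: []\nissues_found: []\n" + FENCE
plain = "patterns_identified: []"
print(strip_code_fence(fenced))
print(strip_code_fence(plain))
```

The cleaned string can then be passed to yaml.safe_load as before; unfenced responses go through untouched.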
Step 4: Register the Tool with the CLI
Modify pr_agent/cli.py to add a CLI command for the new tool:
import argparse
import asyncio
from functools import partial

from pr_agent.algo.ai_handlers.litellm_ai_handler import LiteLLMAIHandler


def add_custom_commands(subparsers):
    """Register custom commands; subparsers comes from parser.add_subparsers()."""
    # Add the custom analyzer command
    custom_analyzer_parser = subparsers.add_parser(
        'analyze-architecture',
        help='Run architectural analysis on PR',
        formatter_class=argparse.RawTextHelpFormatter
    )
    custom_analyzer_parser.add_argument('--pr-url', required=True, help='PR URL to analyze')
    custom_analyzer_parser.add_argument(
        '--depth',
        choices=['shallow', 'medium', 'deep'],
        default='medium',
        help='Analysis depth (default: medium)'
    )
    custom_analyzer_parser.set_defaults(func=run_custom_analyzer)


def run_custom_analyzer(args):
    from pr_agent.tools.pr_custom_analyzer import PRCustomAnalyzer
    handler = PRCustomAnalyzer(
        pr_url=args.pr_url,
        args=args,
        ai_handler=partial(LiteLLMAIHandler)
    )
    asyncio.run(handler.run())
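The dispatch mechanism used here, argparse sub-commands combined with set_defaults(func=...), can be tried in isolation. The sketch below is self-contained and does not import PR-Agent; run_analyzer and build_parser are hypothetical stand-ins, and the handler only returns a string instead of running an analysis.

```python
import argparse


def run_analyzer(args):
    """Stand-in for the real handler; reports what it would analyze."""
    return f"analyzing {args.pr_url} at depth {args.depth}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pr-agent-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)
    analyze = subparsers.add_parser("analyze-architecture")
    analyze.add_argument("--pr-url", required=True)
    analyze.add_argument("--depth", choices=["shallow", "medium", "deep"],
                         default="medium")
    # set_defaults attaches the handler so the caller can run args.func(args)
    analyze.set_defaults(func=run_analyzer)
    return parser


args = build_parser().parse_args(
    ["analyze-architecture", "--pr-url", "https://example.com/pr/1"]
)
message = args.func(args)
print(message)
```

Note that argparse turns the --pr-url flag into the attribute args.pr_url, which is why the handler reads it under that name.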
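Prompt Engineering Best Practices
Two-Stage Prompting
PR-Agent uses a two-stage prompting pattern to improve the quality of AI output:
- Generation stage: guide the AI to produce an initial result
- Reflection stage: have the AI evaluate and refine its own result
Example generation-stage prompt: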
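A minimal sketch of chaining the two stages follows. It assumes the reflection stage returns a 1-10 rating that can be used to decide whether a draft is accepted; the `score:` output format, the min_score threshold, and the names stub_model, extract_score, and analyze are all assumptions made for illustration, and the model is replaced by an offline stub.

```python
import asyncio
import re


async def stub_model(system: str, user: str) -> str:
    """Offline stand-in for the model; a real call would go through the AI handler."""
    if "evaluating" in system:
        return "score: 6\nsuggestions: mention the missing cache layer"
    return "patterns_identified: [Repository]"


def extract_score(reflection: str) -> int:
    """Pull the 1-10 rating out of the reflection text (0 if absent)."""
    match = re.search(r"score:\s*(\d+)", reflection)
    return int(match.group(1)) if match else 0


async def analyze(diff: str, min_score: int = 7) -> dict:
    # Stage 1: generation
    draft = await stub_model("You are a specialized code analyzer.", diff)
    # Stage 2: reflection on the draft produced by stage 1
    critique = await stub_model(
        "You are a code review expert evaluating an architectural analysis.",
        draft,
    )
    score = extract_score(critique)
    return {"draft": draft, "score": score, "accepted": score >= min_score}


outcome = asyncio.run(analyze("+ class UserRepository: ..."))
print(outcome)
```

In a real tool, a rejected draft could be regenerated or published with a lower-confidence label; that policy is a design choice the article leaves open.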
[custom_analyzer_prompt]
system="""You are a specialized code analyzer focused on identifying architectural patterns in PRs.
Your task is to analyze the provided code diff and identify:
1. Design patterns used or misused
2. Architectural inconsistencies
3. Potential scalability issues
Provide your analysis in a structured YAML format with sections for each pattern identified.
"""
Example reflection-stage prompt:
[custom_analyzer_reflect_prompt]
system="""You are a code review expert evaluating an architectural analysis.
Your task is to:
1. Verify if all critical architectural patterns are identified
2. Check if the severity of issues is appropriately assessed
3. Ensure the analysis covers scalability implications
Rate the analysis quality from 1-10 and provide improvement suggestions.
"""
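Prompt Optimization Tips
- Specify the output format: define the output structure with Pydantic models
- Set context boundaries: state clearly what the analysis covers and what it should focus on
- Lead with examples: give high-quality examples that show the AI what you need
- Iterate: keep adjusting the prompts based on real output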
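The first tip, a typed output structure, can be illustrated without extra dependencies. The sketch below swaps Pydantic for standard-library dataclasses so that it runs anywhere; with Pydantic the class would derive from BaseModel and field validation would come for free. Issue, schema_text, and parse_issue are hypothetical names, and the fields mirror the issue entries used elsewhere in this article.

```python
from dataclasses import dataclass, fields


@dataclass
class Issue:
    title: str
    description: str
    severity: int  # expected range: 1-10

    def __post_init__(self):
        # Dataclasses do not validate values, so check the range by hand
        if not 1 <= self.severity <= 10:
            raise ValueError(f"severity out of range: {self.severity}")


def schema_text(cls) -> str:
    """Render field names and types as text that can be embedded in a prompt."""
    return "\n".join(f"- {f.name}: {f.type.__name__}" for f in fields(cls))


def parse_issue(raw: dict) -> Issue:
    """Build an Issue from parsed model output; unknown keys raise TypeError."""
    return Issue(**raw)


print(schema_text(Issue))
issue = parse_issue({
    "title": "Tight Coupling",
    "description": "Factory directly instantiates concrete classes",
    "severity": 7,
})
print(issue)
```

Using one definition both to describe the format in the prompt and to validate the reply keeps the two from drifting apart.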
Tool Integration and Deployment
Integrating with Git Services
Deploying as a GitHub Action
Create the file .github/workflows/architecture-analysis.yml:
name: Architectural Analysis
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run architectural analysis
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          python -m pr_agent.cli analyze-architecture \
            --pr-url ${{ github.event.pull_request.html_url }} \
            --depth deep
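Advanced Features
Self-Reflection
PR-Agent's self-reflection mechanism is intended to raise suggestion quality. It can be implemented as follows: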
# Methods of PRCustomAnalyzer (indent them inside the class body)
async def self_reflect_on_analysis(self, initial_analysis: Dict) -> Dict:
    """Reflect on the initial analysis and refine it."""
    # Build the reflection prompt; the YAML keys must match _apply_reflection
    system_prompt = """You are a critic specializing in evaluating architectural analysis.
Your task is to review the initial analysis and identify:
1. Missing architectural patterns
2. Inconsistent severity ratings
3. Unclear or incomplete descriptions
Respond in YAML with the keys missing_patterns and severity_adjustments
(each adjustment has id, new_severity, and reason)."""
    user_prompt = f"Initial analysis:\n{initial_analysis}\n\nPlease critique and improve this analysis."
    # Get the reflection result
    response, _ = await self.ai_handler.chat_completion(
        model="gpt-4",
        system=system_prompt,
        user=user_prompt
    )
    # Apply the suggested improvements
    return self._apply_reflection(initial_analysis, response)

def _apply_reflection(self, initial: Dict, reflection: str) -> Dict:
    """Apply the reflection result to the initial analysis."""
    # Parse the reflection suggestions
    reflection_data = self._parse_response(reflection)
    if not isinstance(reflection_data, dict):
        return initial  # unparseable reflection: keep the initial analysis
    # Add the patterns that were missed
    for pattern in reflection_data.get("missing_patterns") or []:
        initial.setdefault("patterns_identified", []).append(pattern)
    # Adjust issue severities; issues are matched by an "id" field
    for issue_update in reflection_data.get("severity_adjustments") or []:
        for issue in initial.get("issues_found") or []:
            if issue.get("id") == issue_update["id"]:
                issue["severity"] = issue_update["new_severity"]
                issue["reason"] = issue_update["reason"]
    return initial
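The merge step can be exercised on plain dictionaries, without a model in the loop. The following sketch is a standalone variant of the same idea; apply_reflection here is a hypothetical free function that works on already-parsed data and returns a copy instead of mutating its input, and the sample data is invented for illustration.

```python
from copy import deepcopy


def apply_reflection(initial: dict, reflection: dict) -> dict:
    """Merge reflection output into a copy of the initial analysis."""
    merged = deepcopy(initial)
    merged.setdefault("patterns_identified", []).extend(
        reflection.get("missing_patterns", [])
    )
    # Index the adjustments by issue id for direct lookup
    updates = {u["id"]: u for u in reflection.get("severity_adjustments", [])}
    for issue in merged.get("issues_found", []):
        update = updates.get(issue.get("id"))
        if update:
            issue["severity"] = update["new_severity"]
            issue["reason"] = update["reason"]
    return merged


initial = {
    "patterns_identified": [{"name": "Factory"}],
    "issues_found": [{"id": "I1", "title": "Tight Coupling", "severity": 4}],
}
reflection = {
    "missing_patterns": [{"name": "Singleton"}],
    "severity_adjustments": [
        {"id": "I1", "new_severity": 7, "reason": "affects every caller"}
    ],
}
merged = apply_reflection(initial, reflection)
print(merged["issues_found"][0]["severity"])
```

Returning a copy keeps the initial analysis available for comparison or logging, at the cost of a deep copy per reflection pass.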
Incremental Analysis
For large PRs, incremental analysis improves performance:
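import json
import uuid
from datetime import datetime

# Methods of PRCustomAnalyzer (indent them inside the class body).
# get_diff_between_commits(), get_latest_commit(), and
# _get_last_analysis_commit() are assumed helpers that are not shown here.
def _get_incremental_diff(self):
    """Return the diff relative to the last analyzed commit."""
    # Commit hash of the previous analysis, if any
    last_commit = self._get_last_analysis_commit()
    if not last_commit:
        return self.git_provider.get_pr_diff()
    # Diff between the last analyzed commit and the latest one
    return self.git_provider.get_diff_between_commits(
        last_commit, self.git_provider.get_latest_commit())

def _track_analysis_state(self):
    """Track the analysis state for incremental updates."""
    state = {
        "commit_hash": self.git_provider.get_latest_commit(),
        "timestamp": datetime.now().isoformat(),
        "analysis_id": str(uuid.uuid4())
    }
    # Store the state (a file, a database, or a cache would all work)
    with open(".analysis_state.json", "w") as f:
        json.dump(state, f)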
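The article does not show how the stored commit hash is read back. One possible sketch, assuming the same JSON state file and treating a missing or corrupt file as "no previous analysis", is given below; save_state and load_last_commit are hypothetical helpers, and a temporary directory stands in for the repository root.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def save_state(path: Path, commit_hash: str) -> None:
    """Write the last analyzed commit to the state file."""
    path.write_text(json.dumps({"commit_hash": commit_hash}))


def load_last_commit(path: Path) -> Optional[str]:
    """Return the stored commit hash, or None if no usable state exists."""
    try:
        data = json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    return data.get("commit_hash") if isinstance(data, dict) else None


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / ".analysis_state.json"
    before = load_last_commit(state_file)  # no file yet
    save_state(state_file, "abc123")
    after = load_last_commit(state_file)

print(before, after)
```

A file on local disk does not survive between CI runs on fresh runners, so a cache or a marker in a PR comment may be a better store in that setting.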
Testing and Quality Assurance
Unit Testing Strategy
Create the file tests/unittest/test_pr_custom_analyzer.py:
import asyncio
import unittest
from unittest.mock import AsyncMock, Mock, patch

from pr_agent.tools.pr_custom_analyzer import PRCustomAnalyzer


class TestPRCustomAnalyzer(unittest.TestCase):
    def setUp(self):
        # Create mock dependencies
        self.mock_git_provider = Mock()
        self.mock_ai_handler = Mock()
        # Configure mock return values
        self.mock_git_provider.pr.title = "Refactor parser"
        self.mock_git_provider.get_pr_diff.return_value = """
diff --git a/src/utils/parser.py b/src/utils/parser.py
index 1234567..89abcde 100644
--- a/src/utils/parser.py
+++ b/src/utils/parser.py
@@ -10,7 +10,10 @@ class Parser:
     def parse(data):
         result = {}
-        for key in data:
+        # Use comprehension for better performance
+        result = {
+            k: v for k, v in data.items() if v is not None
+        }
         return result
"""
        self.mock_git_provider.get_pr_description.return_value = "Refactor parser to use dict comprehension"
        # Mock the AI response; chat_completion is awaited, so it must be an AsyncMock
        self.mock_ai_handler.chat_completion = AsyncMock(return_value=(
            """{
  "patterns_identified": [
    {
      "name": "Dictionary Comprehension",
      "description": "Using dict comprehension for concise mapping creation",
      "file": "src/utils/parser.py",
      "line": 13
    }
  ],
  "issues_found": []
}""", "stop"
        ))
        # Create the instance under test without contacting a real Git provider
        with patch("pr_agent.tools.pr_custom_analyzer.get_git_provider_with_context",
                   return_value=self.mock_git_provider):
            self.analyzer = PRCustomAnalyzer(
                pr_url="https://github.com/example/repo/pull/123",
                args=Mock(depth="medium"),
                ai_handler=lambda: self.mock_ai_handler
            )

    def test_analysis_flow(self):
        """The analysis flow runs end to end."""
        # run() is a coroutine, so drive it with asyncio.run
        result = asyncio.run(self.analyzer.run())
        # Check the result
        self.assertIsNotNone(result)
        self.assertEqual(len(result["patterns_identified"]), 1)
        self.assertEqual(result["patterns_identified"][0]["name"], "Dictionary Comprehension")
        # Check the arguments of the AI call (they are passed by keyword)
        self.mock_ai_handler.chat_completion.assert_called_once()
        _, kwargs = self.mock_ai_handler.chat_completion.call_args
        self.assertIn("dict comprehension", kwargs["user"])  # the user prompt should carry the PR description

    def test_format_comment(self):
        """The comment formatter renders patterns and issues."""
        # Set up the analysis results
        self.analyzer.analysis_results = {
            "patterns_identified": [
                {
                    "name": "Factory Pattern",
                    "description": "Object creation delegated to factory method",
                    "file": "src/factories/user_factory.py",
                    "line": 42
                }
            ],
            "issues_found": [
                {
                    "title": "Tight Coupling",
                    "description": "Factory directly instantiates concrete classes",
                    "severity": 7
                }
            ]
        }
        # Generate the comment
        comment = self.analyzer._format_comment()
        # Check the comment content
        self.assertIn("## 📐 Architectural Analysis", comment)
        self.assertIn("Factory Pattern", comment)
        self.assertIn("Tight Coupling", comment)
        self.assertIn("🔴🔴", comment)  # severity 7: 7 // 3 = 2 red markers, no yellow
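The marker string that _format_comment builds for a given severity can be checked in isolation. The sketch below copies the formula from that method into a hypothetical standalone function, severity_markers, and prints the result for a few levels.

```python
def severity_markers(severity: int) -> str:
    """Mirror the marker formula used in _format_comment."""
    return "🔴" * (severity // 3) + "🟡" * ((severity % 3) // 2)


for level in (1, 2, 5, 7, 8, 10):
    print(level, repr(severity_markers(level)))
```

With this formula a severity of 7 gives two red markers and no yellow one, 5 gives one red and one yellow, and 1 gives an empty string, so the lowest severity is rendered without any marker.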
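Summary and Next Steps
With the approach described in this article, you have walked through the full process of building a custom PR-Agent tool. The key points are:
- Follow the architecture conventions: inherit from BaseTool and implement the standard methods
- Design prompts carefully: use two-stage prompting to improve AI output quality
- Handle errors thoroughly: keep the tool stable across scenarios
- Test systematically: cover both unit tests and integration tests
Directions for Further Extension
- Multi-model integration: combine code-specific models (such as CodeLlama) to improve analysis accuracy
- Real-time feedback: use a WebSocket connection to report analysis progress live
- Custom rule engine: let teams define their own architectural rules
- Historical analysis: track PR quality metrics over time
The PR-Agent extension ecosystem keeps growing, and contributions of new tools and features are welcome!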
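Appendix: Development Resources
Core API Reference
| Component | Key method | Purpose |
|---|---|---|
| GitProvider | get_pr_diff() | Get the PR code diff |
| GitProvider | publish_comment(comment) | Publish a comment on the PR |
| AIHandler | chat_completion(model, system, user) | Interact with the AI model |
| TokenHandler | calculate_tokens(text) | Estimate the token count of a text |
Setting Up the Development Environment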
# Clone the repository
git clone https://gitcode.com/gh_mirrors/pr/pr-agent
cd pr-agent
# Create a virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt
# Run the tests
pytest tests/
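Contribution Guide
- Fork the repository and create a feature branch
- Implement the feature and add tests
- Make sure all tests pass
- Submit a PR that follows the code conventions
🔍 Extension challenge: try implementing a "technical debt analysis" tool that automatically identifies technical debt introduced by a PR and assesses its impact. Hint: combine code complexity metrics with a maintainability index.
Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.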



