Secondary Development Based on Gemini-fullstack-langgraph-quickstart

I. Project Overview

To get the source code: follow the WeChat official account 英博的AI实验室 and reply with "deepresearch".

In one sentence: Deepresearch Fullstack LangGraph Quickstart is a secondary development of the gemini-deepresearch project: a full-stack example application with a React frontend and a LangGraph agent backend. It demonstrates how to combine self-hosted or public LLMs with the LangGraph framework to build a conversational AI system with "research augmentation". Its core highlight: the agent dynamically generates search queries, automatically calls Google Search for web retrieval, reflects on the results to find knowledge gaps, and iterates over multiple rounds until it produces a high-quality, citation-backed answer.

The project is a good starting point for developers who want to get up to speed with LangGraph agent development, understand the frontend/backend collaboration pattern, and explore LLM applications in automated research scenarios.

  • Project demo



II. Core Features

  • 💬 Decoupled full-stack architecture (React frontend + LangGraph/FastAPI backend)
  • 🧠 LangGraph-driven agent capable of complex reasoning and multi-round research
  • 🔍 The LLM dynamically generates search queries and automatically calls the Google Search API
  • 🌐 Web retrieval plus reflection: the agent automatically detects and fills knowledge gaps
  • 📄 Produces high-quality answers with citations
  • 🔄 Backend supports plugging in any LLM via a langgraph-llm-compatible interface for private customization, with LangSmith tracing and debugging
  • 🔄 Frontend likewise supports any LLM and works in concert with the backend; hot reload during development speeds up iteration
  • 🐳 One-command Docker deployment, ready for production

III. Project Structure and Architecture

The project uses a typical decoupled frontend/backend architecture:

User browser
  → (HTTP/WebSocket) Frontend: React/Vite
    → (REST API/WS) Backend: FastAPI + LangGraph agent
      → External services: Google Search API / Google Gemini
      → Persistence & message queue: database / Redis
  • frontend/: React + Vite application handling user interaction and rendering
  • backend/: FastAPI service whose core is the LangGraph agent
  • docker-compose.yml / Dockerfile: one-command containerized deployment
  • Others: Makefile, LICENSE, development configuration, etc.

IV. Backend Implementation (LangGraph Agent Workflow)

1. Core agent flow

The agent's main logic lives in backend/src/agent/graph.py; its workflow is:

flowchart TD
    Start([User question])
    QGen[Generate initial search queries]
    WebSearch[Web search]
    Reflect[Reflection & knowledge-gap analysis]
    Loop{Sufficient info, or max rounds reached?}
    FollowUp[Generate follow-up queries]
    Finalize[Compose final answer with citations]
    End([Return answer])

    Start --> QGen --> WebSearch --> Reflect --> Loop
    Loop --No--> FollowUp --> WebSearch
    Loop --Yes--> Finalize --> End
Main nodes
  • Query generation: calls the LLM to derive a set of optimized search queries from the user's question.
  • Web search: for each query, calls the Google Search API + Gemini LLM to fetch page content and summaries.
  • Reflection & gap analysis: evaluates the current results, decides whether the information is sufficient, and generates follow-up queries for any gaps.
  • Iteration: if the information is insufficient, the agent automatically enters another round of search and reflection until the condition is met or the round limit is reached.
  • Answer generation: synthesizes all retrieved material and calls the Gemini LLM to produce the final, citation-backed answer.
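The loop above can be sketched in plain Python. The stub functions below (`generate_queries`, `web_search`, `reflect`) are illustrative placeholders standing in for the project's LLM and search nodes, not its actual API:

```python
# Minimal sketch of the agent's research loop. The stubs below are
# placeholders for the real LLM / Google Search nodes.

MAX_LOOPS = 3  # maximum number of research rounds (illustrative)

def generate_queries(question):
    # Real node: calls an LLM; here we return a fixed list.
    return [f"search: {question}"]

def web_search(queries):
    # Real node: calls the Google Search API + an LLM for summaries.
    return [f"summary for '{q}'" for q in queries]

def reflect(summaries):
    # Real node: asks the LLM whether the evidence is sufficient and,
    # if not, for follow-up queries. Here: a trivial heuristic.
    sufficient = len(summaries) >= 2
    follow_ups = [] if sufficient else ["follow-up query"]
    return sufficient, follow_ups

def research(question):
    queries = generate_queries(question)
    summaries = []
    for _ in range(MAX_LOOPS):
        summaries += web_search(queries)
        sufficient, follow_ups = reflect(summaries)
        if sufficient or not follow_ups:
            break
        queries = follow_ups
    # The final answer node would synthesize these with citations.
    return summaries

print(research("what is LangGraph?"))
```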

2. Key techniques

  • LangGraph: defines the agent's directed-graph workflow, with parallel nodes, conditional edges, and state management.
  • LLM: self-hosted or public LLMs provide query generation, summarization, reflection, and answer writing.
  • Google Search API: external knowledge retrieval that improves the agent's factuality and freshness.
  • Multi-round reasoning and reflection: a loop structure automatically finds and fills knowledge gaps, raising answer quality.
  • Citations and traceability: citation markers are inserted automatically so users can trace each claim to its source.
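As a toy illustration of the citation mechanism, the helper below appends markdown links in the "[label](url)" style the answer prompt requests; it is not the project's actual implementation:

```python
# Illustrative helper (not the project's code): append markdown citation
# links to a sentence so the claim can be traced to its sources.

def cite(text, sources):
    """sources: list of (label, url) pairs collected during web search."""
    marks = " ".join(f"[{label}]({url})" for label, url in sources)
    return f"{text} {marks}" if marks else text

print(cite("LangGraph supports cyclic agent workflows.",
           [("apnews", "https://example.com/source-1")]))
```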

3. Dependencies and environment

  • Python 3.11+
  • Main dependencies (see pyproject.toml):
    • langgraph, langchain, langchain-google-genai, fastapi, google-genai, python-dotenv
    • Dev dependencies: mypy, ruff, pytest

V. Frontend Implementation and Stack

  • React 19 + Vite: modern, high-performance frontend tooling
  • Tailwind CSS: utility-first CSS for rapid styling
  • Shadcn UI (built on Radix UI): rich UI component library
  • react-markdown: Markdown rendering
  • react-router-dom: routing
  • @langchain/langgraph-sdk: frontend-to-agent communication
  • TypeScript: type-safe development

The frontend is responsible for:

  • User input and message display
  • Communication with the backend API (REST/WS)
  • Result visualization and citation display
  • Component-based structure that is easy to extend

VI. Quick Start and Deployment


1. Quick start

Prerequisites

  • Node.js & npm
  • Python 3.11+
  • A Google Gemini API key, which authorizes use of the Google Search tool
  • A local LLM or an OpenAI-compatible LLM

Steps

# Install backend dependencies
cd backend
pip install .
# Edit .env in this directory and set your Gemini API key
vim .env

# Install frontend dependencies
cd ../frontend
npm install

# Start the dev environment (hot reload on both ends)
# Run from the project root
make dev
# Or run each side separately
# backend: langgraph dev
# frontend: npm run dev

Open the frontend dev URL (e.g. http://localhost:5173/app); the backend API listens on http://localhost:2024 by default.
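A minimal backend/.env might look like the sketch below. GEMINI_API_KEY matches the variable used by the deployment command in the next section; LANGSMITH_API_KEY (for tracing) is an assumed name based on the LangSmith support mentioned above — check the repo's .env.example for the authoritative variables:

```bash
# backend/.env (sketch; verify names against the repo's .env.example)
GEMINI_API_KEY=your_gemini_api_key
# Optional: enables LangSmith tracing/debugging (assumed variable name)
LANGSMITH_API_KEY=your_langsmith_api_key
```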

2. Production deployment

  • One-command Docker deployment; requires a Gemini API key and a LangSmith API key
  • The backend depends on Redis (message streaming) and Postgres (persistence)
  • In production the backend can serve the frontend's static assets directly

# Build the image
docker build -t gemini-fullstack-langgraph -f Dockerfile .

# Start the services
GEMINI_API_KEY=<your_gemini_api_key> LANGSMITH_API_KEY=<your_langsmith_api_key> docker-compose up

Then open http://localhost:8123/app/.
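For orientation, a docker-compose topology consistent with the dependencies above might look like this sketch. The repo ships its own docker-compose.yml; the image tags, service names, and port mapping here are assumptions, not that file's actual contents:

```yaml
# Hypothetical sketch only — consult the repo's docker-compose.yml.
services:
  redis:                      # message streaming
    image: redis:7
  postgres:                   # persistence
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: postgres
  app:                        # backend + bundled frontend assets
    image: gemini-fullstack-langgraph
    ports:
      - "8123:8000"           # host port 8123 as in the URL above; container port assumed
    environment:
      GEMINI_API_KEY: ${GEMINI_API_KEY}
      LANGSMITH_API_KEY: ${LANGSMITH_API_KEY}
    depends_on: [redis, postgres]
```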


VII. Highlights and Advantages

  • Research-augmented agent: automated multi-round search and reflection noticeably improve answer factuality and depth
  • LangGraph workflow: flexible agent reasoning flows with complex branching and state management
  • Integrated full stack: decoupled frontend and backend, easy to extend and fork
  • Citation mechanism: traceable answers that build user trust
  • Modern frontend experience: responsive UI, smooth interaction

VIII. Use Cases and Limitations

Use cases

  • Automated research/Q&A systems that need strong factuality and traceability
  • Enterprise knowledge retrieval, academic research assistants, intelligent customer service, etc.
  • A teaching and reference example for LangGraph agent development

Limitations and room for improvement

  • The project is currently limited to the DeepResearch scenario with a single tool (Google Search), and a Gemini API key must be obtained in advance (https://aistudio.google.com/).

IX. Summary

The Deepresearch Fullstack LangGraph Quickstart project builds on modern full-stack technology as a secondary development of Gemini Deepresearch. Combining a LangGraph agent with LLMs, it demonstrates a complete implementation path for "research-augmented" conversational AI. With its clear architecture and complete feature set, it suits learning, secondary development, and real applications. Future work could add multilingual support, plugin extensions, and more private-deployment options.


X. References and Further Reading

Appendix

The Key Prompts

# ------- Query decomposition prompt
query_writer_instructions = """Your goal is to generate sophisticated and diverse web search queries. These queries are intended for an advanced automated web research tool capable of analyzing complex results, following links, and synthesizing information.

Instructions:
- Always prefer a single search query, only add another query if the original question requests multiple aspects or elements and one query is not enough.
- Each query should focus on one specific aspect of the original question.
- Don't produce more than {number_queries} queries.
- Queries should be diverse, if the topic is broad, generate more than 1 query.
- Don't generate multiple similar queries, 1 is enough.
- Query should ensure that the most current information is gathered. The current date is {current_date}.

Format: 
- Format your response as a JSON object with ALL two of these exact keys:
   - "rationale": Brief explanation of why these queries are relevant
   - "query": A list of search queries

Example:

Topic: What revenue grew more last year apple stock or the number of people buying an iphone
```json
{{
    "rationale": "To answer this comparative growth question accurately, we need specific data points on Apple's stock performance and iPhone sales metrics. These queries target the precise financial information needed: company revenue trends, product-specific unit sales figures, and stock price movement over the same fiscal period for direct comparison.",
    "query": ["Apple total revenue growth fiscal year 2024", "iPhone unit sales growth fiscal year 2024", "Apple stock price growth fiscal year 2024"],
}}
```

Context: {research_topic}"""

# ------------ Web search prompt
web_searcher_instructions = """Conduct targeted Google Searches to gather the most recent, credible information on "{research_topic}" and synthesize it into a verifiable text artifact.

Instructions:
- Query should ensure that the most current information is gathered. The current date is {current_date}.
- Conduct multiple, diverse searches to gather comprehensive information.
- Consolidate key findings while meticulously tracking the source(s) for each specific piece of information.
- The output should be a well-written summary or report based on your search findings. 
- Only include the information found in the search results, don't make up any information.

Research Topic:
{research_topic}
"""

# ------------ Reflection prompt
reflection_instructions = """You are an expert research assistant analyzing summaries about "{research_topic}".

Instructions:
- Identify knowledge gaps or areas that need deeper exploration and generate a follow-up query. (1 or multiple).
- If provided summaries are sufficient to answer the user's question, don't generate a follow-up query.
- If there is a knowledge gap, generate a follow-up query that would help expand your understanding.
- Focus on technical details, implementation specifics, or emerging trends that weren't fully covered.

Requirements:
- Ensure the follow-up query is self-contained and includes necessary context for web search.

Output Format:
- Format your response as a JSON object with these exact keys:
   - "is_sufficient": true or false
   - "knowledge_gap": Describe what information is missing or needs clarification
   - "follow_up_queries": Write a specific question to address this gap

Example:
```json
{{
    "is_sufficient": true, // or false
    "knowledge_gap": "The summary lacks information about performance metrics and benchmarks", // "" if is_sufficient is true
    "follow_up_queries": ["What are typical performance benchmarks and metrics used to evaluate [specific technology]?"] // [] if is_sufficient is true
}}
```

Reflect carefully on the Summaries to identify knowledge gaps and produce a follow-up query. Then, produce your output following this JSON format:

Summaries:
{summaries}
"""



# ------ Answer-writing prompt
answer_instructions = """Generate a high-quality answer to the user's question based on the provided summaries.

Instructions:
- The current date is {current_date}.
- You are the final step of a multi-step research process, don't mention that you are the final step. 
- You have access to all the information gathered from the previous steps.
- You have access to the user's question.
- Generate a high-quality answer to the user's question based on the provided summaries and the user's question.
- Include the sources you used from the Summaries in the answer correctly, use markdown format (e.g. [apnews](https://vertexaisearch.cloud.google.com/id/1-0)). THIS IS A MUST.

User Context:
- {research_topic}

Summaries:
{summaries}"""
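Since the query and reflection prompts both ask the model for JSON (often wrapped in a ```json fence, as in the examples above), the caller has to strip the fence before parsing. A minimal sketch, not the project's actual parsing code:

```python
import json

def parse_reflection(raw: str):
    """Parse the reflection node's JSON reply (possibly fenced) into
    (is_sufficient, follow_up_queries). Illustrative only."""
    raw = raw.strip()
    if raw.startswith("```"):
        # Drop the opening ```json line and the closing ``` line.
        raw = raw.split("\n", 1)[1].rsplit("```", 1)[0]
    data = json.loads(raw)
    return data["is_sufficient"], data.get("follow_up_queries", [])

reply = ('```json\n{"is_sufficient": false, "knowledge_gap": "no benchmarks", '
         '"follow_up_queries": ["benchmark query"]}\n```')
sufficient, queries = parse_reflection(reply)
```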

