SurfSense API Documentation: A Guide to Building Custom AI Applications
1. Introduction
SurfSense is a highly customizable AI research agent, similar to NotebookLM or Perplexity, that can connect to external data sources such as search engines (Tavily), Slack, and Notion. This API documentation describes how to use SurfSense's RESTful API to build custom AI applications, covering integration with external data sources, document processing, and chat interaction.
1.1 Core Features
- Multi-source data integration: connectors for GitHub, Slack, Notion, and other external data sources
- Intelligent document processing: upload, parse, and index documents of many types
- Conversational AI: chat based on retrieval-augmented generation (RAG)
- Custom LLM configuration: configure different types of language models
- Flexible search spaces: organize and manage collections of data sources
2. Quick Start
2.1 Environment Setup
# Clone the repository
git clone https://gitcode.com/GitHub_Trending/su/SurfSense
# Enter the project directory
cd SurfSense
# Start the backend service
cd surfsense_backend
uvicorn app.app:app --host 0.0.0.0 --port 8000
2.2 API Basics
- Base URL: http://localhost:8000/api/v1
- API version: v1
- Data format: JSON
- Authentication: JWT token
2.3 Your First API Call
# Obtain an authentication token
curl -X POST "http://localhost:8000/auth/jwt/login" \
-H "Content-Type: application/json" \
-d '{"username": "your_email@example.com", "password": "your_password"}'
# Create a search space
curl -X POST "http://localhost:8000/api/v1/searchspaces/" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"name": "My First Search Space", "description": "A demo search space"}'
3. Authentication and Authorization
SurfSense uses JWT (JSON Web Token) authentication to secure API calls.
3.1 Obtaining an Authentication Token
POST /auth/jwt/login
Request body:
{
"username": "user@example.com",
"password": "secure_password"
}
Response:
{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer"
}
3.2 Using the Authentication Token
Every API request must include the token in the HTTP Authorization header:
GET /api/v1/chats/
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
3.3 Token Expiration and Refresh
JWT tokens are valid for 24 hours. Once a token expires, log in again to obtain a new one.
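A minimal Python sketch of this flow, assuming the JSON login body shown in 3.1; a 401 response is treated as an expired token and triggers a single re-login:

import requests

BASE = "http://localhost:8000"

def login(username, password):
    # POST /auth/jwt/login returns {"access_token": "...", "token_type": "bearer"}
    resp = requests.post(f"{BASE}/auth/jwt/login",
                         json={"username": username, "password": password})
    resp.raise_for_status()
    return resp.json()["access_token"]

def authed_get(path, token, username, password):
    # Retry once with a fresh token if the current one has expired (401).
    resp = requests.get(f"{BASE}/api/v1{path}",
                        headers={"Authorization": f"Bearer {token}"})
    if resp.status_code == 401:
        token = login(username, password)
        resp = requests.get(f"{BASE}/api/v1{path}",
                            headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    return resp.json(), token

For example, authed_get("/chats/", token, email, password) re-authenticates transparently once the 24-hour token lifetime has passed.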
4. Core API Endpoints
4.1 Chat API
4.1.1 Create a Chat Session
POST /api/v1/chats/
Request body:
{
"type": "RESEARCH",
"title": "AI Research Chat",
"search_space_id": 1,
"messages": []
}
Response:
{
"id": 1,
"type": "RESEARCH",
"title": "AI Research Chat",
"search_space_id": 1,
"messages": [],
"created_at": "2025-09-08T10:30:00Z"
}
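A short Python sketch of this call; YOUR_TOKEN and the search space id 1 are placeholders for values obtained in Sections 3 and 4.5:

import requests

token = "YOUR_TOKEN"  # from /auth/jwt/login (Section 3.1)
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

payload = {
    "type": "RESEARCH",
    "title": "AI Research Chat",
    "search_space_id": 1,
    "messages": [],
}
resp = requests.post("http://localhost:8000/api/v1/chats/", json=payload, headers=headers)
resp.raise_for_status()
chat_id = resp.json()["id"]  # usable later with GET /api/v1/chats/{chat_id}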
4.1.2 Send a Chat Message
POST /api/v1/chat
Request body:
{
"messages": [
{
"role": "user",
"content": "What's the latest research on LLMs?"
}
],
"data": {
"search_space_id": 1,
"research_mode": "deep",
"selected_connectors": ["github", "slack"]
}
}
Response: StreamingResponse (streamed)
4.1.3 Retrieve Chat History
GET /api/v1/chats/{chat_id}
Response:
{
"id": 1,
"type": "RESEARCH",
"title": "AI Research Chat",
"search_space_id": 1,
"messages": [
{
"role": "user",
"content": "What's the latest research on LLMs?"
},
{
"role": "assistant",
"content": "Recent research on LLMs has focused on several key areas..."
}
],
"created_at": "2025-09-08T10:30:00Z"
}
4.2 Documents API
4.2.1 Upload Documents
POST /api/v1/documents/fileupload
Request body:
search_space_id: 1 (form field)
files: [file1.pdf, file2.docx] (file uploads)
Response:
{
"message": "Files uploaded for processing"
}
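Because this endpoint takes multipart form data rather than JSON, do not set a Content-Type header yourself. A minimal Python sketch, assuming the field names shown above (search_space_id, files):

import requests

token = "YOUR_TOKEN"
url = "http://localhost:8000/api/v1/documents/fileupload"

with open("file1.pdf", "rb") as f1, open("file2.docx", "rb") as f2:
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {token}"},   # requests builds the multipart boundary itself
        data={"search_space_id": 1},                    # form field
        files=[("files", ("file1.pdf", f1)), ("files", ("file2.docx", f2))],  # repeated "files" parts
    )
resp.raise_for_status()
print(resp.json())  # {"message": "Files uploaded for processing"}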
4.2.2 Create a URL Document
POST /api/v1/documents/
Request body:
{
"search_space_id": 1,
"document_type": "CRAWLED_URL",
"content": ["https://example.com/research-paper.pdf"]
}
Response:
{
"message": "Documents processed successfully"
}
4.2.3 List Documents
GET /api/v1/documents/?search_space_id=1
Response:
[
{
"id": 1,
"title": "Research Paper on LLMs",
"document_type": "CRAWLED_URL",
"document_metadata": {
"url": "https://example.com/research-paper.pdf"
},
"content": "...",
"created_at": "2025-09-08T11:45:00Z",
"search_space_id": 1
}
]
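A short Python sketch that registers a URL document (4.2.2) and then lists the documents in the same search space (4.2.3):

import requests

token = "YOUR_TOKEN"
base = "http://localhost:8000/api/v1"
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

# Register a URL to be crawled and indexed
create = requests.post(f"{base}/documents/", headers=headers, json={
    "search_space_id": 1,
    "document_type": "CRAWLED_URL",
    "content": ["https://example.com/research-paper.pdf"],
})
create.raise_for_status()

# List all documents in search space 1
docs = requests.get(f"{base}/documents/", headers=headers, params={"search_space_id": 1})
docs.raise_for_status()
for doc in docs.json():
    print(doc["id"], doc["document_type"], doc["title"])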
4.3 Connectors API
4.3.1 Create a GitHub Connector
POST /api/v1/search-source-connectors/
Request body:
{
"name": "My GitHub Connector",
"connector_type": "GITHUB_CONNECTOR",
"is_indexable": true,
"config": {
"GITHUB_PAT": "ghp_your_github_personal_access_token",
"REPOSITORIES": ["owner/repo1", "owner/repo2"]
}
}
Response:
{
"id": 1,
"name": "My GitHub Connector",
"connector_type": "GITHUB_CONNECTOR",
"is_indexable": true,
"config": {
"GITHUB_PAT": "ghp_your_github_personal_access_token",
"REPOSITORIES": ["owner/repo1", "owner/repo2"]
},
"user_id": "550e8400-e29b-41d4-a716-446655440000",
"created_at": "2025-09-08T13:00:00Z",
"last_indexed_at": null
}
4.3.2 Index Connector Content
POST /api/v1/search-source-connectors/{connector_id}/index?search_space_id=1
Response:
{
"message": "GitHub indexing started in the background.",
"connector_id": 1,
"search_space_id": 1,
"indexing_from": "2024-09-08",
"indexing_to": "2025-09-08"
}
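A Python sketch that chains the two calls above: create the connector, then start background indexing into a search space. The token, PAT, and repository names are placeholders:

import requests

token = "YOUR_TOKEN"
base = "http://localhost:8000/api/v1"
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

# 1. Create the connector (only one connector per type per user; see the 409 error in Section 6)
resp = requests.post(f"{base}/search-source-connectors/", headers=headers, json={
    "name": "My GitHub Connector",
    "connector_type": "GITHUB_CONNECTOR",
    "is_indexable": True,
    "config": {
        "GITHUB_PAT": "ghp_your_github_personal_access_token",
        "REPOSITORIES": ["owner/repo1", "owner/repo2"],
    },
})
resp.raise_for_status()
connector_id = resp.json()["id"]

# 2. Kick off background indexing into search space 1
index = requests.post(f"{base}/search-source-connectors/{connector_id}/index",
                      headers=headers, params={"search_space_id": 1})
index.raise_for_status()
print(index.json()["message"])  # "GitHub indexing started in the background."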
4.4 LLM Configuration API
4.4.1 Create an LLM Configuration
POST /api/v1/llm-configs/
Request body:
{
"name": "GPT-4 Turbo",
"model_name": "gpt-4-turbo",
"api_base": "https://api.openai.com/v1",
"api_key": "sk_your_openai_api_key",
"temperature": 0.7,
"max_tokens": 4096,
"system_prompt": "You are a helpful AI assistant specializing in research."
}
Response:
{
"id": 1,
"name": "GPT-4 Turbo",
"model_name": "gpt-4-turbo",
"api_base": "https://api.openai.com/v1",
"api_key": "sk_your_openai_api_key",
"temperature": 0.7,
"max_tokens": 4096,
"system_prompt": "You are a helpful AI assistant specializing in research.",
"user_id": "550e8400-e29b-41d4-a716-446655440000"
}
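Creating an LLM configuration is a plain JSON POST; a minimal sketch with placeholder credentials:

import requests

token = "YOUR_TOKEN"
resp = requests.post(
    "http://localhost:8000/api/v1/llm-configs/",
    headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    json={
        "name": "GPT-4 Turbo",
        "model_name": "gpt-4-turbo",
        "api_base": "https://api.openai.com/v1",
        "api_key": "sk_your_openai_api_key",
        "temperature": 0.7,
        "max_tokens": 4096,
        "system_prompt": "You are a helpful AI assistant specializing in research.",
    },
)
resp.raise_for_status()
print(resp.json()["id"])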
4.5 Search Spaces API
4.5.1 Create a Search Space
POST /api/v1/searchspaces/
Request body:
{
"name": "AI Research Space",
"description": "A search space for AI research papers and code"
}
Response:
{
"id": 1,
"name": "AI Research Space",
"description": "A search space for AI research papers and code",
"user_id": "550e8400-e29b-41d4-a716-446655440000",
"created_at": "2025-09-08T09:15:00Z"
}
5. API Call Flow
5.1 Typical Application Flow
A typical application first authenticates (Section 3), creates a search space (4.5), adds content by uploading documents or registering connectors (4.2, 4.3), triggers connector indexing (4.3.2), and then chats against the search space (4.1), as sketched below.
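A compact end-to-end sketch of this flow, using only the endpoints documented in Sections 3 and 4; credentials and URLs are placeholders:

import requests

BASE = "http://localhost:8000"
API = f"{BASE}/api/v1"

# 1. Authenticate (3.1)
token = requests.post(f"{BASE}/auth/jwt/login",
                      json={"username": "user@example.com", "password": "secure_password"}
                      ).json()["access_token"]
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

# 2. Create a search space (4.5.1)
space = requests.post(f"{API}/searchspaces/", headers=headers,
                      json={"name": "AI Research Space", "description": "Demo"}).json()

# 3. Add content, e.g. a crawled URL document (4.2.2); connectors would be created and indexed here as well (4.3)
requests.post(f"{API}/documents/", headers=headers, json={
    "search_space_id": space["id"],
    "document_type": "CRAWLED_URL",
    "content": ["https://example.com/research-paper.pdf"],
}).raise_for_status()

# 4. Chat against the search space with a streaming response (4.1.2)
with requests.post(f"{API}/chat", headers=headers, stream=True, json={
    "messages": [{"role": "user", "content": "Summarize my documents"}],
    "data": {"search_space_id": space["id"], "research_mode": "deep"},
}) as resp:
    for chunk in resp.iter_content(chunk_size=1024):
        if chunk:
            print(chunk.decode("utf-8"), end="", flush=True)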
5.2 Document Processing Flow
File uploads return immediately ("Files uploaded for processing") and are parsed and indexed in the background; once processing completes, the documents appear in the document list (4.2.3) and become available for retrieval during chat.
6. Error Handling
6.1 Common Error Codes
| Status code | Description | Resolution |
|---|---|---|
| 400 | Bad request | Check the request parameters and format |
| 401 | Unauthorized | Verify that the authentication token is valid |
| 403 | Forbidden | Verify that the user may access the resource |
| 404 | Not found | Confirm that the resource ID is correct |
| 409 | Conflict | Check for uniqueness violations (e.g. a duplicate connector type) |
| 422 | Validation error | Check that the request data satisfies the validation rules |
| 500 | Server error | Check the server logs for details |
| 503 | Service unavailable | Check the database connection and external services |
6.2 Error Response Format
{
"detail": "A connector with type GITHUB_CONNECTOR already exists. Each user can have only one connector of each type."
}
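Error bodies carry a single detail field, so a small helper (a sketch, not part of the API) can surface it before raising:

import requests

def request_or_detail(method, url, **kwargs):
    # Perform a request and raise with the API's "detail" message on failure.
    resp = requests.request(method, url, **kwargs)
    if not resp.ok:
        try:
            detail = resp.json().get("detail", resp.text)
        except ValueError:          # body was not JSON
            detail = resp.text
        raise RuntimeError(f"{resp.status_code}: {detail}")
    return resp

For example, posting a second GITHUB_CONNECTOR would raise RuntimeError with "409: A connector with type GITHUB_CONNECTOR already exists...".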
7. Sample Code
7.1 Python SDK Example
import requests

class SurfSenseAPI:
    def __init__(self, base_url, token):
        self.base_url = base_url
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }

    def create_search_space(self, name, description):
        url = f"{self.base_url}/searchspaces/"
        data = {
            "name": name,
            "description": description
        }
        response = requests.post(url, json=data, headers=self.headers)
        response.raise_for_status()
        return response.json()

    def send_chat_message(self, search_space_id, message):
        # Streams the assistant's reply chunk by chunk
        url = f"{self.base_url}/chat"
        data = {
            "messages": [{"role": "user", "content": message}],
            "data": {"search_space_id": search_space_id}
        }
        response = requests.post(url, json=data, headers=self.headers, stream=True)
        response.raise_for_status()
        for chunk in response.iter_content(chunk_size=1024):
            if chunk:
                yield chunk.decode('utf-8')

# Usage example
api = SurfSenseAPI("http://localhost:8000/api/v1", "your_token_here")

# Create a search space
search_space = api.create_search_space("My AI Research", "Research on AI and machine learning")
search_space_id = search_space["id"]

# Send a chat message
print("AI Response:")
for chunk in api.send_chat_message(search_space_id, "Summarize recent advances in reinforcement learning"):
    print(chunk, end='', flush=True)
7.2 JavaScript Example
class SurfSenseAPI {
constructor(baseUrl, token) {
this.baseUrl = baseUrl;
this.headers = {
"Authorization": `Bearer ${token}`,
"Content-Type": "application/json"
};
}
async createSearchSpace(name, description) {
const response = await fetch(`${this.baseUrl}/searchspaces/`, {
method: 'POST',
headers: this.headers,
body: JSON.stringify({ name, description })
});
if (!response.ok) throw new Error(`HTTP error! Status: ${response.status}`);
return response.json();
}
async *sendChatMessage(searchSpaceId, message) {
const response = await fetch(`${this.baseUrl}/chat`, {
method: 'POST',
headers: this.headers,
body: JSON.stringify({
messages: [{ role: "user", content: message }],
data: { search_space_id: searchSpaceId }
})
});
if (!response.ok) throw new Error(`HTTP error! Status: ${response.status}`);
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
yield decoder.decode(value);
}
}
}
// Usage example
const api = new SurfSenseAPI("http://localhost:8000/api/v1", "your_token_here");
// Create a search space
api.createSearchSpace("My AI Research", "Research on AI and machine learning")
.then(searchSpace => {
const searchSpaceId = searchSpace.id;
console.log("Created search space with ID:", searchSpaceId);
// Send a chat message
console.log("AI Response:");
const chatStream = api.sendChatMessage(
searchSpaceId,
"Summarize recent advances in reinforcement learning"
);
(async () => {
for await (const chunk of chatStream) {
document.getElementById("chat-output").innerHTML += chunk;
}
})();
});
8. Deployment and Configuration
8.1 Environment Variables
Create a .env file with the required environment variables:
# Application settings
SECRET_KEY=your_secret_key_here
NEXT_FRONTEND_URL=http://localhost:3000
AUTH_TYPE=GOOGLE
# Database settings
DATABASE_URL=postgresql+asyncpg://user:password@localhost/surfsense
# LLM settings
LLM_SERVICE=openai
OPENAI_API_KEY=sk_your_openai_api_key
8.2 Server Configuration
Configure the server through environment variables:
# Uvicorn server settings
export UVICORN_HOST=0.0.0.0
export UVICORN_PORT=8000
export UVICORN_LOG_LEVEL=info
export UVICORN_WORKERS=4
8.3 Startup Commands
# Start directly with Uvicorn
uvicorn app.app:app --host 0.0.0.0 --port 8000 --reload
# Or start via the Python script
python main.py
With the APIs above, you can build feature-rich custom AI applications that use SurfSense to connect to external data sources, process documents intelligently, and provide conversational AI. For more detail, refer to the implementation of each module or open an issue for support.