
Agentic in Practice: Building an Intelligent News Summary Agent

[Free download link] agentic: AI agent stdlib that works with any LLM and TypeScript AI SDK. Project URL: https://gitcode.com/GitHub_Trending/ag/agentic

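This article walks through building an intelligent news summary agent with the Agentic framework, focusing on the design and implementation of multi-source news aggregation. It covers how to use the Perigon and Serper tools together, the technical details of summary generation and result formatting, and deployment and monitoring practices for production.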

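Multi-Source News Aggregation Design

When building an intelligent news summary agent, aggregating multiple data sources is what keeps coverage comprehensive, timely, and diverse. Agentic integrates multiple news sources and exposes them through standardized AI function interfaces, which gives the agent a unified data access layer.

Data Source Architecture

Agentic's multi-source news aggregation uses a layered architecture built from the following core components:

[Mermaid diagram: layered architecture of the news data sources]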

Core Data Source Integrations

Perigon News API Integration

Perigon is a dedicated news aggregation platform with rich search and filtering options:

import { PerigonClient } from '@agentic/perigon'

const perigon = new PerigonClient({
  apiKey: process.env.PERIGON_API_KEY
})

// Multi-dimensional news search configuration
const newsSearchConfig = {
  q: 'artificial intelligence',
  category: 'Tech',
  from: '2024-01-01',
  to: '2024-01-31',
  sourceGroup: 'top50tech',
  size: 20,
  sortBy: 'date'
}

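Perigon's advanced filters include:

Filter dimension	Description	Example values
Category	Filter by news topic category	Tech, Business
Time range	Precise date filtering	2024-01-01 to 2024-01-31
Source quality	Filter by outlet reputation	top50tech, top100
Content labels	Exclude low-quality content	Exclude Opinion, Paid News
Location	Geographic relevance	Country code, city name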

Serper Google Search Integration

Serper provides real-time Google search, which helps surface the latest web content:

import { SerperClient } from '@agentic/serper'

const serper = new SerperClient({
  apiKey: process.env.SERPER_API_KEY
})

// Search configuration (the type field selects the kind of search)
const searchConfig = {
  q: 'AI breakthrough latest news',
  type: 'news',
  num: 15,
  gl: 'us',
  hl: 'en'
}

Hacker News Community Integration

For technology news, Hacker News offers community-curated content:

import { HackerNewsClient } from '@agentic/hacker-news'

const hn = new HackerNewsClient()

// Fetch the top stories from the tech community
async function getTechNews() {
  const topStories = await hn.getTopStories()
  const storyDetails = await Promise.all(
    topStories.slice(0, 10).map(id => hn.getItem(id))
  )
  return storyDetails.filter(story => story.type === 'story')
}

Data Aggregation Strategy

Time Weighting

To favor recent news, each item is weighted by its publication time:

function calculateTimeWeight(publishTime: Date): number {
  const now = new Date()
  const hoursDiff = (now.getTime() - publishTime.getTime()) / (1000 * 60 * 60)
  
  // Exponential decay: the weight drops by a factor of e every 24 hours
  return Math.exp(-hoursDiff / 24) * 100
}
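To make the decay curve concrete, the sketch below evaluates the same formula at a few article ages. It adds one assumption that is not in the original: timestamps in the future (for example from clock skew) are clamped to an age of zero so the weight never exceeds 100. The name `timeWeightAt` and its `now` parameter are illustrative only.

```typescript
// Same exponential decay as calculateTimeWeight, but with an injectable
// "now" so the result is reproducible. Assumption: future timestamps are
// clamped to an age of 0 hours.
function timeWeightAt(publishTime: Date, now: Date): number {
  const hoursDiff = Math.max(0, (now.getTime() - publishTime.getTime()) / (1000 * 60 * 60))
  return Math.exp(-hoursDiff / 24) * 100
}

const referenceNow = new Date('2024-01-31T12:00:00Z')
const hoursAgo = (h: number): Date => new Date(referenceNow.getTime() - h * 60 * 60 * 1000)

const freshWeight = timeWeightAt(hoursAgo(0), referenceNow)     // 100
const dayOldWeight = timeWeightAt(hoursAgo(24), referenceNow)   // 100 / e, about 36.8
const weekOldWeight = timeWeightAt(hoursAgo(168), referenceNow) // 100 / e^7, about 0.09
```

With a 24-hour time constant the weight halves roughly every 16.6 hours (24 × ln 2), so week-old articles contribute almost nothing. A larger divisor would suit slower-moving topics.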

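Source Credibility Scoring

Source credibility is assessed along several weighted dimensions:

Dimension	Weight	Criteria
Media reputation	30%	Authority of the outlet, industry recognition
Content quality	25%	Originality, depth of analysis
Timeliness	20%	Publishing speed, update frequency
Community feedback	15%	Share count, comment quality
Historical accuracy	10%	Accuracy of past reporting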
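The table translates directly into a weighted sum. The following is a minimal sketch, assuming each dimension has already been scored on a 0 to 1 scale; how those per-dimension scores are obtained is not specified in the article, and the names `CredibilityScores` and `credibilityScore` are hypothetical.

```typescript
// Per-dimension scores, assumed to be normalized to the range 0..1.
interface CredibilityScores {
  reputation: number         // media reputation
  contentQuality: number     // originality, depth of analysis
  timeliness: number         // publishing speed, update frequency
  communityFeedback: number  // share count, comment quality
  historicalAccuracy: number // accuracy of past reporting
}

// Weights copied from the table above; they sum to 1.
const CREDIBILITY_WEIGHTS: CredibilityScores = {
  reputation: 0.30,
  contentQuality: 0.25,
  timeliness: 0.20,
  communityFeedback: 0.15,
  historicalAccuracy: 0.10
}

// Weighted sum of the per-dimension scores; the result is also in 0..1.
function credibilityScore(scores: CredibilityScores): number {
  const keys = Object.keys(CREDIBILITY_WEIGHTS) as (keyof CredibilityScores)[]
  return keys.reduce((total, key) => total + CREDIBILITY_WEIGHTS[key] * scores[key], 0)
}

const exampleScore = credibilityScore({
  reputation: 0.9,
  contentQuality: 0.8,
  timeliness: 0.6,
  communityFeedback: 0.5,
  historicalAccuracy: 1.0
}) // 0.27 + 0.20 + 0.12 + 0.075 + 0.10 = 0.765
```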

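Deduplication and Clustering

Duplicate content is removed and related articles are clustered into stories:

[Mermaid diagram: deduplication and story clustering flow]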
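The article does not spell out the algorithm, so the following is only one possible sketch: exact duplicates are caught by comparing canonical URLs, and near-duplicates by Jaccard similarity over title words. The `Article` shape, the 0.6 threshold, and all function names are assumptions made for illustration.

```typescript
interface Article {
  title: string
  url: string
}

// Lowercase the title and split it into a set of alphanumeric word tokens.
function titleTokens(title: string): Set<string> {
  return new Set(title.toLowerCase().split(/[^a-z0-9]+/).filter(t => t.length > 0))
}

// Jaccard similarity: size of the intersection divided by size of the union.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0
  a.forEach(token => { if (b.has(token)) shared++ })
  const unionSize = a.size + b.size - shared
  return unionSize === 0 ? 0 : shared / unionSize
}

// Strip the query string, fragment, and trailing slashes so tracking
// parameters do not hide an exact duplicate.
function canonicalUrl(url: string): string {
  return url.split(/[?#]/)[0].replace(/\/+$/, '')
}

// Greedy single-pass clustering: an article joins the first cluster whose
// first member has the same canonical URL or a similar enough title.
function clusterArticles(articles: Article[], threshold = 0.6): Article[][] {
  const clusters: { tokens: Set<string>; url: string; members: Article[] }[] = []
  for (const article of articles) {
    const tokens = titleTokens(article.title)
    const url = canonicalUrl(article.url)
    const match = clusters.find(c => c.url === url || jaccard(c.tokens, tokens) >= threshold)
    if (match) {
      match.members.push(article)
    } else {
      clusters.push({ tokens, url, members: [article] })
    }
  }
  return clusters.map(c => c.members)
}
```

Keeping the first member of each cluster gives a deduplicated list, while the full clusters indicate how widely a story was covered. Title-word overlap is a coarse signal; embeddings or shingling would be more robust at the cost of extra complexity.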

Unified Data Interface

Agentic exposes a unified function-calling interface through AIFunctionSet:

import { createDexterFunctions } from '@agentic/dexter'
import { PerigonClient, SerperClient } from '@agentic/stdlib'

// Create the multi-source news function set
const perigon = new PerigonClient()
const serper = new SerperClient()

const newsFunctions = createDexterFunctions(
  perigon.functions.pick('search_news_stories'),
  serper.functions.pick('serper_google_search')
)

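// Example configuration for function calling
const multiSourceSearch = {
  functions: newsFunctions,
  systemMessage: `You are a news aggregation expert. Given a user query, fetch the latest news from
                  multiple data sources, then merge and deduplicate the results. Prefer authoritative
                  sources, and make sure the news is timely and accurate.`
}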

Performance Optimization

Concurrent Requests

Fetch from all sources in parallel to reduce response time:

async function fetchMultiSourceNews(query: string) {
  const [perigonResults, serperResults, hnResults] = await Promise.all([
    perigon.searchStories({ q: query, size: 10 }),
    serper.searchNews(query),
    getTechNews() // top Hacker News stories
  ])
  
  return { perigonResults, serperResults, hnResults }
}
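One caveat with `Promise.all` is that it rejects as soon as any single source fails, which throws away the results of the sources that succeeded. The sketch below shows one way around that, using a hypothetical helper `fetchAllSources` that turns each rejection into a value. The fetchers are plain functions here, so nothing in it depends on the actual Perigon or Serper response shapes.

```typescript
type SourceResult<T> =
  | { source: string; ok: true; items: T[] }
  | { source: string; ok: false; error: unknown }

// Run every fetcher in parallel and convert rejections into values, so one
// failing source does not discard the results of the others.
async function fetchAllSources<T>(
  fetchers: Record<string, () => Promise<T[]>>
): Promise<SourceResult<T>[]> {
  const names = Object.keys(fetchers)
  return Promise.all(
    names.map(async (source): Promise<SourceResult<T>> => {
      try {
        return { source, ok: true, items: await fetchers[source]() }
      } catch (error) {
        return { source, ok: false, error }
      }
    })
  )
}
```

In the agent, each entry would wrap one of the client calls above and map its response to a plain array. The caller can then log the failed sources and continue with whatever succeeded.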

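Caching

A multi-level caching strategy improves performance:

Cache level	TTL	Use case
In-memory cache	5 minutes	Hot news, frequent queries
Distributed cache	1 hour	Regular news content
Persistent storage	24 hours	Historical news archive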
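As a rough illustration of the first tier, here is a minimal in-memory cache with a time-to-live. The TTL values come from the table; the `TtlCache` class and its injectable clock are assumptions made for this sketch, and a real deployment would likely put a store such as Redis behind the second tier.

```typescript
// TTLs taken from the table above, in milliseconds.
const CACHE_TTL_MS = {
  memory: 5 * 60 * 1000,
  distributed: 60 * 60 * 1000,
  persistent: 24 * 60 * 60 * 1000
}

class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number
  private clock: () => number

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, clock: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.clock = clock
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.clock() + this.ttlMs })
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (this.clock() >= entry.expiresAt) {
      this.entries.delete(key) // evict lazily on read
      return undefined
    }
    return entry.value
  }
}
```

Entries are only evicted when they are read, so a long-running process with many distinct queries would also need a size limit or a periodic sweep.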

Error Handling and Fallback

Robust error handling keeps the service stable:

class NewsAggregationService {
  async getNewsWithFallback(query: string) {
    try {
      return await this.fetchFromPrimarySource(query)
    } catch (error) {
      console.warn('Primary source failed, trying fallback:', error)
      return await this.fetchFromSecondarySource(query)
    }
  }
  
  private async fetchFromPrimarySource(query: string) {
    // Perigon is the primary data source
    return perigon.searchStories({ q: query })
  }
  
  private async fetchFromSecondarySource(query: string) {
    // Serper is the fallback data source
    return serper.searchNews(query)
  }
}
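The same idea generalizes to any number of sources. The following is a sketch of an ordered fallback chain, assuming each provider has already been wrapped so that it returns a plain array. `NewsProvider` and `fetchWithFallback` are hypothetical names and are not part of Agentic.

```typescript
interface NewsProvider<T> {
  name: string
  fetch: (query: string) => Promise<T[]>
}

// Try providers in order and return the first successful result together
// with the name of the provider that produced it. Throws only if every
// provider fails, and includes each failure reason in the message.
async function fetchWithFallback<T>(
  providers: NewsProvider<T>[],
  query: string
): Promise<{ provider: string; items: T[] }> {
  const failures: string[] = []
  for (const provider of providers) {
    try {
      return { provider: provider.name, items: await provider.fetch(query) }
    } catch (error) {
      failures.push(`${provider.name}: ${error instanceof Error ? error.message : String(error)}`)
    }
  }
  throw new Error(`All news providers failed (${failures.join('; ')})`)
}
```

Returning the provider name alongside the items lets downstream code know when it is working with fallback data, which matters here because the two sources return differently shaped results.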

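With this multi-source aggregation design, the news summary agent can draw on several authoritative channels, which keeps its information comprehensive, timely, and accurate, while deduplication and priority ranking improve the user experience.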

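Using Perigon and Serper Together

Using Perigon and Serper together is the key to efficient news retrieval and analysis in a news summary agent. Perigon focuses on depth of news content, while Serper focuses on breadth of web search; combined, they give the agent both comprehensive and precise access to information.

Feature Comparison and Complementary Strengths

Perigon and Serper are clearly complementary in what they do. The table below compares their main characteristics: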

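Dimension	Perigon (news API)	Serper (Google search API)
Data source	Professional news media and publications	Public web pages across the internet
Content type	Structured news articles and stories	Varied web content
Search precision	High-precision news filtering	Breadth-first search
Timeliness	Real-time news feed	Near-real-time web content
Degree of structure	Highly structured metadata	Semi-structured search results
Best suited for	In-depth news analysis	Broad information retrieval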

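Collaborative Workflow

Perigon and Serper work together in the following workflow, designed to keep retrieval efficient and accurate:

[Mermaid diagram: Perigon and Serper collaborative workflow]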

Implementation Example

In an actual agent implementation, Perigon and Serper can be integrated as follows:

import { PerigonClient, SerperClient } from '@agentic/stdlib'
import { AIFunctionSet } from '@agentic/core'

class NewsSummaryAgent {
  private perigon: PerigonClient
  private serper: SerperClient
  private functions: AIFunctionSet

  constructor() {
    this.perigon = new PerigonClient()
    this.serper = new SerperClient()
    
    // Build the function set, picking only the functions best suited to news summarization
    this.functions = new AIFunctionSet([
      this.perigon.functions.pick(
        'search_news_stories',
        'search_news_articles'
      ),
      this.serper.functions.pick('serper_google_search')
    ])
  }

  async summarizeNews(topic: string, options: NewsSummaryOptions = {}) {
    const { usePerigon = true, useSerper = true, maxResults = 10 } = options
    
    const results = []
    
    if (usePerigon) {
      // Perigon supplies in-depth news coverage
      const perigonResults = await this.perigon.searchStories({
        q: topic,
        size: Math.floor(maxResults * 0.6), // 60% from professional news sources
        sortBy: 'date',
        from: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString() // last 7 days
      })
      results.push(...perigonResults)
    }
    
    if (useSerper) {
      // Serper supplies broad web coverage
      const serperResults = await this.serper.search({
        q: topic,
        num: Math.floor(maxResults * 0.4), // 40% from web search
        type: 'news'
      })
      results.push(...this.transformSerperResults(serperResults))
    }
    
    return this.analyzeAndSummarize(results, topic)
  }

  private transformSerperResults(serperResults: any): NewsItem[] {
    // Convert Serper results into the unified news item format
    return serperResults.news?.map((item: any) => ({
      title: item.title,
      url: item.link,
      snippet: item.snippet,
      source: item.source,
      publishedDate: item.date,
      type: 'web'
    })) || []
  }

  private async analyzeAndSummarize(items: NewsItem[], topic: string) {
    // Deduplicate, rank, and summarize the collected items
    const uniqueItems = this.deduplicateItems(items)
    const rankedItems = this.rankByRelevance(uniqueItems, topic)
    
    return {
      summary: await this.generateSummary(rankedItems),
      sources: rankedItems.slice(0, 10),
      totalResults: uniqueItems.length,
      generatedAt: new Date().toISOString()
    }
  }
}
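The class above relies on helpers and types that the article does not show (`NewsItem`, `deduplicateItems`, `rankByRelevance`, `generateSummary`). As one possible reading, here is a sketch of `NewsItem` and a simple keyword-based `rankByRelevance`, written as standalone functions. The field list is inferred from `transformSerperResults`, and the scoring rule (title matches count double) is an assumption, not the author's method.

```typescript
// Hypothetical shape, inferred from the fields set in transformSerperResults.
interface NewsItem {
  title: string
  url: string
  snippet?: string
  source?: string
  publishedDate?: string
  type?: string
}

// Count how many topic terms appear in the title (2 points each) and in
// the snippet (1 point each). Matching is by lowercase substring.
function relevanceScore(item: NewsItem, topic: string): number {
  const terms = topic.toLowerCase().split(/\s+/).filter(t => t.length > 0)
  const title = item.title.toLowerCase()
  const snippet = (item.snippet ?? '').toLowerCase()
  return terms.reduce(
    (score, term) => score + (title.includes(term) ? 2 : 0) + (snippet.includes(term) ? 1 : 0),
    0
  )
}

// Sort a copy by descending score, leaving the input array untouched.
function rankByRelevance(items: NewsItem[], topic: string): NewsItem[] {
  return [...items].sort((a, b) => relevanceScore(b, topic) - relevanceScore(a, topic))
}
```

Substring matching is deliberately crude: a short term such as "ai" would also match "rain". In practice the score would probably be combined with the time weight and source credibility described earlier.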

Query Routing Strategy

To get the best search results, queries are routed according to their type:

interface SearchRouter {
  determineSearchStrategy(query: string): SearchStrategy
}

class IntelligentSearchRouter implements SearchRouter {
  private newsKeywords = [
    'news', 'article', 'report', 'coverage', 'update', 
    'breaking', 'latest', 'recent', 'today'
  ]
  
  private perigonDomains = [
    'tech', 'business', 'finance', 
    'health', 'science', 'sports'
  ]

  determineSearchStrategy(query: string): SearchStrategy {
    const lowerQuery = query.toLowerCase()
    
    // Detect news-related queries
    const isNewsQuery = this.newsKeywords.some(keyword => 
      lowerQuery.includes(keyword)
    )
    
    // Detect domain-specific queries
    const isDomainSpecific = this.perigonDomains.some(domain =>
      lowerQuery.includes(domain)
    )
    
    if (isNewsQuery && isDomainSpecific) {
      return { perigon: 0.8, serper: 0.2 } // mostly Perigon
    } else if (isNewsQuery) {
      return { perigon: 0.6, serper: 0.4 } // balanced
    } else {
      return { perigon: 0.3, serper: 0.7 } // mostly Serper
    }
  }
}
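The router returns weights, but the article does not show how they become request sizes. One way to apply them is sketched below, assuming `SearchStrategy` is simply the pair of weights returned above. `allocateResults` is a hypothetical helper. It rounds Perigon's share and gives Serper the remainder, because two independent `Math.floor` calls can drop a result (for example, 7 results at 0.6/0.4 would floor to 4 + 2).

```typescript
// Assumed shape of the SearchStrategy returned by the router.
interface SearchStrategy {
  perigon: number
  serper: number
}

// Convert weights into per-provider result counts. Perigon's share is
// rounded and Serper receives the remainder, so the two counts always add
// up to maxResults. If both weights are 0, everything goes to Serper.
function allocateResults(
  strategy: SearchStrategy,
  maxResults: number
): { perigon: number; serper: number } {
  const total = strategy.perigon + strategy.serper
  const perigon = total > 0 ? Math.round((maxResults * strategy.perigon) / total) : 0
  return { perigon, serper: maxResults - perigon }
}
```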

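Performance Optimization and Caching

To improve response time and reduce API costs, a multi-layer caching strategy is used:

[Mermaid diagram: multi-layer cache lookup flow]

The corresponding cache implementation: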

class NewsCacheManager {
  private memoryCache = new Map<string, CachedNews>()
  // Web Cache API instance (e.g. from caches.open()); CacheStorage itself has no put()
  constructor(private diskCache: Cache) {}
  
  async getCachedNews(query: string, ttl: number = 300000): Promise<NewsItem[] | null> {
    const cacheKey = this.generateCacheKey(query)
    
    // Check the in-memory cache first
    const memoryCached = this.memoryCache.get(cacheKey)
    if (memoryCached && Date.now() - memoryCached.timestamp < ttl) {
      return memoryCached.data
    }
    
    // Then check the disk cache
    const diskCached = await this.diskCache.match(cacheKey)
    if (diskCached) {
      const data = await diskCached.json()
      this.memoryCache.set(cacheKey, {
        data,
        timestamp: Date.now()
      })
      return data
    }
    
    return null
  }
  
  async cacheNews(query: string, data: NewsItem[]): Promise<void> {
    const cacheKey = this.generateCacheKey(query)
    const cacheItem = {
      data,
      timestamp: Date.now()
    }
    
    // Update the in-memory cache
    this.memoryCache.set(cacheKey, cacheItem)
    
    // Update the disk cache
    await this.diskCache.put(cacheKey, new Response(JSON.stringify(data)))
  }
}
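`generateCacheKey` is called above but never defined. The following is a sketch of what it might look like as a standalone function, assuming the disk cache is a Web Cache API `Cache`, which keys entries by request URL and only accepts http(s) URLs in `put`. The `https://news-cache.invalid` origin is a placeholder chosen for this example.

```typescript
// Normalize a query so that trivially different spellings share one cache
// entry, then embed it in a URL, because the Cache API keys entries by
// request URL. The origin below is a placeholder, not a real endpoint.
function generateCacheKey(query: string): string {
  const normalized = query.trim().toLowerCase().replace(/\s+/g, ' ')
  return `https://news-cache.invalid/search?q=${encodeURIComponent(normalized)}`
}

const exampleKey = generateCacheKey('  AI   News ')
// exampleKey === 'https://news-cache.invalid/search?q=ai%20news'
```

Note that the class only applies the TTL to the in-memory tier. Entries read back from the disk cache are returned regardless of age, so a timestamp stored alongside the data would be needed to expire them.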

Error Handling and Fallback Strategy

In a distributed system, robust error handling is essential:

class ResilientNewsSearch {
  async searchWithFallback(query: string, options: SearchOptions) {
    try {
      // Try Perigon first
      return await this.perigon.searchStories({
        q: query,
        size: options.maxResults,
        sortBy: 'relevance'
      })
    } catch (perigonError) {
      console.warn('Perigon search failed, falling back to Serper:', perigonError)
      
      try {
        // Fall back to Serper
        const serperResults = await this.serper.search({
          q: query,
          num: options.maxResults,
          type: 'news'
        })
        return this.transformSerperResults(serperResults)
      } catch (serperError) {
        console.error('All search services failed:', serperError)
        throw new Error('News search is temporarily unavailable')
      }
    }
  }
}

Monitoring and Metrics Collection

To keep the system reliable and support ongoing tuning, search metrics are collected and reported:

class NewsSearchMonitor {
  private metrics = {
    perigonSuccess: 0,
    perigonFailure: 0,
    serperSuccess: 0,
    serperFailure: 0,
    cacheHit: 0,
    cacheMiss: 0,
    responseTimes: [] as number[]
  }

  recordSearchMetrics(
    provider: 'perigon' | 'serper',
    success: boolean,
    responseTime: number
  ) {
    if (provider === 'perigon') {
      success ? this.metrics.perigonSuccess++ : this.metrics.perigonFailure++
    } else {
      success ? this.metrics.serperSuccess++ : this.metrics.serperFailure++
    }
    
    this.metrics.responseTimes.push(responseTime)
    
    // Periodically report metrics to the monitoring system
    if (this.metrics.responseTimes.length >= 100) {
      this.reportMetrics()
    }
  }

  private reportMetrics() {
    const avgResponseTime = this.metrics.responseTimes.reduce(
      (sum, time) => sum + time, 0
    ) / this.metrics.responseTimes.length
    
    console.log('Search performance metrics:', {
      perigonSuccessRate: this.metrics.perigonSuccess /
        (this.metrics.perigonSuccess + this.metrics.perigonFailure || 1),
      serperSuccessRate: this.metrics.serperSuccess /
        (this.metrics.serperSuccess + this.metrics.serperFailure || 1),
      avgResponseTime
    })
  }
}


Author's note: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.