- LegalAgentBench: Evaluating LLM Agents in Legal Domain
- INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent
- MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation
- Context-Aware Sentiment Forecasting via LLM-based Multi-Perspective Role-Playing Agents
- RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation
- Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
- BELLE: A Bi-Level Multi-Agent Reasoning Framework for Multi-Hop Question Answering
- Self-Taught Agentic Long Context Understanding
- OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
- GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
- X-TURING: Towards an Enhanced and Efficient Turing Test for Long-Term Dialogue Agents
- AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection
- In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents
- SCALE: Towards Collaborative Content Analysis in Social Science with Large Language Model Agents and Human Intervention
- KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph
- GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents
- Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model
- SDPO: Segment-Level Direct Preference Optimization for Social Agents
- ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents
- MEDDxAgent: A Unified Modular Agent Framework for Explainable Automatic Differential Diagnosis
- Contextual Experience Replay for Continual Learning of Language Agents
- ACT: Knowledgeable Agents to Design and Perform Complex Tasks
- LLMs Can Simulate Standardized Patients via Agent Coevolution
- ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents
- Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents
- Tunable LLM-based Proactive Recommendation Agent
- nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow
- Substance over Style: Evaluating Proactive Conversational Coaching Agents
- CAMI: A Counselor Agent Supporting Motivational Interviewing through State Inference and Topic Exploration
- Beyond Frameworks: Unpacking Collaboration Strategies in Multi-Agent Systems
- AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration
- SOTOPIA-Ω: Dynamic Strategy Injection Learning and Social Instruction Following Evaluation for Social Agents
- OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction
- Controllable and Reliable Knowledge-Intensive Task Agents with Declarative GenieWorksheets
- GETReason: Enhancing Image Context Extraction through Hierarchical Multi-Agent Reasoning
- R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory
- CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory
- Teaching Text Agents to Learn Sequential Decision Making from Failure
- EducationQ: Evaluating LLMs’ Teaching Capabilities Through Multi-Agent Dialogue Framework
Agent papers at ACL 2025



