Development and Testing of Retrieval Augmented Generation in Large Language Models


This article describes an LLM-RAG pipeline tailored to healthcare, designed to improve the performance of large language models in preoperative medical applications. Using retrieval augmented generation, the model generated answers within 15-20 seconds at an accuracy of 91.4%, comparable to human-generated guidance, while proving efficient, accurate, and safe.

This post is part of an LLM article series and is a translation of "Development and Testing of Retrieval Augmented Generation in Large Language Models: A Case Study Report".

Development and Testing of Retrieval Augmented Generation in Large Language Models: A Case Study Report

Abstract

Purpose: Large language models (LLMs) hold significant promise for medical applications. However, their practical deployment often fails to incorporate current guideline-based knowledge for specific clinical specialties and tasks. Moreover, traditional methods for improving accuracy, such as fine-tuning, pose considerable computational challenges.
Retrieval augmented generation (RAG) is a promising approach for customizing domain knowledge in LLMs and is particularly well suited to the needs of healthcare deployments. This case study presents the development and evaluation of an LLM-RAG pipeline tailored to healthcare, with a particular focus on preoperative medicine. The accuracy and safety of the responses produced by the LLM-RAG system were assessed as the primary endpoints.
Methods: We developed LLM-RAG models using 35 preoperative guidelines and tested them against human-generated responses, evaluating 1,260 responses in total (336 human-generated, 336 LLM-generated, and 588 LLM-RAG-generated).
The RAG process involved converting clinical documents into text using Python-based frameworks such as LangChain and LlamaIndex, then processing these texts into chunks for embedding and retrieval. Vector storage techniques and selected embedding models were used to optimize data retrieval: Pinecone served as the vector store with a dimension of 1,536, and cosine similarity was used as the similarity metric. The LLMs evaluated included GPT-3.5, GPT-4.0, Llama2-7B, and Llama2-13B, along with their LLM-RAG counterparts.
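The chunk-embed-retrieve step described above can be sketched in plain Python. This is a minimal illustration only, not the paper's actual LangChain/LlamaIndex/Pinecone implementation: the chunk size, overlap, and toy vectors are assumed values, and a real pipeline would call an embedding model rather than receive precomputed vectors.

```python
import math

def chunk_text(text, chunk_size=500, overlap=50):
    """Split a document into overlapping character chunks for embedding."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def cosine_similarity(a, b):
    """Cosine similarity, the metric the paper uses for retrieval."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, chunk_vecs, top_k=3):
    """Return the indices of the chunks most similar to the query."""
    ranked = sorted(enumerate(chunk_vecs),
                    key=lambda pair: cosine_similarity(query_vec, pair[1]),
                    reverse=True)
    return [idx for idx, _ in ranked[:top_k]]
```

The retrieved chunks would then be concatenated into the LLM prompt as grounding context.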
We evaluated the system using 14 de-identified clinical scenarios, focusing on six key aspects of preoperative guidance. Response correctness was determined against established guidelines and by expert panel review. Human-generated answers provided by junior doctors served as the comparator. Comparative analyses used Cohen's h and the chi-square test.
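Cohen's h, mentioned above as the effect-size measure for comparing two proportions, can be computed directly from the arcsine transformation. A minimal sketch follows; the example values 0.914 and 0.801 are simply the accuracies quoted elsewhere in this post, used here only to show the calculation.

```python
import math

def cohens_h(p1, p2):
    """Cohen's h: effect size for the difference between two proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Illustrative comparison of two accuracies (91.4% vs. 80.1%)
h = cohens_h(0.914, 0.801)
```

By the usual rule of thumb, |h| around 0.2 is a small effect, 0.5 medium, and 0.8 large.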
Results: The LLM-RAG models generated answers in an average of 15-20 seconds, markedly faster than the roughly 10 minutes typically required by humans. Among the base LLMs, GPT-4.0 achieved the highest accuracy at 80.1%. This accuracy improved further


### Retrieval Augmented Generation in NLP Models and Applications

#### Definition and Concept

Retrieval Augmented Generation (RAG) is a method that integrates retrieval-based and generation-based approaches, particularly useful for tasks requiring the reproduction of specific information. This technique leverages content pertinent to solving an input task as context, enabling more accurate knowledge retrieval, which in turn aids in generating better responses through iterative refinement [^1]. In addition, RAG significantly reduces model hallucination by grounding generated outputs in factual data retrieved from external sources or databases [^2].

#### General Process

The general process of applying RAG within Natural Language Processing (NLP) involves several key steps:

- **Contextual Information Collection**: Gathering relevant documents or pieces of text related to the query.
- **Knowledge Retrieval**: Using these collected contexts to find precise facts or statements needed for response formulation.
- **Response Generation**: Generating coherent answers based on both the original prompt and the retrieved information.

This approach ensures that the final output not only addresses user queries accurately but also remains grounded in verified facts rather than in potentially unreliable internal representations learned during training [^1].

#### Implementation Example

The snippet below sketches how one might implement a simple version of this concept in Python, using the Hugging Face Transformers library for question answering and FAISS for efficient similarity search over large document collections.
```python
from transformers import pipeline
import faiss
import numpy as np

def create_index(documents):
    """Creates an index structure suitable for fast nearest-neighbor searches."""
    embeddings = get_embeddings_for_documents(documents)  # assumed helper
    dimensionality = len(embeddings[0])
    # Initialize a flat FAISS index with the L2 distance metric
    index = faiss.IndexFlatL2(dimensionality)
    # Add vectors into our searchable space
    index.add(np.array(embeddings).astype('float32'))
    return index

def retrieve_relevant_contexts(query_embedding, index, top_k=5):
    """Finds the most similar indexed items for a given query vector."""
    distances, indices = index.search(
        np.array([query_embedding]).astype('float32'), k=top_k)
    # Pair each document id with its distance score
    return list(zip(indices.flatten(), distances.flatten()))

# Assume `get_embeddings_for_documents`, `corpus_of_texts`,
# `user_query_vector`, and `user_input` are defined elsewhere...
index = create_index(corpus_of_texts)
qa_pipeline = pipeline("question-answering")

for doc_id, score in retrieve_relevant_contexts(user_query_vector, index):
    answer = qa_pipeline(question=user_input,
                         context=corpus_of_texts[doc_id]['text'])
    print(f"Answer found with relevance {score}: ", answer['answer'])
```

--related questions--

1. What are some common challenges faced when implementing RAG systems?
2. Can you provide examples where RAG has been successfully applied outside traditional QA settings?
3. How does Graph Retrieval-Augmented Generation differ from the standard RAG methods described here?
4. In what ways can integrating knowledge graphs enhance performance in RAG frameworks?
UnknownBody
