Retrieval Augmentor
RetrievalAugmentor is the entry point into the RAG pipeline. It is responsible for augmenting a ChatMessage with relevant Content retrieved from various sources.
An instance of RetrievalAugmentor can be specified when creating an AiService:
Assistant assistant = AiServices.builder(Assistant.class)
        ...
        .retrievalAugmentor(retrievalAugmentor)
        .build();
Every time the AiService is invoked, the specified RetrievalAugmentor is called to augment the current UserMessage.
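To make the augmentation step concrete, here is a minimal, library-free sketch of the idea: take the user's text, retrieve matching content, and append it to the message before it goes to the model. Everything in it is an assumption for illustration. `AugmentationSketch`, `SimpleRetriever`, `keywordRetriever`, and `augment` are hypothetical stand-ins and not LangChain4j types, the prompt wording is invented, and the sample snippets are made-up text. A naive keyword match replaces the embedding-based retrieval that a real pipeline would normally use.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Conceptual stand-in only: none of these types belong to the LangChain4j API.
public class AugmentationSketch {

    /** Hypothetical stand-in for a retriever: returns the snippets relevant to a query. */
    interface SimpleRetriever {
        List<String> retrieve(String query);
    }

    /** Naive keyword retriever over an in-memory list (real retrievers typically use embeddings). */
    static SimpleRetriever keywordRetriever(List<String> corpus) {
        return query -> {
            String[] words = query.toLowerCase(Locale.ROOT).split("\\W+");
            return corpus.stream()
                    .filter(snippet -> {
                        String lower = snippet.toLowerCase(Locale.ROOT);
                        for (String word : words) {
                            // Skip very short words so "is" or "the" do not match everything.
                            if (word.length() > 3 && lower.contains(word)) {
                                return true;
                            }
                        }
                        return false;
                    })
                    .collect(Collectors.toList());
        };
    }

    /** Appends retrieved snippets to the user's text; returns the text unchanged if nothing matched. */
    static String augment(String userText, SimpleRetriever retriever) {
        List<String> contents = retriever.retrieve(userText);
        if (contents.isEmpty()) {
            return userText;
        }
        return userText
                + "\n\nAnswer using the following information:\n"
                + String.join("\n", contents);
    }

    public static void main(String[] args) {
        // Made-up sample text, used only to exercise the sketch.
        SimpleRetriever retriever = keywordRetriever(List.of(
                "Cancellation is free up to 7 days before the booking starts.",
                "Vehicles must be returned with a full tank."));
        System.out.println(augment("What is the cancellation policy?", retriever));
    }
}
```

In this sketch the question about cancellation is extended with the first snippet only, while a message that matches nothing passes through unchanged. In LangChain4j itself, this responsibility belongs to the RetrievalAugmentor configured on the AiService.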
You can use the default implementation of RetrievalAugmentor that ships with LangChain4j (DefaultRetrievalAugmentor) or write a custom implementation.
Default Retrieval Augmentor
LangChain4j provides an out-of-the-box implementation of the RetrievalAugmentor interface: DefaultRetrievalAugmentor, which should suit most RAG use cases.
Official usage example:
public class _04_Advanced_RAG_with_Metadata_Example {

    /**
     * Please refer to {@link Naive_RAG_Example} for a basic context.
     * <p>
     * Advanced RAG in LangChain4j is described here: https://github.com/langchain4j/langchain4j/pull/538
     * <p>
     * This example illustrates how to include document source and other metadata into the LLM prompt.
     */
    public static void main(String[] args) {

        Assistant assistant = createAssistant("documents/miles-of-smiles-terms-of-use.txt");

        // Ask "What is the name of the file where cancellation policy is defined?".
        // Observe how "file_name" metadata entry was injected into the prompt.
        startConversationWith(assistant);
    }

    private static Assistant createAssistant(String documentPath) {

        Document document = loadDocument(toPath(documentPath), new TextDocumentParser());

        EmbeddingModel embeddingModel = new BgeSmallEnV15QuantizedEmbeddingModel();

        EmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

        EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor.builder()
                .documentSplitter(DocumentSplitters.recursive(300, 0))
                .embeddingModel(embeddingModel)
                .embeddingStore(embeddingStore)
                .build();

        ingestor.ingest(document);

        ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
                .embeddingStore(embeddingStore)
                .embeddingModel(embeddingModel)
                .build();

        // Each retrieved segment should inc