Evaluating RAG Models with RAGAS [personal notes]

fr271

Last modified 2024-10-28 16:00:00

Tags: llama, artificial intelligence, language model

First published 2024-10-28 11:56:21

Article link: https://blog.youkuaiyun.com/qq_61803559/article/details/143292074

1. Answer relevance

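Definition:
Answer relevance measures how relevant the generated answer is to the question that was asked. An answer that is incomplete or contains redundant information receives a lower score.
The metric is computed from the question and the answer together. Scores typically fall between 0 and 1, and a higher score means the answer addresses the question more directly.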

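Example:
Question: What are the main characteristics of a healthy diet?
Low-relevance answer: A healthy diet is very important for overall health.
High-relevance answer: A healthy diet should include a variety of fruits, vegetables, whole grains, lean meat, and dairy products, providing the nutrients needed for optimal health.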

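How RAGAS computes it:
Use an LLM to generate n candidate questions qi from the given answer;
use an embedding model to obtain embeddings for all of the questions;
compute the similarity sim(q,qi) between each generated question qi and the original question q;
take the mean of these n similarities as the answer relevance score.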
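
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RAGAS implementation: `toy_embed` is a bag-of-words stand-in for a real embedding model, the generated questions are hand-written rather than produced by an LLM, and the final averaging of sim(q,qi) follows the RAGAS definition of the metric.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Assumption: stand-in for a real embedding model (bag-of-words counts)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[t] * v[t] for t in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def answer_relevance(question, generated_questions, embed=toy_embed):
    """Mean of sim(q, q_i) over the questions generated from the answer."""
    if not generated_questions:
        raise ValueError("at least one generated question is required")
    q_vec = embed(question)
    sims = [cosine_similarity(q_vec, embed(g)) for g in generated_questions]
    return sum(sims) / len(sims)


# Hypothetical questions an LLM might generate from the answer "Paris."
generated = [
    "What is the capital of France?",
    "Which city is the capital of France?",
    "Where is the Eiffel Tower located?",
]
score = answer_relevance("What is the capital of France?", generated)
print(round(score, 3))
```

With a real embedding model passed in as `embed`, generated questions that paraphrase the original question push the score toward 1, while questions that drift to other topics pull it down.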

Example:

data:

[
    {
        "question": "What is the capital of France?",
        "answer": "Paris.",
        "contexts": [
            "1. France is a developed country.",
            "2. Paris is the capital and largest city of the French Republic, as well as the political, economic, cultural and commercial center of France.",
            "3. The French Republic is referred to as France, the capital of Paris, located in Western Europe.",
            "4. France borders on Germany."
        ],
        "ground_truths": [
            "The capital of France is Paris."
        ]
    }
]

The three questions generated by the model and their results:
[Figure: the three generated questions and their results]