Discriminative Models and Generative Models


Part 1:

Let's say you have input data x and you want to classify the data into labels y. A generative model learns the joint probability distribution p(x,y) and a discriminative model learns the conditional probability distribution p(y|x) - which you should read as 'the probability of y given x'.

Here's a really simple example. Suppose you have the following data in the form (x,y):

       (1,0), (1,0), (2,0), (2, 1)

p(x,y) is

             y=0   y=1
            -----------
       x=1 | 1/2   0
       x=2 | 1/4   1/4

p(y|x) is

             y=0   y=1
            -----------
       x=1 | 1     0
       x=2 | 1/2   1/2

If you take a few minutes to stare at those two matrices, you will understand the difference between the two probability distributions.
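
As a rough sketch, the two tables can be reproduced by counting over the four data points. The helper names `joint_distribution` and `conditional_distribution` are illustrative assumptions and do not come from the original answer; pairs that never occur, such as (1,1), are simply absent from the dictionaries and should be read as probability 0.

```python
from collections import Counter
from fractions import Fraction

# The four (x, y) observations from the example above.
data = [(1, 0), (1, 0), (2, 0), (2, 1)]


def joint_distribution(pairs):
    """Estimate p(x, y) as the relative frequency of each (x, y) pair."""
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: Fraction(n, total) for pair, n in counts.items()}


def conditional_distribution(pairs):
    """Estimate p(y | x) by counting each y among the rows sharing the same x."""
    pair_counts = Counter(pairs)
    x_counts = Counter(x for x, _ in pairs)
    return {(x, y): Fraction(n, x_counts[x]) for (x, y), n in pair_counts.items()}


joint = joint_distribution(data)
cond = conditional_distribution(data)

for x in (1, 2):
    for y in (0, 1):
        # Unobserved pairs fall back to probability 0.
        print(f"x={x} y={y}  p(x,y)={joint.get((x, y), 0)}  p(y|x)={cond.get((x, y), 0)}")
```

Note that each row of the p(y|x) table sums to 1, while the whole p(x,y) table sums to 1.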

The distribution p(y|x) is the natural distribution for classifying a given example x into a class y, which is why algorithms that model this directly are called discriminative algorithms. Generative algorithms model p(x,y), which can be transformed into p(y|x) by applying Bayes' rule and then used for classification. However, the distribution p(x,y) can also be used for other purposes. For example, you could use p(x,y) to generate likely (x,y) pairs.
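
Both uses of p(x,y) described above can be sketched with the toy table: dividing by the marginal p(x) gives p(y|x) for classification, and sampling from the joint gives new (x,y) pairs. This is a minimal illustration under assumptions; the function names `conditional_from_joint`, `classify` and `sample_pairs` are hypothetical, and the code assumes every x in the table has non-zero marginal probability.

```python
import random
from fractions import Fraction

# Joint table p(x, y) from the example, keyed by (x, y).
joint = {
    (1, 0): Fraction(1, 2),
    (1, 1): Fraction(0),
    (2, 0): Fraction(1, 4),
    (2, 1): Fraction(1, 4),
}


def conditional_from_joint(joint):
    """Turn p(x, y) into p(y | x) by dividing by the marginal p(x)."""
    marginal_x = {}
    for (x, _), p in joint.items():
        marginal_x[x] = marginal_x.get(x, Fraction(0)) + p
    # Assumes p(x) > 0 for every x that appears in the table.
    return {(x, y): p / marginal_x[x] for (x, y), p in joint.items()}


def classify(joint, x):
    """Pick the label y with the highest p(y | x); ties go to the first label seen."""
    cond = conditional_from_joint(joint)
    candidates = {y: p for (xi, y), p in cond.items() if xi == x}
    return max(candidates, key=candidates.get)


def sample_pairs(joint, n, seed=0):
    """Draw n (x, y) pairs with probability proportional to p(x, y)."""
    rng = random.Random(seed)
    pairs = list(joint)
    weights = [float(joint[p]) for p in pairs]
    return rng.choices(pairs, weights=weights, k=n)


cond = conditional_from_joint(joint)
label_for_x1 = classify(joint, 1)
samples = sample_pairs(joint, 5)
```

Here `classify(joint, 1)` returns 0 because p(y=0|x=1) = 1, and the pair (1,1) never appears among the samples because its joint probability is 0. For x=2 the two labels are tied at 1/2, so the choice is arbitrary.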

From the description above you might be thinking that generative models are more generally useful and therefore better, but it's not as simple as that. This paper is a very popular reference on the subject of discriminative vs. generative classifiers, but it's pretty heavy going. The overall gist is that discriminative models generally outperform generative models in classification tasks.

Part 2:

A discriminative model, also called a conditional model or conditional probability model, estimates the conditional probability distribution p(class|context).
A generative model estimates the joint probability distribution p(class, context) = p(class|context) * p(context).