scikit-learn (an introduction to the models used relatively often in projects): 1.14. Semi-Supervised

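Label propagation algorithms use unlabeled data to improve a classifier and help it generalize. This article introduces two models, LabelPropagation and LabelSpreading, and discusses how they differ in the way they build the similarity graph and where each one applies.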


Reference: http://scikit-learn.org/stable/modules/label_propagation.html



The semi-supervised estimators in sklearn.semi_supervised are able to make use of this additional unlabeled data to better capture the shape of the underlying data distribution and generalize better to new samples. These algorithms can perform well when we have a very small amount of labeled points and a large amount of unlabeled points.


Unlabeled entries in y: It is important to assign an identifier to unlabeled points along with the labeled data when training the model with the fit method. The identifier that this implementation uses is the integer value -1.
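The -1 convention can be sketched as follows. This is a minimal illustration, not part of the original article: the iris dataset, the 70% hidden fraction, and the variable names are assumptions chosen only to make the example self-contained.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation

X, y = load_iris(return_X_y=True)

# Assumption for illustration: hide roughly 70% of the labels.
# Hidden entries are marked with -1, the "unlabeled" identifier.
rng = np.random.RandomState(42)
y_train = y.copy()
unlabeled_mask = rng.rand(len(y)) < 0.7
y_train[unlabeled_mask] = -1

# Labeled and unlabeled samples are passed to fit together.
model = LabelPropagation()
model.fit(X, y_train)

# transduction_ holds the label inferred for every training sample.
inferred = model.transduction_
accuracy_on_hidden = (inferred[unlabeled_mask] == y[unlabeled_mask]).mean()
print(f"accuracy on the hidden labels: {accuracy_on_hidden:.2f}")
```

Because the unlabeled rows stay in X, the estimator can use their positions when building the graph; only their targets are withheld.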




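Label propagation:

Features:

1) Applicable to classification tasks (older versions of the documentation also list regression, but LabelPropagation and LabelSpreading are both classifiers).

2) Can use kernel methods to project the data into alternate dimensional spaces.

scikit-learn provides two label propagation models: LabelPropagation and LabelSpreading. Both work by constructing a similarity graph over all items in the input dataset.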


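The two models differ in the modifications they make to the graph's similarity matrix and in the clamping effect on the label distributions. Clamping allows the algorithm to change the weight of the ground-truth labeled data to some degree.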


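LabelPropagation performs hard clamping of the input labels. In current scikit-learn releases this corresponds to alpha=0, where alpha is the relative amount of label information a point adopts from its neighbors; older versions of the documentation wrote this case as alpha=1. In current releases only LabelSpreading exposes alpha as a parameter.

The clamping factor can be relaxed, say to alpha=0.2 (alpha=0.8 in the older convention). This means we always retain 80% of the original label distribution, while the algorithm may change its confidence in that distribution by up to 20%.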

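LabelPropagation uses the raw similarity matrix constructed from the data, with no modifications. LabelSpreading, by contrast, minimizes a loss function that has regularization properties, which makes it more robust to noise.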
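The difference between hard and soft clamping can be sketched on a toy dataset with a few deliberately wrong labels. This is an illustrative sketch under stated assumptions: the blob centers, the number of labeled points, the flipped labels, and the two alpha values are all chosen for the example and do not come from the original article. It relies on the alpha semantics of current scikit-learn releases, where LabelSpreading takes alpha in (0, 1) and a larger value lets a point adopt more information from its neighbors.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.semi_supervised import LabelPropagation, LabelSpreading

# Assumption for illustration: two well-separated 2-D clusters.
X, y_true = make_blobs(
    n_samples=200, centers=[[0, 0], [3, 3]], cluster_std=0.5, random_state=0
)

# Label the first 10 points of each class; everything else is -1 (unlabeled).
y_train = np.full_like(y_true, -1)
labeled_idx = np.concatenate(
    [np.flatnonzero(y_true == c)[:10] for c in (0, 1)]
)
y_train[labeled_idx] = y_true[labeled_idx]

# Simulate label noise: flip the first labeled point of each class.
noisy_idx = np.array([labeled_idx[0], labeled_idx[10]])
y_train[noisy_idx] = 1 - y_true[noisy_idx]

# Hard clamping: the given labels are restored after every iteration.
hard = LabelPropagation(kernel="rbf", gamma=20).fit(X, y_train)

# Soft clamping: given labels may be overridden by the neighborhood.
soft_low = LabelSpreading(kernel="rbf", gamma=20, alpha=0.2).fit(X, y_train)
soft_high = LabelSpreading(kernel="rbf", gamma=20, alpha=0.8).fit(X, y_train)

for name, model in [
    ("LabelPropagation", hard),
    ("LabelSpreading alpha=0.2", soft_low),
    ("LabelSpreading alpha=0.8", soft_high),
]:
    changed = (model.transduction_[labeled_idx] != y_train[labeled_idx]).sum()
    print(f"{name}: {changed} of {len(labeled_idx)} given labels changed")
```

With hard clamping the count of changed labels is always zero, so the two noisy labels are kept as given. Whether LabelSpreading overrides them depends on alpha and on the data, which is why the sketch prints the counts rather than assuming them.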


The label propagation models have two built-in kernel methods. The choice of kernel affects both the scalability and the performance of the algorithms:


The RBF kernel will produce a fully connected graph which is represented in memory by a dense matrix. This matrix may be very large and combined with the cost of performing a full matrix multiplication calculation for each iteration of the algorithm can lead to prohibitively long running times. On the other hand, the KNN kernel will produce a much more memory-friendly sparse matrix which can drastically reduce running times.
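The storage difference between the two kernels can be sketched as follows. In scikit-learn the kernel is chosen with kernel="rbf" (weights exp(-gamma * |x - y|^2), tuned with gamma) or kernel="knn" (a 0/1 indicator of k-nearest-neighbor membership, tuned with n_neighbors). The standalone graphs built here with rbf_kernel and kneighbors_graph are stand-ins used only to compare storage cost; they are not claimed to be identical to the estimators' internal matrices. The make_moons dataset and the five labeled points per class are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.neighbors import kneighbors_graph
from sklearn.semi_supervised import LabelSpreading

X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)

# Keep 5 labels per class; mark the rest as unlabeled (-1).
y_train = np.full_like(y_true, -1)
for c in (0, 1):
    y_train[np.flatnonzero(y_true == c)[:5]] = c

# Stand-in graphs, built only to compare how many entries are stored.
W_rbf = rbf_kernel(X, X, gamma=20)  # dense n x n array
W_knn = kneighbors_graph(X, n_neighbors=7, mode="connectivity")  # sparse
print("rbf stored entries:", W_rbf.size)
print("knn stored entries:", W_knn.nnz)

# The same choice is made on the estimator through the kernel parameter.
rbf_model = LabelSpreading(kernel="rbf", gamma=20).fit(X, y_train)
knn_model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_train)

for name, model in (("rbf", rbf_model), ("knn", knn_model)):
    acc = (model.transduction_ == y_true).mean()
    print(f"{name} kernel: transductive accuracy {acc:.2f}")
```

The dense matrix grows quadratically with the number of samples, while the k-nearest-neighbor graph stores only k entries per sample, which is the reason the KNN kernel scales better on large datasets.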



Examples


Reposted from: https://www.cnblogs.com/jzssuanfa/p/7290550.html
