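Clustering models have the following characteristics.
How a column is flagged determines its role:
1. PREDICT: the column is used for training and can also be selected as a prediction target.
2. INPUT: the column is used for training only and cannot be selected as a prediction target.
3. PREDICT_ONLY: the column is ignored during training and is available only as a prediction target.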
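As a rough sketch of how these flags appear in DMX, the statement below assumes a mining structure like the [Clustering Method] structure defined in the example further down. The model name [Clustering_Usage_Sketch] and the choice of which columns to flag are hypothetical, picked only for illustration. A column listed with no flag acts as an input; INPUT itself is not written out.

```dmx
alter mining structure [Clustering Method]
add mining model [Clustering_Usage_Sketch]   // hypothetical model name
(
    [Customer Key],                  // key column of the structure
    [Age],                           // no flag: input only
    [Yearly Income],                 // no flag: input only
    [Commute Distance] predict,      // input and prediction target
    [Bike Buyer] predict_only        // prediction target only, not used to build the clusters
)
using microsoft_clustering
```

When the column list is left out entirely, as in the example later in this article, every column of the structure is treated as an input.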
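There are four algorithms, selected through the CLUSTERING_METHOD parameter:
1. Scalable EM
2. Non-scalable EM
3. Scalable K-Means
4. Non-scalable K-Means
The following example compares the four algorithms.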
// Mining structure holding the columns shared by all four models
create mining structure [Clustering Method]
(
    [Age] long discretized(automatic,10),
    [Bike Buyer] long discrete,
    [Commute Distance] text discrete,
    [Customer Key] long key,
    [Education] text discrete,
    [Gender] text discrete,
    [House Owner Flag] text discrete,
    [Marital Status] text discrete,
    [Number Cars Owned] long discrete,
    [Number Children At Home] long discrete,
    [Occupation] text discrete,
    [Region] text discrete,
    [Total Children] long discrete,
    [Yearly Income] double continuous
)
// Scalable EM
alter mining structure [Clustering Method]
add mining model [Clutering_SEM]
using microsoft_clustering
(CLUSTERING_METHOD = 1)
// Non-scalable EM
alter mining structure [Clustering Method]
add mining model [Clutering_NSEM]
using mic
SQL Server Data Mining: A Comparative Analysis of Clustering Algorithms

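This article examines the clustering algorithms in SQL Server data mining, namely Scalable EM, Non-scalable EM, Scalable K-Means, and Non-scalable K-Means, and compares their performance in a worked example. In the test results, the Non-scalable EM algorithm performed best on the Case Likelihood metric.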
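One possible way to inspect case likelihood for a single case is the DMX PredictCaseLikelihood function, sketched below. This assumes the mining structure has been trained and the [Clutering_NSEM] model processed. The input values (35, 'M', 2) are invented for illustration, and the query is not necessarily how the article's own figures were produced.

```dmx
// Singleton query: score one made-up case against a trained clustering model
select
    Cluster() as [Cluster Name],                   // most likely cluster for the case
    PredictCaseLikelihood() as [Case Likelihood]   // how well the model fits the case
from [Clutering_NSEM]
natural prediction join
(
    select 35 as [Age],                // illustrative value
           'M' as [Gender],            // illustrative value, assumes this coding exists in the data
           2 as [Number Cars Owned]    // illustrative value
) as t
```

Running the same query against each of the four models, with the same input case, gives a simple side-by-side view of how well each one fits that case.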