LabeledPoint(y[i], X[i])

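This post introduces the LabeledPoint concept in Apache Spark and how to use it, including how to create a labeled point with a positive label and a dense feature vector, and one with a negative label and a sparse feature vector.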

Copied from http://spark.apache.org/docs/latest/mllib-data-types.html

A labeled point is a local vector, either dense or sparse, associated with a label. In PySpark MLlib it is represented by LabeledPoint.

Refer to the LabeledPoint Python docs for more details on the API.

from pyspark.mllib.linalg import SparseVector
from pyspark.mllib.regression import LabeledPoint

# Create a labeled point with a positive label and a dense feature vector.
pos = LabeledPoint(1.0, [1.0, 0.0, 3.0])

# Create a labeled point with a negative label and a sparse feature vector.
neg = LabeledPoint(0.0, SparseVector(3, [0, 2], [1.0, 3.0]))