Spark ML GBDT in Scala: plain-text training set
Reading the data
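import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf().setAppName("gbdt_ms").setMaster("local[*]")
val spark = SparkSession.builder().config(conf).enableHiveSupport().getOrCreate()
// Read the raw data: each line holds three space-separated numbers (f0, f1, label)
val parsedRDD = spark.read.textFile("D:\\gbdt\\testSet.txt").rdd.map(_.split(" ")).map { eachRow =>
  val a = eachRow.map(_.toDouble)
  (a(0), a(1), a(2))
}
// Name the columns and cache the DataFrame, since training reads it repeatedly
val df = spark.createDataFrame(parsedRDD).toDF("f0", "f1", "label").cache()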
Detailed steps:
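The training steps for a DataFrame shaped like `df` could look like the sketch below. This is not the author's code. It assumes that `label` holds binary 0/1 values (Spark's `GBTClassifier` only supports binary classification); for a continuous target, `GBTRegressor` would be the closer fit. The object name `GbtSketch`, the helper `trainGbt`, the 80/20 split, the seed, and `maxIter = 10` are all illustrative choices.

```scala
import org.apache.spark.ml.classification.{GBTClassificationModel, GBTClassifier}
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.DataFrame

// Hypothetical helper, not part of the original post.
object GbtSketch {
  // Expects numeric columns "f0", "f1" and a binary 0/1 column "label" (assumption).
  def trainGbt(df: DataFrame): (GBTClassificationModel, Double) = {
    // Spark ML estimators expect all features packed into a single vector column.
    val assembled = new VectorAssembler()
      .setInputCols(Array("f0", "f1"))
      .setOutputCol("features")
      .transform(df)

    // Hold out 20% of the rows for evaluation (ratio and seed are arbitrary).
    val Array(train, test) = assembled.randomSplit(Array(0.8, 0.2), 42L)

    val gbt = new GBTClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setMaxIter(10) // number of boosting iterations; illustrative value

    val model = gbt.fit(train)

    // Accuracy on the held-out rows.
    val accuracy = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
      .evaluate(model.transform(test))

    (model, accuracy)
  }
}
```

With the DataFrame built above, `val (model, accuracy) = GbtSketch.trainGbt(df)` would return the fitted model together with its hold-out accuracy.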
Spark ML GBDT in Scala: LIBSVM training set
Reading the data
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val sparkConf = new SparkConf().setAppName("gbdt").setMaster("local[*]")
val spark = SparkSession.builder().config(sparkConf).enableHiveSupport().getOrCreate()
// Load the LIBSVM-format file into a DataFrame with "label" and "features" columns
val data = spark.read.format("libsvm").load("D:\\gbdt\\sample_libsvm_data.txt")
Detailed steps:
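For LIBSVM input, the training steps might be sketched as follows. The structure loosely follows the gradient-boosted tree classifier example in the Spark ML documentation rather than the author's own code. The object name `LibsvmGbtSketch`, the helper `train`, the 70/30 split, the seed, `maxCategories = 4`, and `maxIter = 10` are assumptions made for illustration.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.classification.GBTClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.{StringIndexer, VectorIndexer}
import org.apache.spark.sql.DataFrame

// Hypothetical helper, not part of the original post.
object LibsvmGbtSketch {
  // Expects the "label" and "features" columns produced by the libsvm reader.
  def train(data: DataFrame): (PipelineModel, Double) = {
    // Map raw labels to indices 0, 1; fitted on the full dataset so every label is known.
    val labelIndexer = new StringIndexer()
      .setInputCol("label")
      .setOutputCol("indexedLabel")
      .fit(data)

    // Treat features with at most 4 distinct values as categorical (illustrative threshold).
    val featureIndexer = new VectorIndexer()
      .setInputCol("features")
      .setOutputCol("indexedFeatures")
      .setMaxCategories(4)
      .fit(data)

    // Hold out 30% of the rows for testing (ratio and seed are arbitrary).
    val Array(trainingData, testData) = data.randomSplit(Array(0.7, 0.3), 42L)

    val gbt = new GBTClassifier()
      .setLabelCol("indexedLabel")
      .setFeaturesCol("indexedFeatures")
      .setMaxIter(10) // number of boosting iterations; illustrative value

    // Chain the indexers and the classifier so they run as one unit.
    val model = new Pipeline()
      .setStages(Array(labelIndexer, featureIndexer, gbt))
      .fit(trainingData)

    val accuracy = new MulticlassClassificationEvaluator()
      .setLabelCol("indexedLabel")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
      .evaluate(model.transform(testData))

    (model, 1.0 - accuracy) // second element is the test error
  }
}
```

Calling `val (model, testError) = LibsvmGbtSketch.train(data)` on the DataFrame loaded above would yield the fitted pipeline and its error rate on the held-out rows.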
Download link for the code and the training sets:
https://download.youkuaiyun.com/download/qq_37267359/12527659