Method 1
With a SparkSession in scope, a Seq of tuples can be converted to a DataFrame with toDF (this requires import spark.implicits._). The example below stores the values as String, but that is not a Spark requirement: Spark infers numeric column types directly from, e.g., a Seq of (Double, Int) tuples.
import spark.implicits._

val data = Seq(
  ("0.1", "0"),
  ("0.15", "0"),
  ("0.8", "1"),
  ("1.0", "1")
).toDF("predict", "label")
+-------+-----+
|predict|label|
+-------+-----+
| 0.1| 0|
| 0.15| 0|
| 0.8| 1|
| 1.0| 1|
+-------+-----+
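For comparison, a minimal sketch (assuming the same SparkSession and import spark.implicits._ as above) showing that numeric columns can be created directly, without going through String:

```scala
import spark.implicits._

// Column types are inferred from the tuple element types:
// "predict" becomes DoubleType, "label" becomes IntegerType.
val typed = Seq(
  (0.1, 0),
  (0.15, 0),
  (0.8, 1),
  (1.0, 1)
).toDF("predict", "label")
typed.printSchema()
```

This avoids having to cast the columns back to numeric types later, e.g. before computing metrics on predict.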
Method 2
spark.createDataFrame accepts a collection of tuples directly; column names are then assigned with toDF:
val data = Array(("1", "2", "3", "4", "5"), ("6", "7", "8", "9", "10"))
val df = spark.createDataFrame(data).toDF("col1", "col2", "col3", "col4", "col5")
+----+----+----+----+----+
|col1|col2|col3|col4|col5|
+----+----+----+----+----+
| 1| 2| 3| 4| 5|
| 6| 7| 8| 9| 10|
+----+----+----+----+----+
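When the column types should be controlled explicitly rather than inferred from tuples, createDataFrame also accepts an RDD of Row objects plus a StructType schema. A hedged sketch of that variant (assuming the same SparkSession named spark):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructField, StructType, StringType}

// Explicit schema: every column declared as a nullable StringType.
val schema = StructType(Seq(
  StructField("col1", StringType, nullable = true),
  StructField("col2", StringType, nullable = true)
))

// Rows must match the schema positionally.
val rows = Seq(Row("1", "2"), Row("6", "7"))
val rdd = spark.sparkContext.parallelize(rows)
val df = spark.createDataFrame(rdd, schema)
df.show()
```

This form is more verbose, but it is the one to reach for when the data arrives as untyped rows or when nullability and exact types matter.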
Method 3: build the rows from existing data (an extension of Method 2)
import scala.collection.mutable.ArrayBuffer

val b = Array("a", "b", "c")
val c = Array("1", "10", "100")
val result = ArrayBuffer[(String, String)]()
// Iterate over all indices; the original `0 until b.size - 1` silently dropped the last element.
for (j <- b.indices) {
  result += ((b(j), c(j)))
}
val df = spark.createDataFrame(result.toArray).toDF("name", "age")
df.show()
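The index loop above can be written more idiomatically with zip, which pairs the two arrays element by element with no mutable buffer and no index arithmetic (the final step still assumes a SparkSession named spark, as in the methods above):

```scala
// zip pairs elements positionally and stops at the shorter array,
// so mismatched lengths cannot throw an IndexOutOfBoundsException.
val names = Array("a", "b", "c")
val ages  = Array("1", "10", "100")
val pairs = names.zip(ages) // Array(("a","1"), ("b","10"), ("c","100"))

// Assumes a SparkSession `spark` is in scope.
val df = spark.createDataFrame(pairs).toDF("name", "age")
df.show()
```

Because zip truncates to the shorter input, check the array lengths beforehand if dropping trailing elements would be a silent bug in your pipeline.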