import java.sql.{Connection, DriverManager}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.collection.mutable

val username = "root"
val password = "huangchao"
val driver = "com.mysql.jdbc.Driver"
val url = "jdbc:mysql://slave2:3306/travel"
var connection: Connection = null
val sparkConf = new SparkConf().setAppName("HBaseReadTest").setMaster("local[2]")
val sc = new SparkContext(sparkConf)
val localPairs: RDD[(String, String)] = sc.parallelize(Seq(("rk1002", "11111"), ("a", "11111")))
// Load the MySQL JDBC driver and open a connection
Class.forName(driver)
connection = DriverManager.getConnection(url, username, password)
val statement = connection.createStatement()
val result = statement.executeQuery("select row,name from student")
// Collect the query results into a mutable Map (name -> row)
val mysqlMap: mutable.Map[String, String] = mutable.Map[String, String]()
while (result.next()) {
  val name: String = result.getString("name")
  val row: String = result.getString("row")
  mysqlMap.put(name, row)
}
// Convert the Map into an RDD
val mysqlPairs: RDD[(String, String)] = sc.parallelize(mysqlMap.toList)
// intersection keeps only pairs that match exactly in both RDDs
val common: RDD[(String, String)] = mysqlPairs.intersection(localPairs)
common.foreach(println)
// mysqlMap.foreach(print)
connection.close()
sc.stop()
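Note that the listing above closes the connection only when every preceding call succeeds. A minimal sketch of the same read wrapped in try/finally (reusing the url, username, password, and mysqlMap values defined above) keeps the connection from leaking if the query throws:

// Sketch: release JDBC resources even when the query fails
var conn: Connection = null
try {
  conn = DriverManager.getConnection(url, username, password)
  val stmt = conn.createStatement()
  val rs = stmt.executeQuery("select row,name from student")
  while (rs.next()) {
    mysqlMap.put(rs.getString("name"), rs.getString("row"))
  }
} finally {
  if (conn != null) conn.close()  // always runs, even if executeQuery throws
}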
This article has shown how to use Apache Spark to read data from a MySQL database and intersect it with an RDD: open a database connection, run a SQL query, collect the result into a Map, convert the Map into an RDD, and finally intersect it with a predefined RDD.
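For larger tables, the manual Statement/ResultSet loop can also be replaced by Spark's built-in JDBC data source, which reads the table directly into a DataFrame. The sketch below assumes the same url, username, password, and student table as above; the application name is illustrative:

import org.apache.spark.sql.SparkSession

// Hypothetical app name; master matches the listing above
val spark = SparkSession.builder()
  .appName("MysqlJdbcReadSketch")
  .master("local[2]")
  .getOrCreate()

// Read the student table through the DataFrame JDBC source
val studentDf = spark.read
  .format("jdbc")
  .option("url", url)
  .option("dbtable", "student")
  .option("user", username)
  .option("password", password)
  .load()

// Project the same (name, row) pairs and drop down to an RDD for intersection
val mysqlPairs = studentDf.select("name", "row")
  .rdd
  .map(r => (r.getString(0), r.getString(1)))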