Spark SQL can create a DataFrame by reading data from a relational database over JDBC. After a series of computations on the DataFrame, the results can be written back to the relational database.
Loading data from MySQL (Spark Shell)
1. Start the Spark shell; the MySQL connector JAR must be specified
/usr/local/spark-1.6.1/bin/spark-shell \
--master spark://minimaster:7077 \
--jars /usr/local/spark-1.6.1/mysql-connector-java-5.1.6-bin.jar \
--driver-class-path /usr/local/spark-1.6.1/mysql-connector-java-5.1.6-bin.jar
2. Load the data from MySQL
val jdbcDF = sqlContext.read.format("jdbc").options(Map(
  "url" -> "jdbc:mysql://192.168.222.156:3306/bigdata",
  "driver" -> "com.mysql.jdbc.Driver",
  "dbtable" -> "location_info",
  "user" -> "root",
  "password" -> "root")).load()
3. Run a query
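The original notes do not show the query itself. As a minimal sketch, assuming the `jdbcDF` loaded in step 2, a query in the spark-shell could look like this (no assumptions are made about the columns of `location_info`):

```scala
// Print the schema Spark SQL inferred from the MySQL table
jdbcDF.printSchema()
// Display the first 20 rows of location_info
jdbcDF.show()
// Count the rows (triggers a job against the JDBC source)
println(jdbcDF.count())
```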
Writing data to MySQL
Write the Spark SQL program
package myRPC.qf.itcast.RDD

import java.util.Properties
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
import org.apache.spark.{SparkConf, SparkContext}

object InsertDataToMySQL {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("InsertDataToMySQL")//.setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // Read the input data and split each line on spaces
    val linesRDD: RDD[Array[String]] = sc.textFile("hdfs://minimaster:9000/person2.txt").map(_.split(" "))
    // StructType defines the schema (table structure)
    val schema: StructType = StructType(
      List(
        StructField("name", StringType, false),
        StructField("age", IntegerType, false)
      )
    )
    // Map linesRDD (the data) to Rows matching the schema's column order
    val personRDD: RDD[Row] = linesRDD.map(p => Row(p(0), p(1).toInt))
    // Build the DataFrame
    val personDF: DataFrame = sqlContext.createDataFrame(personRDD, schema)
    // Properties object holding the MySQL connection settings
    val prop = new Properties()
    prop.put("user", "root")
    prop.put("password", "root")
    prop.put("driver", "com.mysql.jdbc.Driver")
    // Write the data to MySQL
    personDF.write.mode("append").jdbc("jdbc:mysql://minimaster/bigdata", "person", prop)
    sc.stop()
  }
}
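The `mode("append")` call controls what happens when the target table already exists. A short sketch of the alternatives accepted by the Spark 1.6 DataFrameWriter (reusing `personDF` and `prop` from the program above):

```scala
// "append"    - add rows to the existing table
// "overwrite" - drop and recreate the table before writing
// "error"     - the default: fail if the table already exists
// "ignore"    - silently skip the write if the table exists
personDF.write.mode("overwrite").jdbc("jdbc:mysql://minimaster/bigdata", "person", prop)
```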
Build the JAR with Maven
Submit the JAR to the Spark cluster (HDFS must be running)
/usr/local/spark-1.6.1/bin/spark-submit \
--class myRPC.qf.itcast.RDD.InsertDataToMySQL \
--master spark://minimaster:7077 \
--jars /usr/local/spark-1.6.1/mysql-connector-java-5.1.6-bin.jar \
--driver-class-path /usr/local/spark-1.6.1/mysql-connector-java-5.1.6-bin.jar \
/home/wc.jar
Result:
The written data can be inspected with a third-party tool such as Navicat.
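Alternatively, the written rows can be verified from the spark-shell session itself, reading the `person` table back the same way `location_info` was loaded earlier (a sketch, assuming the same root/root credentials used in the program):

```scala
val checkDF = sqlContext.read.format("jdbc").options(Map(
  "url" -> "jdbc:mysql://minimaster:3306/bigdata",
  "driver" -> "com.mysql.jdbc.Driver",
  "dbtable" -> "person",
  "user" -> "root",
  "password" -> "root")).load()
// Show the rows that InsertDataToMySQL appended
checkDF.show()
```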