Reading and Writing PostgreSQL with Spark

Latest recommended article published 2025-01-13 22:01:42

独孤尚亮dugushangliang

Category: Spark  Tags: Spark, postgresql

Article link: https://blog.youkuaiyun.com/weixin_40450867/article/details/102613275

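Notes on reading from and writing to PostgreSQL with Spark.
Reading and writing MySQL works the same way, although a few details (such as the JDBC URL and the driver class) need to be changed.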
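
As a rough illustration of which details change for MySQL, here is a minimal sketch. The helper `jdbcOptions` is hypothetical (it is not part of Spark), the ports are the two databases' defaults, and the MySQL driver class shown is the one shipped with Connector/J 8 and later; older connectors use a different class name.

```scala
// Hypothetical helper: returns the JDBC options that differ between the two databases.
// Assumes default ports (5432 for PostgreSQL, 3306 for MySQL).
def jdbcOptions(db: String, host: String, database: String): Map[String, String] = db match {
  case "postgresql" => Map(
    "url"    -> s"jdbc:postgresql://$host:5432/$database",
    "driver" -> "org.postgresql.Driver")
  case "mysql" => Map(
    "url"    -> s"jdbc:mysql://$host:3306/$database",
    "driver" -> "com.mysql.cj.jdbc.Driver") // Connector/J 8+; an assumption about the installed driver
  case other =>
    throw new IllegalArgumentException(s"unsupported database: $other")
}

val pgOpts = jdbcOptions("postgresql", "127.0.0.1", "geodb")
val myOpts = jdbcOptions("mysql", "127.0.0.1", "geodb")

println(pgOpts("url"))
println(myOpts("url"))
```

Such a map can be passed to `spark.read.format("jdbc").options(...)` together with the `dbtable`, `user` and `password` options; the matching driver jar still has to be on Spark's classpath.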

1 Two ways to connect to the database

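Method 1 uses Spark's own reader options; method 2 combines Spark with a java.util.Properties object.
Either way, the result of a read is a DataFrame.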

Read method 1

val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://127.0.0.1:5432/geodb")
  .option("dbtable", "shuihu")
  .option("user", "postgres")
  .option("password", "postgres")
  .load()
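
The `dbtable` option accepts anything that is valid in a SQL FROM clause, so a parenthesised subquery with an alias can be used to let PostgreSQL do the filtering. A minimal sketch, assuming the `shuihu` table has `id` and `name` columns (the column names are taken from the write example and may not match the real table); `asDbTable` is a hypothetical helper, and the `local[*]` master is only for a local trial run.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper: wrap a SELECT so it can be passed as the "dbtable" option.
// PostgreSQL requires an alias for a subquery in FROM.
def asDbTable(sql: String, alias: String): String = s"($sql) AS $alias"

val spark = SparkSession.builder()
  .appName("pg-read-subquery")
  .master("local[*]") // assumption: local run; inside spark-shell the existing session is reused
  .getOrCreate()

// Assumption: shuihu has columns "id" and "name".
val filteredDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://127.0.0.1:5432/geodb")
  .option("dbtable", asDbTable("SELECT id, name FROM shuihu WHERE id <= 36", "t"))
  .option("user", "postgres")
  .option("password", "postgres")
  .load()

filteredDF.show(5)
```

Only the rows selected by the subquery are transferred to Spark, which can matter for large tables.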

Read method 2

import java.util.Properties

val prop = new Properties()
prop.setProperty("user", "postgres") // user name
prop.setProperty("password", "postgres") // password
prop.setProperty("driver", "org.postgresql.Driver") // JDBC driver class
// read the table into a DataFrame
val df2 = spark.read.jdbc("jdbc:postgresql://127.0.0.1:5432/geodb", "shuihu3", prop)
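
The `jdbc` reader also has an overload that splits the read into several parallel queries over a numeric column. A minimal sketch, assuming `shuihu3` has a numeric `id` column with values roughly between 1 and 108 (both the column and the range are assumptions, not something the original states):

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("pg-partitioned-read")
  .master("local[*]") // assumption: local run; inside spark-shell the existing session is reused
  .getOrCreate()

val prop = new Properties()
prop.setProperty("user", "postgres")
prop.setProperty("password", "postgres")
prop.setProperty("driver", "org.postgresql.Driver")

// Assumption: shuihu3 has a numeric "id" column with values roughly in 1..108.
val partitionedDF = spark.read.jdbc(
  "jdbc:postgresql://127.0.0.1:5432/geodb",
  "shuihu3",
  "id",  // columnName: the column used to split the read
  1L,    // lowerBound
  108L,  // upperBound
  4,     // numPartitions: number of parallel queries
  prop
)

println(partitionedDF.rdd.getNumPartitions)
```

The bounds only decide how the `id` range is cut into partitions; rows outside the bounds are still read, they simply end up in the first or last partition.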

Write method 1

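In append mode, rows are appended to the table if it already exists; if the table does not exist, it is created automatically.

The column types given in createTableColumnTypes only take effect when the table is newly created; if the table already exists, its existing definition is used.
When creating a new table, it is best to specify the length of each column.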

shuihuDF.write
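  .mode("append")   // without append, the default mode (errorifexists) fails because the table already exists; a table that does not exist yet is created and filled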
  .format("jdbc")
  .option("createTableColumnTypes", "id int,xingzuo CHAR(64), chuohao char(64),name char(64)")
  .option("url", "jdbc:postgresql://127.0.0.1:5432/geodb")
  .option("dbtable", "shuihu2")
  .option("user", "postgres")
  .option("password", "postgres")
  .save()
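
Besides `append`, Spark's writer supports the save modes `overwrite`, `ignore` and `errorifexists` (the default). As a minimal sketch of replacing the table contents, the example below combines `overwrite` with the JDBC `truncate` option so that the existing table is emptied rather than dropped and recreated. The one-row `shuihuDF` is a hypothetical stand-in for the original DataFrame, whose contents the article does not show.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("pg-overwrite-truncate")
  .master("local[*]") // assumption: local run; inside spark-shell the existing session is reused
  .getOrCreate()
import spark.implicits._

// Hypothetical sample row standing in for the real shuihuDF.
val shuihuDF = Seq((1, "Tiankui", "Hubaoyi", "Song Jiang"))
  .toDF("id", "xingzuo", "chuohao", "name")

val writeOptions = Map(
  "url"      -> "jdbc:postgresql://127.0.0.1:5432/geodb",
  "dbtable"  -> "shuihu2",
  "user"     -> "postgres",
  "password" -> "postgres",
  "truncate" -> "true" // with overwrite: TRUNCATE the table instead of DROP + CREATE
)

shuihuDF.write
  .mode(SaveMode.Overwrite)
  .format("jdbc")
  .options(writeOptions)
  .save()
```

With `truncate` the existing column definitions are preserved, so the DataFrame's columns have to fit the table that is already there.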

Write method 2

This also requires import java.util.Properties

val prop = new Properties()
prop.setProperty("user", "postgres") // user name
prop.setProperty("password", "postgres") // password
prop.setProperty("driver", "org.postgresql.Driver") // JDBC driver class
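
The listing above only sets up the connection properties. A minimal sketch of what the write itself could look like with `DataFrameWriter.jdbc`, assuming append mode and a target table named `shuihu3` (both are assumptions; the original does not show them). The one-row `shuihuDF` is again a hypothetical stand-in.

```scala
import java.util.Properties
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("pg-write-properties")
  .master("local[*]") // assumption: local run; inside spark-shell the existing session is reused
  .getOrCreate()
import spark.implicits._

// Hypothetical sample row standing in for the real shuihuDF.
val shuihuDF = Seq((1, "Tiankui", "Hubaoyi", "Song Jiang"))
  .toDF("id", "xingzuo", "chuohao", "name")

val prop = new Properties()
prop.setProperty("user", "postgres")
prop.setProperty("password", "postgres")
prop.setProperty("driver", "org.postgresql.Driver")

// Assumption: append into a table called "shuihu3"; it is created if it does not exist.
shuihuDF.write
  .mode(SaveMode.Append)
  .jdbc("jdbc:postgresql://127.0.0.1:5432/geodb", "shuihu3", prop)
```

This mirrors read method 2: the URL and table name are passed as arguments, and everything else travels in the Properties object.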
