Interacting with Hive
Spark SQL can use either the embedded Hive (the Hive that ships with Spark, usable out of the box) or an external Hive. In enterprise development, an external Hive is typically used.
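What makes the Hive "external" is that Spark reads the cluster's own hive-site.xml and talks to a shared metastore instead of a local Derby one. As a minimal sketch (the host name, database name, and credentials below are assumptions for illustration, not values taken from this cluster), a hive-site.xml for a MySQL-backed metastore might contain:
<configuration>
    <!-- JDBC connection to the MySQL database holding the metastore;
         host, database, and credentials here are assumed example values -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://hadoop102:3306/metastore?useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
</configuration>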
3.3.1 Interaction on Linux
1) Add the MySQL JDBC driver to the jars directory of spark-yarn
[atguigu@hadoop102 spark-yarn]$ cp /opt/software/mysql-connector-java-5.1.27-bin.jar /opt/module/spark-yarn/jars
2) Add the hive-site.xml file to the conf directory of spark-yarn
[atguigu@hadoop102 spark-yarn]$ cp /opt/module/hive/conf/hive-site.xml /opt/module/spark-yarn/conf
3) Start the spark-sql client
[atguigu@hadoop102 spark-yarn]$ bin/spark-sql --master yarn
spark-sql (default)> show tables;
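The spark-sql client also accepts the usual Hive CLI options, so a statement can be run non-interactively with -e; the query below is just an example:
[atguigu@hadoop102 spark-yarn]$ bin/spark-sql --master yarn -e "show databases;"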
3.3.2 Interaction in IDEA
1) Add the dependencies
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>3.3.1</version>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.27</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.12</artifactId>
        <version>3.3.1</version>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.18.22</version>
    </dependency>
</dependencies>
2) Copy hive-site.xml to the resources directory (if you also need to access Hadoop, copy hdfs-site.xml, core-site.xml, and yarn-site.xml as well). The metastore location can also be supplied in code instead of via hive-site.xml, as sketched below.
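A minimal sketch of the in-code alternative, assuming a Hive metastore service is running at thrift://hadoop102:9083 (an address made up for this example; the class name is also hypothetical):
package com.atguigu.sparksql;

import org.apache.spark.sql.SparkSession;

public class Test10_HiveNoXml {
    public static void main(String[] args) {
        System.setProperty("HADOOP_USER_NAME", "atguigu");

        SparkSession spark = SparkSession.builder()
                .enableHiveSupport()
                // Points Spark at the external metastore without hive-site.xml;
                // the thrift URI is an assumed address for this cluster
                .config("hive.metastore.uris", "thrift://hadoop102:9083")
                .master("local[*]")
                .appName("sparksql")
                .getOrCreate();

        spark.sql("show tables").show();
        spark.close();
    }
}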
3) Code implementation
package com.atguigu.sparksql;

import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

public class Test10_Hive {
    public static void main(String[] args) {
        // Run as the Hadoop user that owns the warehouse directory
        System.setProperty("HADOOP_USER_NAME", "atguigu");

        // 1. Create the configuration object
        SparkConf conf = new SparkConf().setAppName("sparksql").setMaster("local[*]");

        // 2. Obtain the SparkSession
        SparkSession spark = SparkSession.builder()
                .enableHiveSupport() // enable Hive support
                .config(conf)
                .getOrCreate();

        // 3. Create a Hive table, insert a row, and query it
        // ("if not exists" so the demo can be rerun safely)
        spark.sql("show tables").show();
        spark.sql("create table if not exists user_info(name string, age bigint)");
        spark.sql("insert into table user_info values('zhangsan', 10)");
        spark.sql("select * from user_info").show();

        // 4. Close the SparkSession
        spark.close();
    }
}
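Hive tables can also be written through the DataFrame API rather than raw SQL. The sketch below (the class name Test11_HiveWrite and the table name user_info_backup are made up for illustration; it assumes the user_info table created above already exists) saves a query result back into Hive with saveAsTable:
package com.atguigu.sparksql;

import org.apache.spark.SparkConf;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class Test11_HiveWrite {
    public static void main(String[] args) {
        System.setProperty("HADOOP_USER_NAME", "atguigu");

        SparkConf conf = new SparkConf().setAppName("sparksql").setMaster("local[*]");
        SparkSession spark = SparkSession.builder()
                .enableHiveSupport()
                .config(conf)
                .getOrCreate();

        // Query the table created in the previous example...
        Dataset<Row> df = spark.sql("select * from user_info where age >= 10");

        // ...and persist the result as a new managed Hive table
        df.write().mode(SaveMode.Overwrite).saveAsTable("user_info_backup");

        spark.close();
    }
}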