I recently finished integrating SequoiaDB with Spark; the steps are recorded here for future reference.
SequoiaDB version: 1.12
Spark version: 1.3.1
The steps to connect SequoiaDB to Spark are as follows:
1. Configure hive-site.xml (optional)
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///ocsdev/hadoop/apache-hive-1.1.0-bin/lib/hive-sequoiadb-apache.jar,file:///ocsdev/hadoop/apache-hive-1.1.0-bin/lib/sequoiadb.jar</value>
  <description>SequoiaDB storage handler jar files</description>
</property>
2. Configure SPARK_CLASSPATH
export SPARK_CLASSPATH=/path/to/spark/lib/sequoiadb-driver-1.12.jar:/path/to/spark/lib/spark-sequoiadb_2.10-1.12.jar:/ocsdev/hadoop/spark-1.3.1-bin-hadoop2.6/lib:/path/to/spark/lib/mysql-connector-java-5.1.5-bin.jar
3. Copy sequoiadb-driver-1.12.jar and spark-sequoiadb_2.10-1.12.jar into Spark's lib directory;
4. Configure the metastore storage in hive-site.xml
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.0.103:3306/hive?characterEncoding=UTF-8</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
5. Create the mapping to the SequoiaDB collection;
./spark-sql
>CREATE TABLE lw_test_sdb (id int, r5 double) USING com.sequoiadb.spark OPTIONS (host '192.168.0.103:11810,192.168.0.104:11810,192.168.0.102:11810', collectionspace 'hj', collection 'aws_min', username '', password '');
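The same mapping can also be registered from spark-shell instead of the spark-sql CLI. A minimal Scala sketch, assuming the jars from step 3 are on SPARK_CLASSPATH and reusing the schema and connection options from the statement above (a temporary table lives only for the current session and does not touch the metastore):

// spark-shell provides sc (the SparkContext); build a SQLContext on top of it
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// Register the SequoiaDB collection hj.aws_min as a temporary table
sqlContext.sql(
  """CREATE TEMPORARY TABLE lw_test_sdb (id int, r5 double)
    |USING com.sequoiadb.spark
    |OPTIONS (
    |  host '192.168.0.103:11810,192.168.0.104:11810,192.168.0.102:11810',
    |  collectionspace 'hj',
    |  collection 'aws_min',
    |  username '',
    |  password ''
    |)""".stripMargin)

// Pull a few rows back to verify the mapping works
sqlContext.sql("SELECT * FROM lw_test_sdb").show()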
6. Query the data
>select * from lw_test_sdb;
NULL 0.0
23 23.4
12 34.5
Time taken: 0.825 seconds, Fetched 3 row(s)
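Because step 4 points spark-sql at a MySQL-backed metastore, the table created in step 5 is persistent and should also be reachable from spark-shell, provided Spark was built with Hive support and hive-site.xml is on its classpath. A short Scala sketch under those assumptions:

// HiveContext reads table definitions from the same metastore configured in step 4
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

// Query the mapped SequoiaDB collection through the metastore table
hiveContext.sql("SELECT id, r5 FROM lw_test_sdb WHERE r5 > 20").show()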