1,添加elasticsearch-hadoop-hive-2.1.2.jar到Hive。Hive添加第三方包,查看:http://blog.youkuaiyun.com/qianshangding0708/article/details/50381966
2,在hive中建立Elasticsearch外表:
@Test
public void testESTable() {
try {
HiveHelper
.excuteNonQuery("CREATE EXTERNAL TABLE es_user(id String ,name String ,age int ,create_date String) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'es_hive/user_{create_date}','es.index.auto.create' = 'true','es.nodes' = '10.0.1.75:9200,10.0.1.76:9200,10.0.1.77:9200')");
} catch (Exception e) {
e.printStackTrace();
}
}
为了让SQL语句看的清晰点,再贴一次SQL语句:
CREATE EXTERNAL TABLE es_user (
id String,
NAME String,
age INT,
create_date String
) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES (
'es.resource' = 'es_hive/user_{create_date}',
'es.index.auto.create' = 'true',
'es.nodes' = '10.0.1.75:9200,10.0.1.76:9200,10.0.1.77:9200'
);
es.index.auto.create:设置是否自动创建索引
es.nodes:Elasticsearch集群地址。
3,创建Hive表
@Test
public void testHiveTable() {
try {
HiveHelper
.excuteNonQuery("CREATE TABLE IF NOT EXISTS hive_user(id String ,name String ,age int) PARTITIONED BY (create_date String) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE");
} catch (Exception e) {
e.printStackTrace();
}
}
SQL语句:
CREATE TABLE IF NOT EXISTS hive_user(
id String ,
name String ,
age int
) PARTITIONED BY (create_date String)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
4,上传数据文件,并将文件导入到hive表(hive_user)
数据原文:kkk.txt
1,fish1,1
2,fish2,2
3,fish3,3
4,fish4,4
5,fish5,5
6,fish6,6
7,fish7,7
8,fish8,8
9,fish9,9把kkk.txt上传到HDFS的/fish/hive/目录
@Test
public void testLoadHiveTable() {
try {
HiveHelper
.excuteNonQuery("LOAD DATA INPATH '/fish/hive/kkk.txt' INTO TABLE hive_user PARTITION(CREATE_DATE='2015-12-22')");
} catch (Exception e) {
e.printStackTrace();
}
}Load到Hive表。
查看hive_user表:
hive> select * from hive_user;
OK
1 fish1 1 2015-12-22
2 fish2 2 2015-12-22
3 fish3 3 2015-12-22
4 fish4 4 2015-12-22
5 fish5 5 2015-12-22
6 fish6 6 2015-12-22
7 fish7 7 2015-12-22
8 fish8 8 2015-12-22
9 fish9 9 2015-12-22
Time taken: 0.041 seconds, Fetched: 9 row(s)
OK,数据已经Load到Hive。
5,将Hive表的数据插入到Elasticsearch
@Test
public void testInsertElasticSearch() {
try {
HiveHelper
.excuteNonQuery("INSERT OVERWRITE TABLE es_user SELECT s.id, s.name, s.age, s.create_date FROM hive_user s where s.create_date='2015-12-22'");
} catch (Exception e) {
e.printStackTrace();
}
}查看Elasticsearch:
数据已经成功上传到Elasticsearch。
更多细节,参考:https://www.elastic.co/guide/en/elasticsearch/hadoop/current/hive.html

本文介绍了如何将Hive数据导入到Elasticsearch。首先,需要将elasticsearch-hadoop-hive-2.1.2.jar添加到Hive的类路径中,然后在Hive中创建Elasticsearch外表,通过设置es.index.auto.create和es.nodes属性。接着创建Hive表并上传数据文件,最后使用SQL语句将Hive表的数据插入到Elasticsearch。
5178

被折叠的 条评论
为什么被折叠?



