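To link Hive and HBase, the package that lets the two frameworks interoperate (hive-hbase-handler) has to be available. My environment is CDH 6.3, so I copied the jar into the HBase lib directory: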
cp /opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler.jar /opt/cloudera/parcels/CDH/lib/hbase/lib
Hive create-table statement:
create table hive_hbase_student(
id int,
name string,
address string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:address")
TBLPROPERTIES("hbase.table.name"="hbase_student");
Insert a few rows in HBase:
hbase(main):016:0> put 'hbase_student','1001','info:name','nanxiuzi'
hbase(main):011:0> put 'hbase_student','1001','info:address','jiangxi ganzhou'
hbase(main):017:0> put 'hbase_student','1002','info:name','dongdongdong'
hbase(main):014:0> put 'hbase_student','1002','info:address','jiangxi fuzhou'
Check the data in HBase:
hbase(main):018:0> scan 'hbase_student'
ROW COLUMN+CELL
1001 column=info:address, timestamp=1617777531243, value=jiangxi ganzhou
1001 column=info:name, timestamp=1617777900745, value=nanxiuzi
1002 column=info:address, timestamp=1617777564477, value=jiangxi fuzhou
1002 column=info:name, timestamp=1617777911978, value=dongdongdong
2 row(s)
Took 0.0160 seconds
Check the data in Hive:
hive> select * from hive_hbase_student;
21/04/07 14:47:20 INFO mapreduce.TableInputFormatBase: Input split length: 0 bytes.
1001 nanxiuzi jiangxi ganzhou
1002 dongdongdong jiangxi fuzhou
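If the table already exists in HBase, the Hive table must be created with the EXTERNAL keyword; otherwise Hive reports an error saying the HBase table already exists.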
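As a minimal sketch of the EXTERNAL case, assuming the HBase table hbase_student with column family info already exists, the statement could look like the one below. The Hive table name hive_hbase_student_ext is a hypothetical name chosen for illustration; the mapping and properties are the same as in the create-table statement above.

```sql
-- Sketch only: maps a Hive table onto an HBase table that already exists.
-- hive_hbase_student_ext is a hypothetical name; hbase_student must already exist in HBase.
CREATE EXTERNAL TABLE hive_hbase_student_ext(
  id int,
  name string,
  address string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:address")
TBLPROPERTIES ("hbase.table.name" = "hbase_student");
```

Because the table is external, dropping it in Hive removes only the Hive metadata and leaves the HBase table and its data in place.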
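When inserting data through HBase, pay attention to type matching: HBase stores everything as bytes, but the Hive columns are typed, so the values written must be readable as the declared Hive types.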
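One way to spot a type mismatch from the Hive side is sketched below. It assumes a row was written in HBase with a row key that is not an integer (for example a key such as 'abc', which is a made-up value). Since the row key is mapped to the int column id, my understanding is that Hive returns NULL for a value it cannot parse, but treat that behavior as an assumption to verify on your own cluster.

```sql
-- Sketch only: look for rows whose HBase row key could not be read as an int.
-- Assumption: unparseable values are returned as NULL rather than raising an error.
SELECT id, name, address
FROM hive_hbase_student
WHERE id IS NULL;
```

If this query returns rows, the corresponding row keys in HBase do not match the int type declared for id in Hive.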