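Hive and HBase Data Synchronization
Approach 1: Map a Hive table onto an HBase table
Use case: moderate data volumes, below 4 TB (data is imported through the HBase API)
1. When the HBase table already exists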
Create the HBase table (the namespace shanchuan must already exist)
create 'shanchuan:user_info','base_info','extra_info'
Insert data into the HBase table
hbase(main):012:0> put 'shanchuan:user_info','JD000','base_info:user_name','Mary'
0 row(s) in 0.0770 seconds
hbase(main):013:0> put 'shanchuan:user_info','JD000','base_info:gender','F'
0 row(s) in 0.0040 seconds
hbase(main):014:0> put 'shanchuan:user_info','JD000','base_info:age','18'
0 row(s) in 0.0090 seconds
hbase(main):015:0> put 'shanchuan:user_info','JD000','extra_info:province','JiangSu'
0 row(s) in 0.0270 seconds
hbase(main):016:0> put 'shanchuan:user_info','JD000','extra_info:city','SuZhou'
0 row(s) in 0.0040 seconds
hbase(main):017:0> put 'shanchuan:user_info','JD001','base_info:user_name','Bob'
0 row(s) in 0.0050 seconds
hbase(main):018:0> put 'shanchuan:user_info','JD001','base_info:gender','M'
0 row(s) in 0.0040 seconds
hbase(main):019:0> put 'shanchuan:user_info','JD001','base_info:age','20'
0 row(s) in 0.0090 seconds
hbase(main):020:0> put 'shanchuan:user_info','JD001','extra_info:province','HuBei'
0 row(s) in 0.0050 seconds
hbase(main):021:0> put 'shanchuan:user_info','JD002','base_info:user_name','LiLi'
0 row(s) in 0.0040 seconds
hbase(main):022:0> put 'shanchuan:user_info','JD002','base_info:gender','F'
0 row(s) in 0.0030 seconds
hbase(main):023:0> put 'shanchuan:user_info','JD002','extra_info:province','BeiJing'
hbase(main):024:0> scan 'shanchuan:user_info'
ROW      COLUMN+CELL
 JD000   column=base_info:age, timestamp=1611200658140, value=18
 JD000   column=base_info:gender, timestamp=1611200642667, value=F
 JD000   column=base_info:user_name, timestamp=1611200605871, value=Mary
 JD000   column=extra_info:city, timestamp=1611200706237, value=SuZhou
 JD000   column=extra_info:province, timestamp=1611200683040, value=JiangSu
 JD001   column=base_info:age, timestamp=1611200791216, value=20
 JD001   column=base_info:gender, timestamp=1611200780452, value=M
 JD001   column=base_info:user_name, timestamp=1611200766427, value=Bob
 JD001   column=extra_info:province, timestamp=1611200815898, value=HuBei
 JD002   column=base_info:gender, timestamp=1611200889454, value=F
 JD002   column=base_info:user_name, timestamp=1611200875337, value=LiLi
 JD002   column=extra_info:province, timestamp=1611200927801, value=BeiJing
3 row(s) in 0.0400 seconds
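The scan prints one cell per line, whereas a Hive table mapped onto HBase presents one row per row key, with NULL wherever a cell is absent. The Python sketch below illustrates that pivot on the sample data. It is only an illustration, not how Hive implements it: the COLUMN_MAPPING dictionary is an assumption inferred from the column names, the pivot helper is hypothetical, and all values are kept as strings.

```python
from collections import defaultdict

# Cells as (row key, "family:qualifier", value), copied from the scan output.
cells = [
    ("JD000", "base_info:age", "18"),
    ("JD000", "base_info:gender", "F"),
    ("JD000", "base_info:user_name", "Mary"),
    ("JD000", "extra_info:city", "SuZhou"),
    ("JD000", "extra_info:province", "JiangSu"),
    ("JD001", "base_info:age", "20"),
    ("JD001", "base_info:gender", "M"),
    ("JD001", "base_info:user_name", "Bob"),
    ("JD001", "extra_info:province", "HuBei"),
    ("JD002", "base_info:gender", "F"),
    ("JD002", "base_info:user_name", "LiLi"),
    ("JD002", "extra_info:province", "BeiJing"),
]

# Assumed HBase column -> Hive column mapping (hypothetical, inferred from names).
COLUMN_MAPPING = {
    "base_info:user_name": "user_name",
    "base_info:gender": "user_gender",
    "base_info:age": "user_age",
    "extra_info:province": "add_province",
    "extra_info:city": "add_city",
}


def pivot(cells, mapping):
    """Group cells by row key; return one dict per row, None for missing cells."""
    by_row = defaultdict(dict)
    for row_key, column, value in cells:
        if column in mapping:
            by_row[row_key][mapping[column]] = value
    rows = []
    for row_key in sorted(by_row):
        row = {"user_id": row_key}  # the row key becomes the id column
        for hive_col in mapping.values():
            row[hive_col] = by_row[row_key].get(hive_col)
        rows.append(row)
    return rows


rows = pivot(cells, COLUMN_MAPPING)
for row in rows:
    print(row)
```

In this sample, JD001 has no city and JD002 has neither an age nor a city, so those fields come back as None, which corresponds to NULL on the Hive side.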
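Create the Hive table
Note: because the HBase table already exists, the Hive table must be created as an external table.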
create external table hive_hbase_user_info(
user_id string comment 'user ID',
user_name string comment 'user name',
user_gender string comment 'gender',
user_age int comment 'age',
add_province string comment 'province',
add_city string comment 'city'
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandle