flume.conf
a1.sinks.hbase-sink1.channel = ch1
a1.sinks.hbase-sink1.type = hbase
a1.sinks.hbase-sink1.table = users
a1.sinks.hbase-sink1.columnFamily = info
a1.sinks.hbase-sink1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a1.sinks.hbase-sink1.serializer.regex = ^(.+)\t(.+)\t(.+)$
a1.sinks.hbase-sink1.serializer.colNames = ROW_KEY,name,email
a1.sinks.hbase-sink1.serializer.rowKeyIndex = 0
a1.sinks.hbase-sink1.serializer.depositHeaders = true
Note
A: To take the row key from the event body, set rowKeyIndex=0 and start colNames with ROW_KEY (colNames=ROW_KEY,...). The row key must then be the first field of the body in the JSON data you post.
B: To also store the headers of the posted JSON event in HBase, set depositHeaders=true.
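The two notes above can be illustrated with a small Python sketch. It is not Flume's code, only an approximation of how the regex groups, colNames, rowKeyIndex and depositHeaders fit together, assuming the values from the sink configuration. The helper name to_row is hypothetical, and the order in which body and header values are merged is an assumption of the sketch.

```python
import re

# Values copied from the sink configuration above.
REGEX = re.compile(r"^(.+)\t(.+)\t(.+)$")
COL_NAMES = ["ROW_KEY", "name", "email"]
ROW_KEY_INDEX = 0


def to_row(body, headers, deposit_headers=True):
    """Hypothetical helper: map one event to (row_key, cells)."""
    match = REGEX.match(body)
    if match is None:
        # In this sketch, a body that does not fit the pattern yields no row.
        return None
    groups = match.groups()
    row_key = groups[ROW_KEY_INDEX]
    # Every group except the row key becomes a cell named after its column.
    cells = {
        name: value
        for index, (name, value) in enumerate(zip(COL_NAMES, groups))
        if index != ROW_KEY_INDEX
    }
    if deposit_headers:
        # Headers become cells too; a header sharing a column name
        # replaces the value taken from the body (assumed order).
        cells.update(headers)
    return row_key, cells


row_key, cells = to_row(
    "9\tZhangZiYi\tzy@163.com",
    {"userId": "9", "name": "ZhangZiYi", "phoneNumber": "1522323222"},
)
print(row_key, cells)
```

With depositHeaders left off (deposit_headers=False here), only the name and email cells from the body would remain.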
a1.sources.http-source1.channels = ch1
a1.sources.http-source1.type = http
a1.sources.http-source1.bind = 0.0.0.0
a1.sources.http-source1.port = 5140
a1.sources.http-source1.handler = org.apache.flume.source.http.JSONHandler
a1.channels = ch1
a1.sources = http-source1
a1.sinks = hbase-sink1
HBase
# hbase shell
> create 'users', 'info'
Post JSON with curl
curl -i -H 'content-type: application/json' -X POST \
-d '[{"headers":{"userId":"9","name":"ZhangZiYi","phoneNumber":"1522323222"},
"body":"9\tZhangZiYi\tzy@163.com"}]' http://192.168.10.204:5140
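The same request can be sketched in Python with only the standard library, which may be convenient when events are generated by a program. The function names build_payload and post_events are hypothetical; the payload shape (a JSON array of objects with "headers" and "body") follows the curl example.

```python
import json
import urllib.request


def build_payload(headers, body):
    """Serialize one event as the JSON array used in the curl example."""
    return json.dumps([{"headers": headers, "body": body}]).encode("utf-8")


def post_events(url, payload):
    """Send the payload with an HTTP POST and return the status code."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# json.dumps writes the tab characters as \t, matching the curl body.
payload = build_payload(
    {"userId": "9", "name": "ZhangZiYi", "phoneNumber": "1522323222"},
    "9\tZhangZiYi\tzy@163.com",
)
```

Calling post_events("http://192.168.10.204:5140", payload) would then send the event to the HTTP source, assuming the agent is running at that address.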
HBase result
> scan 'users'

Note: the name column gets its value from both the headers and the body of the JSON event, so one write simply overwrites the other with the same content. If you want to keep both, use different column names so that the values are saved in separate cells.
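A rough way to picture this note, in plain Python rather than HBase: cells in one row are keyed by column name, so two values under the same name collapse into one cell, while distinct names keep both. The column name body_name is hypothetical and would correspond to a setting such as colNames=ROW_KEY,body_name,email.

```python
header_cells = {"userId": "9", "name": "ZhangZiYi", "phoneNumber": "1522323222"}


def merge(body_cells, header_cells):
    """Combine body and header cells; equal column names collapse into one."""
    merged = dict(body_cells)
    merged.update(header_cells)
    return merged


# Same column name in body and headers: one "name" cell survives.
same_name = merge({"name": "ZhangZiYi", "email": "zy@163.com"}, header_cells)

# Hypothetical distinct column name for the body value: both are kept.
distinct = merge({"body_name": "ZhangZiYi", "email": "zy@163.com"}, header_cells)

print(sorted(same_name))
print(sorted(distinct))
```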
References
http://flume.apache.org/FlumeUserGuide.html#hbasesinks
http://thunderheadxpler.blogspot.jp/2013/09/bigdata-apache-flume-hdfs-and-hbase.html



