Required components:
hbase, hbase-value-store-indexer
The commonly used commands and configuration changes are as follows:
solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr instancedir \
--generate /opt/cdhsolr/myConfig
solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr instancedir \
--create xxxEventCollection /opt/cdhsolr/myConfig
solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr collection \
--create xxxEventCollection -s 3 -r 2 -m 6
solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr collection --list
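The four solrctl steps above repeat the same ZooKeeper string, so a small wrapper can keep it in one place. The following is only a sketch: the variable names, the run helper, and the DRY_RUN switch are hypothetical conveniences and not part of solrctl. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: the same solrctl steps as above, with shared values in variables.
ZK="xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr"  # ensemble from the commands above
CONF_DIR="/opt/cdhsolr/myConfig"
COLLECTION="xxxEventCollection"

# DRY_RUN is a hypothetical switch: when it is 1 (the default here),
# commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_collection() {
  # Generate a config template, upload it, create the collection, then list.
  run solrctl --zk "$ZK" instancedir --generate "$CONF_DIR"
  run solrctl --zk "$ZK" instancedir --create "$COLLECTION" "$CONF_DIR"
  run solrctl --zk "$ZK" collection --create "$COLLECTION" -s 3 -r 2 -m 6
  run solrctl --zk "$ZK" collection --list
}

setup_collection
```

Setting DRY_RUN=0 in the environment makes the script execute the commands for real; review the printed output first.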
hbase-indexer add-indexer \
--name xxxEventIndexer \
--indexer-conf /opt/cdhsolr/myConfig/conf/morphline-hbase-mapper-xxxEventCollection.xml \
--connection-param solr.zk=xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr \
--connection-param solr.collection=xxxEventCollection \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181
hbase-indexer list-indexers \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181
hbase-indexer delete-indexer --name xxxEventIndexer \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181
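The add-indexer command takes an indexer definition file through --indexer-conf, but these notes do not show its contents. As a rough sketch, such a file usually names the HBase table, the morphline mapper class, and the morphline to run. The table name xxxEvent, the morphlineFile value, and OUT_DIR below are assumptions for illustration; the morphline id is the one defined in the Morphlines file in these notes.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal indexer definition for `hbase-indexer add-indexer`.
# OUT_DIR, the table name "xxxEvent", and the morphlineFile path are
# placeholders, not values taken from the notes.
OUT_DIR="${OUT_DIR:-.}"
CONF="$OUT_DIR/morphline-hbase-mapper-xxxEventCollection.xml"

cat > "$CONF" <<'EOF'
<?xml version="1.0"?>
<indexer table="xxxEvent"
         mapper="com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper">
  <!-- Location of the Morphlines file (placeholder path) -->
  <param name="morphlineFile" value="morphlines.conf"/>
  <!-- Must match an id declared in the Morphlines file -->
  <param name="morphlineId" value="xxxEventMorphline"/>
</indexer>
EOF
echo "wrote $CONF"
```

With OUT_DIR=/opt/cdhsolr/myConfig/conf the file lands at the path used in the add-indexer command above.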
<field name="userId" type="string" />
<field name="roleId" type="string" />
<field name="channelKey" type="string" />
<field name="serverId" type="string" />
<field name="logName" type="string" />
hbase-indexer delete-indexer --name xxxEventIndexer \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181
solrctl collection --deletedocs xxxEventCollection
solrctl collection --delete xxxEventCollection
solrctl instancedir --delete xxxEventCollection
<field name="userId" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="roleId" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="channelKey" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
<field name="serverId" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="logName" type="string" indexed="true" stored="true" multiValued="false"/>
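The Morphlines file is configured as follows:
Note: in the configuration below, each bracket must be on its own line; otherwise the file will not be parsed.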
SOLR_LOCATOR : {
  # Name of solr collection
  #collection : xxxLogCollection
  # ZooKeeper ensemble
  zkHost : "$ZK_HOST"
}
morphlines : [
  {
    id : xxxEventMorphline
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]
    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "data:logName"
              outputField : "logName"
              type : string
              source : value
            }
            {
              inputColumn : "data:userId"
              outputField : "userId"
              type : string
              source : value
            }
            {
              inputColumn : "data:roleId"
              outputField : "roleId"
              type : string
              source : value
            }
            {
              inputColumn : "data:channelKey"
              outputField : "channelKey"
              type : string
              source : value
            }
            {
              inputColumn : "data:serverId"
              outputField : "serverId"
              type : string
              source : value
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  },
  {
    id : xxxActionMorphline
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]
    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "cf1:user_id"
              outputField : "user_id"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:create_time"
              outputField : "create_time"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:log_name"
              outputField : "log_name"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:rn"
              outputField : "rn"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:p_log_name"
              outputField : "p_log_name"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:tl"
              outputField : "tl"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:dt"
              outputField : "dt"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:channel_key"
              outputField : "channel_key"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:server_id"
              outputField : "server_id"
              type : string
              source : value
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
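Each outputField in a morphline needs a matching field in the schema.xml of the target collection, and a mismatch is easy to miss by eye. The helper below is a hypothetical sanity check, not part of solrctl or hbase-indexer: it does a plain text match on outputField lines and on field tags, so it assumes one outputField per line and the `<field name="..."` attribute order used in these notes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every outputField name found in a Morphlines
# file that has no <field name="..."> entry in the given schema.xml.
missing_fields() {
  local schema="$1" morphline="$2" name
  # Extract the quoted value from each `outputField : "..."` line.
  sed -n 's/^[[:space:]]*outputField[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$morphline" |
    sort -u |
    while read -r name; do
      # Fixed-string match against the schema's field declarations.
      grep -qF "<field name=\"$name\"" "$schema" || echo "$name"
    done
}

# Example call (paths are placeholders):
#   missing_fields /opt/cdhsolr/myConfig/conf/schema.xml morphlines.conf
```

Because the file above holds two morphlines, running the check against the xxxEventCollection schema will also report the fields of xxxActionMorphline; that is expected if those fields belong to a different collection.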