Building a secondary index for HBase with Solr

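This article describes how to combine HBase and Solr for efficient data retrieval and processing. It covers configuring a Solr instance directory, creating a Solr collection, registering an HBase indexer, and writing the Morphline configuration file.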


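Required components:

HBase, Solr, and the HBase Key-Value Store Indexer (the service behind the hbase-indexer command)

The commonly used commands and configuration changes are as follows: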

solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr instancedir \
--generate /opt/cdhsolr/myConfig

solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr instancedir \
--create xxxEventCollection /opt/cdhsolr/myConfig

solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr collection \
--create xxxEventCollection  -s 3 -r 2 -m 6

solrctl --zk xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr collection --list

hbase-indexer add-indexer \
--name xxxEventIndexer \
--indexer-conf /opt/cdhsolr/myConfig/conf/morphline-hbase-mapper-xxxEventCollection.xml \
--connection-param solr.zk=xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181/solr \
--connection-param solr.collection=xxxEventCollection \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181
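
The add-indexer command above points at an indexer configuration file that is not shown in these notes. As a rough sketch, assuming the usual Lily HBase Indexer layout, such a file names the HBase table, the morphline mapper class, the morphline file, and the morphline id. The table name xxxEvent and the file name morphlines.conf are hypothetical placeholders; the script writes to a temporary directory unless OUT_DIR is set.

```shell
#!/usr/bin/env bash
# Hypothetical values: the HBase table name and morphline file name are not given in the text.
TABLE_NAME="xxxEvent"
MORPHLINE_FILE="morphlines.conf"
MORPHLINE_ID="xxxEventMorphline"

# Write to a scratch directory by default; set OUT_DIR to target a real config directory.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
OUT_FILE="${OUT_DIR}/morphline-hbase-mapper-xxxEventCollection.xml"

cat > "${OUT_FILE}" <<EOF
<?xml version="1.0"?>
<indexer table="${TABLE_NAME}"
         mapper="com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper">
  <!-- Morphline file to load -->
  <param name="morphlineFile" value="${MORPHLINE_FILE}"/>
  <!-- Selects one entry of the morphlines array by its id -->
  <param name="morphlineId" value="${MORPHLINE_ID}"/>
</indexer>
EOF

echo "wrote ${OUT_FILE}"
```

The morphlineId value has to match an id in the Morphlines file, so a second indexer for xxxActionMorphline would need its own copy of this file.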

hbase-indexer list-indexers \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181

hbase-indexer delete-indexer --name xxxEventIndexer \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181
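
The commands above repeat the same ZooKeeper connection string, with one difference that is easy to miss: solrctl and solr.zk use the ensemble with the /solr suffix, while --zookeeper for hbase-indexer uses the bare ensemble. A small sketch that builds both strings once is shown here; the variable names are illustrative and the script only prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Host names and port taken from the commands above; replace with your own ensemble.
ZK_NODES="xxx.slave1 xxx.slave2 xxx.slave3"
ZK_PORT=2181

# Join the nodes as host:port,host:port,...
ZK_ENSEMBLE=""
for node in ${ZK_NODES}; do
  ZK_ENSEMBLE="${ZK_ENSEMBLE:+${ZK_ENSEMBLE},}${node}:${ZK_PORT}"
done

# solrctl and solr.zk take the ensemble plus the /solr path;
# hbase-indexer --zookeeper takes the bare ensemble.
SOLR_ZK="${ZK_ENSEMBLE}/solr"

# Print the commands rather than executing them.
echo "solrctl --zk ${SOLR_ZK} collection --list"
echo "hbase-indexer list-indexers --zookeeper ${ZK_ENSEMBLE}"
```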

 
Fields to define in the Solr schema (for example conf/schema.xml under the instance directory):

    <field name="userId" type="string" />
    <field name="roleId" type="string" />
    <field name="channelKey" type="string" />
    <field name="serverId" type="string" />
    <field name="logName" type="string" />

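To remove everything (for example before rebuilding), delete the indexer first, then the documents, the collection, and the instance directory:

hbase-indexer delete-indexer --name xxxEventIndexer \
--zookeeper xxx.slave1:2181,xxx.slave2:2181,xxx.slave3:2181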

solrctl collection --deletedocs xxxEventCollection

solrctl collection --delete xxxEventCollection

solrctl instancedir --delete xxxEventCollection


The same fields with explicit index and storage attributes:

    <field name="userId" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
    <field name="roleId" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
    <field name="channelKey" type="string" indexed="true" stored="true" required="false" multiValued="false"/>
    <field name="serverId" type="string" indexed="true" stored="true" multiValued="false"/>
    <field name="logName" type="string" indexed="true" stored="true" multiValued="false"/>
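
Once these fields are indexed, a lookup by one of them can go to Solr first and then back to HBase with the returned document id, which typically holds the HBase row key. The sketch below only builds the select URL. The Solr host, the port 8983 (Solr's default), and the userId value 10001 are assumptions for illustration, and values containing special characters would need URL encoding.

```shell
#!/usr/bin/env bash
# Assumed Solr endpoint; adjust host and port to your cluster.
SOLR_BASE="http://xxx.slave1:8983/solr"
COLLECTION="xxxEventCollection"

# Build a select URL that filters on a single indexed field.
solr_query_url() {
  local field="$1" value="$2" rows="${3:-10}"
  printf '%s/%s/select?q=%s:%s&rows=%s&wt=json\n' \
    "${SOLR_BASE}" "${COLLECTION}" "${field}" "${value}" "${rows}"
}

URL="$(solr_query_url userId 10001 5)"
echo "${URL}"

# To run the query against a live cluster:
#   curl -s "${URL}"
```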

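The Morphlines configuration file is as follows:

Note: in the configuration below, each brace must be on its own line, otherwise the file is not parsed correctly (the mapping entries are separated by line breaks rather than commas).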

SOLR_LOCATOR : {
  # Name of solr collection
  #collection : xxxLogCollection
  
  # ZooKeeper ensemble
  zkHost : "$ZK_HOST" 
}


morphlines : [
  {
    id : xxxEventMorphline
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]

    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "data:logName"
              outputField : "logName"
              type : string
              source : value
            }
            {
              inputColumn : "data:userId"
              outputField : "userId"
              type : string
              source : value
            }
            {
              inputColumn : "data:roleId"
              outputField : "roleId"
              type : string
              source : value
            }
            {
              inputColumn : "data:channelKey"
              outputField : "channelKey"
              type : string
              source : value
            }
            {
              inputColumn : "data:serverId"
              outputField : "serverId"
              type : string
              source : value
            }
          ]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  },
  {
    id : xxxActionMorphline
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]

    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "cf1:user_id"
              outputField : "user_id"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:create_time"
              outputField : "create_time"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:log_name"
              outputField : "log_name"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:rn"
              outputField : "rn"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:p_log_name"
              outputField : "p_log_name"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:tl"
              outputField : "tl"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:dt"
              outputField : "dt"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:channel_key"
              outputField : "channel_key"
              type : string
              source : value
            }
            {
              inputColumn : "cf1:server_id"
              outputField : "server_id"
              type : string
              source : value
            }
          ]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
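
The mappings above read from the column families data and cf1. The indexer receives row changes through HBase replication, so those column families normally need REPLICATION_SCOPE set to 1 before new writes show up in Solr. The sketch below writes an HBase shell script for that step. The table names xxxEvent and xxxAction are hypothetical, since the notes do not name the tables, and depending on the HBase version replication may also have to be enabled cluster-wide.

```shell
#!/usr/bin/env bash
# Write to a scratch file by default; set SCRIPT_FILE to choose the location.
SCRIPT_FILE="${SCRIPT_FILE:-$(mktemp)}"

# Table names are hypothetical; the column families come from the morphline mappings.
cat > "${SCRIPT_FILE}" <<'EOF'
# Enable replication on each indexed column family so the indexer receives edits.
disable 'xxxEvent'
alter 'xxxEvent', {NAME => 'data', REPLICATION_SCOPE => 1}
enable 'xxxEvent'

disable 'xxxAction'
alter 'xxxAction', {NAME => 'cf1', REPLICATION_SCOPE => 1}
enable 'xxxAction'
EOF

echo "review ${SCRIPT_FILE}, then run: hbase shell ${SCRIPT_FILE}"
```

Rows written before the indexer was registered are not replayed by this mechanism, so existing data would need a separate batch indexing run.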