------------------------------------------------------------------------The problem that led us to this deployment--------------------------------------------------
Our Flume agents are connected to a Kafka channel, with each Flume sink acting as a Kafka consumer. We had inadvertently created the Kafka topic with four partitions while starting only three Flume agents, i.e. three Kafka consumers. Because of Kafka's partition-to-consumer assignment strategy, one consumer was bound to consume two partitions, so those two partitions developed a serious backlog and real-time log ingestion became impossible. The actual consumption status is shown in the figure below.
Note: to check a topic's consumption status, go to the bin directory under the Kafka installation path on one of the Kafka machines and run:
./kafka-consumer-groups.sh --bootstrap-server 地址 --describe --group groupname
The consumption figure above shows that the machine ending in 43 is consuming both partitions 0 and 1, so it is under the heaviest load.
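The uneven assignment follows from Kafka's default range assignor. Below is a minimal sketch of how it splits a topic's partitions among consumers (the consumer names are hypothetical; the real RangeAssignor also sorts by member id and assigns per topic):

```python
def range_assign(num_partitions, consumers):
    """Assign partitions [0, num_partitions) to sorted consumers the way the
    range assignor does: the first (num_partitions % len(consumers)) consumers
    each receive one extra partition."""
    consumers = sorted(consumers)
    base, extra = divmod(num_partitions, len(consumers))
    assignment, start = {}, 0
    for i, c in enumerate(consumers):
        count = base + (1 if i < extra else 0)
        assignment[c] = list(range(start, start + count))
        start += count
    return assignment

# 4 partitions over 3 consumers: the first consumer ends up with two partitions,
# just as the machine ending in 43 did with partitions 0 and 1.
print(range_assign(4, ["flume-43", "flume-44", "flume-45"]))
# {'flume-43': [0, 1], 'flume-44': [2], 'flume-45': [3]}
```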
To solve this we came up with two options:
1. Increase the number of partitions so that it is evenly divisible by the number of consumers. The data is then spread evenly across partitions, and the other two consumers help share the load of the overloaded machine.
2. Increase the number of consumers so that the number of consumers equals the number of partitions.
Either way, the goal is to let the other consumers share the load of the overloaded machine. After discussion, and given that our program processes data quickly, we decided to go with option 2 and add a consumer in order to process data faster. That raised another question: can Flume be deployed on a machine that does not run HBase? After some investigation we found that it can.
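The trade-off between the two options can be seen with a quick back-of-the-envelope calculation (a sketch; the numbers for option 1 assume growing the topic to 6 partitions so the count divides evenly by 3 consumers):

```python
def load_spread(num_partitions, num_consumers):
    """(min, max) partitions owned per consumer under range assignment."""
    base, extra = divmod(num_partitions, num_consumers)
    return (base, base + 1) if extra else (base, base)

print(load_spread(4, 3))  # (1, 2): the starting point; one consumer does double duty
print(load_spread(6, 3))  # (2, 2): option 1, partition count divisible by consumers
print(load_spread(4, 4))  # (1, 1): option 2, one partition per consumer
```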
---------------------------------------------------------------Deploying Flume on a non-HBase machine--------------------------------------------------
The steps are the same as for a normal Flume deployment; see:
https://blog.youkuaiyun.com/lzlnd/article/details/85059544
The configuration file we used:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# Define the agent name
#mm7mt.sources= oedipus_info
agent.sources= AvroIn
agent.sinks = HbaseOut
agent.channels = agentchannel
# Define the source
agent.sources.AvroIn.type = avro
agent.sources.AvroIn.bind = 172.200.3.43
agent.sources.AvroIn.port = 42410
#mm7mt.sources.oedipus_info.type = exec
#mm7mt.sources.oedipus_info.command = tail -n 0 -F /var/log/oedipus/oedipus_info.log
# Define the sink
agent.sinks.HbaseOut.type = asynchbase
agent.sinks.HbaseOut.table = monstor_mm7mt
agent.sinks.HbaseOut.columnFamily = cf1
agent.sinks.HbaseOut.batchSize = 10
agent.sinks.HbaseOut.serializer = com.caissa.chador_flume.AsyncHbaseAllLogEventSerializer
agent.sinks.HbaseOut.serializer.columns = xunqi_number,protocol_type,message_type,submit_number,smsreq_rid,message_number,company_code,user_name,channel_value,billingusers_number,billing_type,aimphone_number,phone_number,aim_phone,appcode,is_status,messagevalid_time,message_sendtime,mobilevalide_number,valid_type,expenses,link_id,tp_pid,tp_udhi,message_format,message_code,mobiledeal_number,moblie_result,titile_length,mmcresouce_id,mmc_titile
# Define the channel --- file channel
#mm7mt.channels.mm7mtchannel.type = file
#mm7mt.channels.mm7mtchannel.checkpointDir = /data/dataeckPoint/mm7mt
#mm7mt.channels.mm7mtchannel.backupCheckpointDir = /data/dataeckPoint/mm7mt
#mm7mt.channels.mm7mtchannel.keep-alive=10
#==========memorychannel================
#mm7mt.channels.mm7mtchannel.type = memory
#mm7mt.channels.mm7mtchannel.capacity=1000000
#mm7mt.channels.mm7mtchannel.keep-alive=10
#mm7mt.channels.mm7mtchannel.transactioncapacity=1000
#==========kafkachannel=================
agent.channels.agentchannel.type = org.apache.flume.channel.kafka.KafkaChannel
agent.channels.agentchannel.brokerList = 192.100.4.3:9092,192.100.4.13:9092,192.100.4.15:9092
agent.channels.agentchannel.zookeeperConnect = 192.100.4.3:2181,192.100.4.13:2181,192.100.4.15:2181
agent.channels.agentchannel.topic = FLUME_TEST_TOPIC
agent.channels.agentchannel.parseAsFlumeEvent = false
agent.channels.agentchannel.heartbeat.interval.ms=20000
agent.channels.agentchannel.group.id=flume
#agent.channels.agentchannel.consumer.request.timeout.ms=110000
#agent.channels.agentchannel.consumer.fetch.max.wait.ms=1000
#agent.channels.agentchannel.consumer.max.poll.interval.ms=300000
#agent.channels.agentchannel.consumer.max.poll.records=100
# Wire the source, channel and sink together
#mm7mt.sources.oedipus_info.channels= mc1
agent.sources.AvroIn.channels= agentchannel
agent.sinks.HbaseOut.channel = agentchannel
Errors we hit when starting Flume with this configuration file:
1.
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at org.apache.flume.sink.hbase.HBaseSink.<init>(HBaseSink.java:114)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:45)
at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:408)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 17 more
Solution:
Add the following jars to Flume's lib directory (they can be found under the lib directory of the HBase installation), then restart Flume:
hbase-client-1.2.0-cdh5.15.1.jar
hbase-common-1.2.0-cdh5.15.1.jar
hbase-protocol-1.2.0-cdh5.15.1.jar
htrace-core-3.2.0-incubating.jar
2. Then we ran into a second problem: the sink connects to ZooKeeper on localhost, but there is no ZooKeeper on this machine; it needs to connect to the remote quorum.
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:java.library.path=
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:java.io.tmpdir=/tmp
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:java.compiler=<NA>
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:os.name=Linux
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:os.arch=amd64
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:os.version=2.6.32-642.6.2.el6.x86_64
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:user.name=root
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:user.home=/root
25 Nov 2019 11:35:06,904 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.Environment.logEnv:100) - Client environment:user.dir=/usr/libra/flume/lib
25 Nov 2019 11:35:06,905 INFO [lifecycleSupervisor-1-3] (org.apache.zookeeper.ZooKeeper.<init>:438) - Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x7934e2510x0, quorum=localhost:2181, baseZNode=/hbase
25 Nov 2019 11:35:06,939 INFO [lifecycleSupervisor-1-3-SendThread(VM_48_19_centos:2181)] (org.apache.zookeeper.ClientCnxn$SendThread.logStartConnect:975) - Opening socket connection to server VM_48_19_centos/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
25 Nov 2019 11:35:06,980 WARN [lifecycleSupervisor-1-3-SendThread(VM_48_19_centos:2181)] (org.apache.zookeeper.ClientCnxn$SendThread.run:1102) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
25 Nov 2019 11:35:07,095 DEBUG [lifecycleSupervisor-1-3] (org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.retryOrThrow:272) - Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
25 Nov 2019 11:35:08,096 INFO [lifecycleSupervisor-1-3-SendThread(VM_48_19_centos:2181)] (org.apache.zookeeper.ClientCnxn$SendThread.logStartConnect:975) - Opening socket connection to server VM_48_19_centos/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
25 Nov 2019 11:35:08,096 WARN [lifecycleSupervisor-1-3-SendThread(VM_48_19_centos:2181)] (org.apache.zookeeper.ClientCnxn$SendThread.run:1102) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
Solution: add the following configuration and restart:
agent.sinks.HbaseOut.zookeeperQuorum = x.x.x.x:2181,x.x.x.x:2181,x.x.x.x:2181
Also copy hbase-site.xml from the HBase installation into Flume's conf directory.