5. HBase Master HA Usage

This article explains how HBase high availability (HA) works and how to configure it: ZooKeeper's Master Election mechanism ensures the cluster always has exactly one active HMaster. We set up an HBase HA environment and verify that automatic failover actually works.

Experiment Environment

master: 192.168.0.160
slave1: 192.168.0.161
zookeeper: 192.168.0.161
Hadoop version: 2.6.5
HBase version: 1.2.6
Host OS: ubuntu-16.04

How HBase HA Works

HMaster HA needs no extra configuration: HBase can run multiple HMaster processes at the same time, and ZooKeeper's Master Election mechanism guarantees that there is always exactly one active master.

So to make HBase highly available, we simply start two HMaster processes and let ZooKeeper elect which one becomes the active master.
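
As a side note, the second HMaster does not have to be started by hand every time. HBase's start scripts can launch backup masters automatically if their hostnames are listed in conf/backup-masters. A minimal sketch, assuming the install path used later in this article:

    # On the node where start-hbase.sh is run, list each backup-master
    # host on its own line in conf/backup-masters:
    echo "slave1" >> /home/hadoop/software/hbase-1.2.6/conf/backup-masters

    # start-hbase.sh then starts the local HMaster, the region servers,
    # and one backup HMaster per listed host (over SSH):
    ./bin/start-hbase.sh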

A fully distributed HBase cluster must already be set up before the steps below; for the cluster setup, see:
https://blog.youkuaiyun.com/cl2010abc/article/details/80822553

Starting and Verifying HA

  1. After the cluster has started successfully, an HMaster process is running on the master node. We also need to start an HMaster process on the slave1 node.

    [hadoop@slave1 hbase-1.2.6]$ ./bin/hbase-daemon.sh start master
    

    starting master, logging to /home/hadoop/software/hbase-1.2.6/logs/hbase-hadoop-master-slave1.out
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
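
    To confirm that the process actually came up, jps on slave1 should now list an HMaster (alongside an HRegionServer, if one runs there). A quick check:

    # Filter the JVM process list on slave1 for the HBase daemons
    [hadoop@slave1 hbase-1.2.6]$ jps | grep -E 'HMaster|HRegionServer'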

  2. Check the working status of the two HMaster processes through the web UI.
    [Screenshot: HBase web UI on the master node]
    [Screenshot: HBase web UI on slave1]
    The screenshots show that the HMaster on the master node was elected as the active master.
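
    Besides the web UI, the active/backup split can be read from the HBase shell's status command (a sketch; the exact wording varies across HBase versions):

    # Run a one-off status command through the HBase shell
    [hadoop@master hbase-1.2.6]$ echo "status" | ./bin/hbase shell
    # Expect a summary line similar to:
    #   1 active master, 1 backup masters, 2 servers, 0 dead, ...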

  3. Inspect the znodes that HBase registers in ZooKeeper.
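
    The get/ls commands below are typed in the ZooKeeper CLI, which can be opened on the zookeeper host as follows (the zkCli.sh path is an assumption; use your own ZooKeeper install directory). Note that /hbase/master stores a protobuf-encoded ServerName, hence the PBUF magic and binary noise in the output:

    # Connect to the ZooKeeper instance on 192.168.0.161
    ./bin/zkCli.sh -server localhost:2181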

    # Inspect the master znode
    [zk: localhost:2181(CONNECTED) 15] get /hbase/master
    �master:160001䛈MEPBUF
    
    master�}����,�}
    cZxid = 0x28fb
    ctime = Sun Jul 22 20:04:51 PDT 2018
    mZxid = 0x28fb
    mtime = Sun Jul 22 20:04:51 PDT 2018
    pZxid = 0x28fb
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x164c4fb83eb0031
    dataLength = 54
    numChildren = 0 
    
    # List the backup-masters znode
    [zk: localhost:2181(CONNECTED) 29] ls /hbase/backup-masters
    [slave1,16000,1532316403408]
    

The znode information in ZooKeeper confirms that the HMaster on the master node is the active master and the one on slave1 is the backup.

  4. Simulate an unexpected HMaster failure and check whether the backup master takes over.
    Kill the HMaster process on master.

    [hadoop@master software]$ jps
    4002 HRegionServer
    2839 DataNode
    5897 Main
    3034 SecondaryNameNode
    6714 Jps
    2730 NameNode
    6223 HMaster
    [hadoop@master software]$ kill -9 6223
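
    Besides the web UI check below, the switchover can also be observed from the ZooKeeper CLI (a sketch; actual znode contents will differ):

    # The ephemeral /hbase/master znode is re-created by the newly
    # elected active master, now pointing at slave1 ...
    get /hbase/master
    # ... and slave1 disappears from the backup list:
    ls /hbase/backup-masters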
    

Check whether the HMaster on slave1 has taken over as the active HMaster.
[Screenshot: web UI showing the HMaster on slave1 is now active]

The screenshot shows that the HMaster failover succeeded.

Summary

This article demonstrated HBase Master HA: starting a second HMaster, inspecting the active and backup roles registered in ZooKeeper, and verifying automatic failover after killing the active HMaster.
