Upgrade YARN from Hadoop 2.6 to Hadoop 2.7

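This article describes how to resolve the error messages that appear when Hadoop YARN is restarted. The errors stem from a change in the Protobuf format between versions, and they can be avoided by upgrading the ResourceManager and the NodeManagers step by step.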

When the ResourceManager is restarted, it may print the following error messages.

2018-01-11 16:02:23,991 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=Application Finished - Failed TARGET=RMAppManager     RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED       PERMISSIONS=Application application_1515636545026_0001 failed 2 times due to Error launching appattempt_1515636545026_0001_000002. Got exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
        at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:99)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy82.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:251)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(com.google.protobuf.InvalidProtocolBufferException): Protocol message contained an invalid tag (zero).
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
2018-01-09 15:51:31,651 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_e06_1515465880229_0007_02_000002 and exit code: 1
ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
        at org.apache.hadoop.util.Shell.run(Shell.java:482)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

This happens because the protobuf format changed between Hadoop 2.6 and Hadoop 2.7.
To avoid the problem, a ResourceManager and NodeManagers from different versions must not run at the same time.
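To confirm that a failing cluster is hitting this particular error, the daemon log can be searched for the exception message shown above. The following is a minimal sketch; the function name `has_protobuf_tag_error` is hypothetical, and the log file path depends on the installation.

```shell
# Return success (exit code 0) if the given log file contains the
# protobuf "invalid tag (zero)" message from the stack trace above.
# Hypothetical helper: the name is an illustration, not part of Hadoop.
has_protobuf_tag_error() {
    grep -F -q 'Protocol message contained an invalid tag (zero)' "$1"
}

# Example usage (the log path is an assumption; adjust it to the cluster):
#   has_protobuf_tag_error /path/to/yarn-resourcemanager.log && echo "version mismatch suspected"
```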

Steps to upgrade YARN

1. Back up the nodemanagers file.

cp /home/hadoop/hadoop-2.6.0-cdh5.7.5/etc/hadoop/nodemanagers /home/hadoop/hadoop-2.6.0-cdh5.7.5/etc/hadoop/nodemanagers_bak

2. Stop the standby ResourceManager.

yarn-daemon.sh stop resourcemanager

3. Log in to the active ResourceManager and stop all NodeManagers.

echo "localhost" > /usr/local/hadoop/etc/hadoop/nodemanagers 
yarn rmadmin -refreshNodes

4. Wait until all NodeManagers have stopped, then stop the active ResourceManager.

yarn-daemon.sh stop resourcemanager

5. Start the new version of the ResourceManager.

export NEW_VERSION_OF_HADOOP_LOCATION=/home/hadoop/hadoop-2.7.5
export HADOOP_HOME=${NEW_VERSION_OF_HADOOP_LOCATION}
export HADOOP_COMMON_LIB_NATIVE_DIR=${NEW_VERSION_OF_HADOOP_LOCATION}/lib/native
export HADOOP_HDFS_HOME=${NEW_VERSION_OF_HADOOP_LOCATION}
export HADOOP_COMMON_HOME=${NEW_VERSION_OF_HADOOP_LOCATION}
export HADOOP_CONF_DIR=${NEW_VERSION_OF_HADOOP_LOCATION}/etc/hadoop
export HADOOP_MAPRED_HOME=${NEW_VERSION_OF_HADOOP_LOCATION}

${NEW_VERSION_OF_HADOOP_LOCATION}/sbin/yarn-daemon.sh start resourcemanager
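The wait in step 4 can be scripted rather than checked by hand. The sketch below polls `yarn node -list -all` and counts the nodes whose state column reads RUNNING. The function names and the 10-second polling interval are assumptions for illustration, and the loop only terminates if no NodeManager remains in the RUNNING state after the refresh (for example, none is running on the host left in the nodemanagers file).

```shell
# Count nodes reported as RUNNING in `yarn node -list -all` output read from stdin.
# The node state is the second whitespace-separated column of each node row.
count_running_nodes() {
    awk '$2 == "RUNNING" { n++ } END { print n + 0 }'
}

# Block until the ResourceManager reports no RUNNING NodeManagers.
# Hypothetical helper; the 10-second interval is an arbitrary choice.
wait_for_nodemanagers_to_stop() {
    while [ "$(yarn node -list -all 2>/dev/null | count_running_nodes)" -gt 0 ]; do
        sleep 10
    done
}

# Example usage, run on the active ResourceManager host before step 4's stop command:
#   wait_for_nodemanagers_to_stop && yarn-daemon.sh stop resourcemanager
```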

By default, the ResourceManager considers an ApplicationMaster (AM) failed if the AM does not send a heartbeat for 10 minutes. Adding the following property to yarn-site.xml shortens that interval so that the application is relaunched after 1 minute.

    <property>
        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
        <value>60000</value>
    </property>
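
Before restarting the ResourceManager, it may help to check that the property was actually picked up from the file. The following is a minimal sketch: `get_property` is a hypothetical helper, not a Hadoop command, and it assumes each `<name>` and `<value>` element sits on its own line, as in the snippet above.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style XML config file.
# Hypothetical helper; assumes <name> and <value> are on separate lines.
get_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")     # strip everything up to the opening tag
            sub(/<\/value>.*/, "")   # strip the closing tag and anything after it
            print
            exit
        }
    ' "$1"
}

# Example usage:
#   get_property "${HADOOP_CONF_DIR}/yarn-site.xml" yarn.am.liveness-monitor.expiry-interval-ms
```

An empty result means the property is not set in that file, in which case the 10-minute default described above applies.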