Hadoop (7) -- YARN

This article describes the architecture of Hadoop YARN in detail, including the responsibilities of the ResourceManager and NodeManager and YARN's resource-scheduling process. The ResourceManager, as the master node, is responsible for resource allocation, application management, and monitoring, while each NodeManager manages the resources of a single node. YARN uses the Container as its smallest unit of resource scheduling. During scheduling, the MRAppMaster requests and tracks the resources for its tasks and ensures that they run. The article also outlines the job-submission flow, covering the complete sequence of steps from client submission to task execution.

I. YARN Overview

YARN is Hadoop's resource scheduler: it is responsible for allocating and scheduling cluster resources for the computing programs that run on it.

YARN uses a master/slave architecture: the master node is the ResourceManager, and the slave nodes are NodeManagers.
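As a minimal sketch of how this master/slave wiring is configured, the `yarn-site.xml` fragment below tells every NodeManager and client where to find the ResourceManager; the hostname `master` is a placeholder, not taken from the original text:

```xml
<?xml version="1.0"?>
<configuration>
    <!-- Where NodeManagers and clients find the ResourceManager;
         "master" is a placeholder hostname -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <!-- Auxiliary service NodeManagers run so MapReduce jobs can shuffle map output -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
```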

1. ResourceManager

        The ResourceManager is the YARN cluster's master node. It schedules cluster resources based on the demands of applications, coordinates and manages the entire cluster (the resources of all NodeManagers), and is responsible for parsing, scheduling, and monitoring the different types of applications submitted by users. The ResourceManager launches one MRAppMaster per application, and these MRAppMasters are distributed across the NodeManager nodes.

        Responsibilities (see the CLI sketch after this list):

                1) Handles client requests

                2) Launches and monitors the MRAppMaster

                3) Monitors the NodeManagers

                4) Allocates and schedules resources
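
These responsibilities can be observed from the standard YARN command line. A small sketch, assuming a running cluster with the `yarn` command on the PATH; the application ID in the last command is a placeholder:

```bash
# Ask the ResourceManager for the NodeManagers it is monitoring
yarn node -list -all

# List the applications the ResourceManager is currently tracking
yarn application -list -appStates RUNNING

# Show the status of a single application, including the host running
# its MRAppMaster and its resource usage (the ID below is a placeholder)
yarn application -status application_1700000000000_0001
```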

The following terminal session, from an appended reader question, shows the environment being diagnosed:

```bash
stu@stu-pc:~$ find / -name "hadoop" -type d 2>/dev/null | grep -i hadoop
/home/stu/hadoop
/usr/local/hadoop
/usr/local/hadoop/etc/hadoop
/usr/local/hadoop/share/hadoop
/usr/local/hadoop/share/doc/hadoop
... (dozens of apidocs directories under /usr/local/hadoop/share/doc/hadoop elided)
/usr/local/hive/lib/php/packages/serde/org/apache/hadoop
/usr/local/hive/apache-hive-3.1.2-bin/lib/php/packages/serde/org/apache/hadoop
/tmp/hadoop
/tmp/hadoop-unjar8342885896351558271/org/apache/hadoop
/tmp/hadoop-unjar6484726597599168525/org/apache/hadoop
stu@stu-pc:~$ ls /usr/local/hadoop
bin  data  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share  tmp
stu@stu-pc:~$ ls /opt/hadoop
ls: cannot access '/opt/hadoop': No such file or directory
stu@stu-pc:~$
```
# Problem Analysis and Solution

## Current Environment Status

### Confirming the Hadoop Installation Path

The `find` command reveals two relevant paths:

- `/home/stu/hadoop`: probably a personal configuration directory
- `/usr/local/hadoop`: the actual Hadoop installation (it contains the standard bin, sbin, etc structure)

### Verifying the Directory Structure

```bash
ls /usr/local/hadoop
# Output shows a complete Hadoop installation:
# bin, etc, lib, sbin, share, tmp, and other directories are all present
```

### Key Findings

1. `$HADOOP_HOME` should point to `/usr/local/hadoop`
2. `/usr/local/hadoop/tmp` already exists, but its permissions may be wrong
3. The configuration files live in `/usr/local/hadoop/etc/hadoop/`

## Complete Fix

### Step 1: Set the environment variables correctly

```bash
# Set HADOOP_HOME
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# Persist the configuration
echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
echo 'export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH' >> ~/.bashrc
source ~/.bashrc
```

### Step 2: Check and fix the tmp directory permissions

```bash
# Check the current permissions on the tmp directory
ls -ld /usr/local/hadoop/tmp

# If the permissions are insufficient or the owner is wrong, fix them
sudo chown -R stu:stu /usr/local/hadoop/tmp
sudo chmod 755 /usr/local/hadoop/tmp

# Make sure the subdirectories MapReduce needs exist
mkdir -p /usr/local/hadoop/tmp/mapred/{staging,local,system}
```

### Step 3: Configure core-site.xml

Edit `/usr/local/hadoop/etc/hadoop/core-site.xml`:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/tmp</value>
    </property>
</configuration>
```

### Step 4: Verify the configuration

```bash
# Test whether Hadoop can access the local file system normally
hadoop fs -mkdir /test_dir
hadoop fs -ls /

# Clean up the test directory
hadoop fs -rm -r /test_dir
```

### Step 5: Test with Hive

```bash
# Start Hive and run a simple query
hive -e "show databases;"
```

## Root Causes

1. **Environment variable not set correctly**: an empty `$HADOOP_HOME` means the configuration files cannot be located
2. **Temporary directory permissions**: `/usr/local/hadoop/tmp` exists, but earlier operations may have left its permissions in a bad state
3. **Inconsistent configuration targets**: multiple hadoop directories may be causing confusion

## Recommendations

1. Use `/usr/local/hadoop` as the single main installation directory
2. Remove the redundant `/home/stu/hadoop` and replace it with a symlink:

```bash
rm -rf /home/stu/hadoop
ln -s /usr/local/hadoop /home/stu/hadoop
```

3. Make sure all services run against the correct `$HADOOP_HOME`

## Background

**Hadoop local file system permissions**: Hadoop needs read/write access to its local directories; otherwise it throws a NativeIOException.

**Hadoop temporary directory layout**: the base path defined by `hadoop.tmp.dir` must contain subdirectories such as mapred and dfs for the components to use.

**Hive on MapReduce execution flow**: Hive translates queries into MapReduce jobs, which requires preparatory steps such as uploading resources and creating directories.