Hadoop Basics - 1

Which Hadoop environment and version do you use?

Open-source Apache Hadoop 2.8

Cloudera CDH 5
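To confirm the exact build on a given machine, `hadoop version` prints the version string; the output below is illustrative and varies by distribution.

```bash
# Print the Hadoop build information.
hadoop version
# Hadoop 2.8.5   <- example first line; CDH builds show a -cdh suffix
```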

What are the three core components of Hadoop?
HDFS: Hadoop's distributed file system
MapReduce: the data computation engine
YARN: the resource management and scheduling system
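On a typical install the three layers map onto separate start scripts. A minimal sketch, assuming a configured pseudo-distributed cluster with `$HADOOP_HOME` set:

```bash
# Start the HDFS daemons (NameNode, DataNodes, SecondaryNameNode).
$HADOOP_HOME/sbin/start-dfs.sh

# Start the YARN daemons (ResourceManager, NodeManagers).
$HADOOP_HOME/sbin/start-yarn.sh

# MapReduce jobs then run as YARN applications, e.g. the bundled pi example
# (the exact examples jar name varies by version).
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10
```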
Which components of the Hadoop platform have you used or do you know about?
Offline (batch) side: Sqoop, YARN, HDFS, MapReduce, Hive
Real-time side: Flume (log collection), Kafka (message queuing), HBase (a column-oriented database),
Spark (an in-memory compute engine), Flink (a stream-processing compute engine)
In Hadoop, how large is an HDFS data block?
128 MB by default (since Hadoop 2.x; it was 64 MB in 1.x)
How many copies of the data are kept by default?
3 replicas
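Both defaults can be inspected and overridden. A sketch using `hdfs getconf` (values shown are the stock defaults; the override paths are examples):

```bash
# Ask the client configuration for the effective values.
hdfs getconf -confKey dfs.blocksize     # 134217728 bytes = 128 MB
hdfs getconf -confKey dfs.replication   # 3

# Both keys can be overridden cluster-wide in etc/hadoop/hdfs-site.xml,
# or per command via the generic -D option, e.g. a 256 MB block size:
hdfs dfs -D dfs.blocksize=268435456 -put bigfile.dat /data/
```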
Which components make up HDFS?
NameNode, DataNode, and SecondaryNameNode
What are the functions and roles of each HDFS component?
SecondaryNameNode: periodically merges the NameNode's edit log into the fsimage (checkpointing) and sends the merged image back to the NameNode; it is not a standby NameNode
NameNode: manages the filesystem metadata and handles all client requests
DataNode: stores the actual data blocks
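A quick way to see these roles in practice is `hdfs dfsadmin -report`, which asks the NameNode (the metadata owner) to report on the DataNodes (the block storage). Requires a running cluster:

```bash
# Prints a cluster-wide capacity summary, then one entry per live DataNode
# with its storage usage and last heartbeat.
hdfs dfsadmin -report
```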
Which basic services (daemons) does Hadoop run?
NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. (`jps` is often mentioned alongside them, but it is the JDK's JVM process-listing tool used to check these daemons, not a Hadoop service itself.)
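`jps` is the usual way to verify the daemons are up. On a healthy single-node setup its output might look like this (PIDs are illustrative):

```bash
jps
# 2305 NameNode
# 2450 DataNode
# 2680 SecondaryNameNode
# 2890 ResourceManager
# 3020 NodeManager
# 3344 Jps
```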
In HDFS, what are the flow and principle of writing data (uploading a file) and reading data (downloading a file)?
Read flow (a CLI sketch follows the list):
1. The client requests a file or data at some location.
2. The NameNode responds and returns the list of DataNodes that hold the file's blocks.
3. The client contacts those DataNodes to request the data.
4. The DataNodes acknowledge the request.
5. The client requests block 1.
6. The DataNode returns block 1's data.
7. The client keeps requesting the remaining blocks in order.
8. The DataNodes keep returning the remaining data.
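A minimal client-side sketch of this read path, assuming a running cluster and an example file path: `hdfs fsck` surfaces the block-to-DataNode mapping the NameNode returns in step 2, and `-cat`/`-get` perform the actual block reads.

```bash
# Step 2 made visible: which DataNodes hold each block of this file?
hdfs fsck /user/stu/input.txt -files -blocks -locations

# Steps 3-8: the client streams the blocks in order from those DataNodes.
hdfs dfs -cat /user/stu/input.txt | head
hdfs dfs -get /user/stu/input.txt ./input_copy.txt
```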
Write flow (a CLI sketch follows the list):
1. The client asks to write a piece of data.
2. The NameNode checks the validity of the file and the request.
3. The NameNode responds that the write is allowed.
4. The client asks to start writing.
5. The NameNode returns the target DataNode list.
6. The client contacts the first DataNode to request the write.
7. The DataNode agrees to accept the data.
8. The client starts uploading the data.
8.1 The first DataNode asks the next DataNodes in the pipeline to replicate the block.
8.2 The other DataNodes agree to replicate.
8.3 Replication proceeds down the pipeline.
8.4 Replication completes.
9. The DataNode acknowledges to the client that the write succeeded.
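And the write path from the client side, again a sketch with example paths: `-put` drives steps 1 through 9, and `-stat %r` afterwards confirms the replication achieved by the pipeline in steps 8.1-8.4.

```bash
# Steps 1-9: upload a local file; the client writes to the first DataNode,
# which pipelines the replicas to the other DataNodes.
hdfs dfs -put ./localfile.txt /user/stu/localfile.txt

# Steps 8.1-8.4 verified: the file should now have the default 3 replicas.
hdfs dfs -stat "replication: %r" /user/stu/localfile.txt
```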
A reader's follow-up arrived as a terminal transcript (reconstructed below): where is Hadoop actually installed on this machine?

```bash
stu@stu-pc:~$ find / -name "hadoop" -type d 2>/dev/null | grep -i hadoop
/home/stu/hadoop
/usr/local/hadoop
/usr/local/hadoop/etc/hadoop
/usr/local/hadoop/share/hadoop
/usr/local/hadoop/share/doc/hadoop
# ...followed by several dozen documentation directories under
# /usr/local/hadoop/share/doc/hadoop/ (hadoop-mapreduce-client, hadoop-hdfs-httpfs,
# hadoop-auth, hadoop-yarn apidocs, hadoop-project-dist build targets),
# serde directories under /usr/local/hive, and hadoop-unjar temp dirs in /tmp.
stu@stu-pc:~$ ls /usr/local/hadoop
bin  data  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share  tmp
stu@stu-pc:~$ ls /opt/hadoop
ls: cannot access '/opt/hadoop': No such file or directory
```
# Problem Analysis and Solution

## Current Environment Analysis

### Confirming the Hadoop install path
The `find` command reveals two relevant paths:
- `/home/stu/hadoop`: probably a personal user directory
- `/usr/local/hadoop`: the actual Hadoop installation (it contains the standard bin, sbin, etc layout)

### Verifying the directory layout
```bash
ls /usr/local/hadoop
# Output shows a complete Hadoop install:
# bin, etc, lib, sbin, share, tmp and friends are all present
```

### Key findings
1. `$HADOOP_HOME` should point to `/usr/local/hadoop`
2. `/usr/local/hadoop/tmp` already exists, but its permissions may be wrong
3. The configuration files live in `/usr/local/hadoop/etc/hadoop/`

## Full Repair Steps

### Step 1: Set the environment variables correctly
```bash
# Set HADOOP_HOME
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

# Persist the configuration
echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
echo 'export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH' >> ~/.bashrc
source ~/.bashrc
```

### Step 2: Check and fix the tmp directory permissions
```bash
# Check the current permissions on the tmp directory
ls -ld /usr/local/hadoop/tmp

# If the permissions or owner are wrong, fix them
sudo chown -R stu:stu /usr/local/hadoop/tmp
sudo chmod 755 /usr/local/hadoop/tmp

# Make sure the subdirectories MapReduce needs exist
mkdir -p /usr/local/hadoop/tmp/mapred/{staging,local,system}
```

### Step 3: Configure core-site.xml
Edit `/usr/local/hadoop/etc/hadoop/core-site.xml`:
```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/tmp</value>
    </property>
</configuration>
```

### Step 4: Verify the configuration works
```bash
# Test whether Hadoop can access the local file system normally
hadoop fs -mkdir /test_dir
hadoop fs -ls /
# Clean up the test directory
hadoop fs -rm -r /test_dir
```

### Step 5: Start Hive and test
```bash
# Start Hive and run a simple query
hive -e "show databases;"
```

## Root Cause Summary
1. **Environment variable not set correctly**: an empty `$HADOOP_HOME` meant the configuration files could not be located
2. **Temporary directory permission problem**: `/usr/local/hadoop/tmp` exists, but earlier operations may have left its permissions in a bad state
3. **Inconsistent configuration targets**: having multiple hadoop directories invites confusion

## Final Recommendations
1. Standardize on `/usr/local/hadoop` as the main install directory
2. Remove the redundant `/home/stu/hadoop`, or turn it into a symlink:
```bash
rm -rf /home/stu/hadoop
ln -s /usr/local/hadoop /home/stu/hadoop
```
3. Make sure every service runs against the correct `$HADOOP_HOME`

## Key Concepts
**Local filesystem permissions in Hadoop.** Hadoop needs read/write access to its local directories; otherwise it throws a NativeIOException.
**Hadoop temporary directory layout.** Under the base path defined by `hadoop.tmp.dir`, components expect subdirectories such as mapred and dfs.
**Hive on MapReduce execution flow.** Hive compiles a query into MapReduce jobs, which requires preparatory steps such as uploading resources and creating directories.