Running WordCount with Hadoop MapReduce

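This article walks through processing data with Hadoop step by step: importing the data, packaging the Java program as a jar, and running the WordCount example on a cluster. It also lists common HDFS commands for reference.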


1. Add the input data

# Change to the Hadoop installation directory

cd /usr/local/hadoop

# Create a sample file named input

vi input

# Enter the following content in it and save:

Hello world, Bye world!
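If you would rather not open an editor, the same file can be written directly from the shell. This is only a convenience sketch; it assumes you are in the directory where the input file should live.

```shell
# Create the sample input file without opening an editor.
printf 'Hello world, Bye world!\n' > input

# Print the file so you can confirm its content.
cat input
```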



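2. Export your Java program as a jar

In Eclipse, right-click the project, choose Export, select the JAR file entry under Java, and export.

Create an sbin directory under the Hadoop directory to hold the jar file. The commands below use /usr/hadoop; adjust the path if your installation lives elsewhere, such as the /usr/local/hadoop used in step 1.

mkdir -p /usr/hadoop/sbin

Place the jar in the /usr/hadoop/sbin directory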
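The jar can also be built without Eclipse. The sketch below is an alternative to the export described above, not part of the original steps. It assumes the source file is named WordCount.java, that its main class is WordCount with no package (both names are hypothetical, since the article does not show the program), and that the hadoop, javac, and jar commands are on PATH.

```shell
# Build a WordCount jar from the command line.
# Assumed names: WordCount.java / main class WordCount (not given in the article).
build_wordcount_jar() {
  src=${1:-WordCount.java}
  jar_out=${2:-wordcount.jar}
  classes=$(mktemp -d) || return 1
  # `hadoop classpath` prints the classpath needed to compile against Hadoop.
  javac -classpath "$(hadoop classpath)" -d "$classes" "$src" || return 1
  # The `e` flag records WordCount as the jar's entry point (Main-Class).
  jar cfe "$jar_out" WordCount -C "$classes" . || return 1
  rm -rf "$classes"
  echo "built $jar_out"
}
```

Calling `build_wordcount_jar WordCount.java /usr/hadoop/sbin/wordcount.jar` would place the result where step 3 expects it. Because the entry point is recorded in the manifest, `hadoop jar` does not need the class name as an extra argument.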



 

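3. Run on the cluster

Create the directory path on HDFS (on Hadoop 2 and later, add -p so that missing parent directories are created)

hadoop fs -mkdir /user/root/test/wordcount

Copy the input file to HDFS

hadoop fs -put input /user/root/test/wordcount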



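# Run the program. The first argument is the HDFS input directory, the second is the output directory.
# The relative path output resolves under your HDFS home directory, for example /user/root/output for the root user.

hadoop jar /usr/hadoop/sbin/wordcount.jar /user/root/test/wordcount/ output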
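A MapReduce job refuses to start when its output directory already exists, so running the command a second time fails until the old output is removed. The following is a minimal sketch of a re-runnable submission. The helper names run and rerun_wordcount are made up for illustration, and the script defaults to a dry run that only prints the commands.

```shell
# Print (DRY_RUN=1, the default here) or execute (DRY_RUN=0) one command.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Delete stale output, then submit the job again.
rerun_wordcount() {
  jar_path=${1:-/usr/hadoop/sbin/wordcount.jar}
  input_dir=${2:-/user/root/test/wordcount/}
  output_dir=${3:-output}
  # Remove output left by an earlier run; ignore the error if none exists.
  # On newer releases -rmr is deprecated in favor of `-rm -r`.
  run hadoop fs -rmr "$output_dir" || true
  run hadoop jar "$jar_path" "$input_dir" "$output_dir"
}
```

Calling `rerun_wordcount` prints the two commands it would run; setting `DRY_RUN=0` before the call executes them against the cluster.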

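# After the program finishes, view the results.
# The job was given the relative path output, which resolves under your HDFS home directory, so read it back with the same relative path.

hadoop fs -cat output/*

# The program prints the following:

Bye 1
Hello 1
world! 1
world, 1

That completes the run.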
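As a sanity check on these counts, the same result can be approximated locally. This sketch assumes the job splits lines on whitespace only, which matches the output above where punctuation stays attached to each word. The function name local_wordcount is hypothetical, and the pipeline is a stand-in for the Java program, not a copy of it.

```shell
# Local approximation of WordCount: split each line on whitespace,
# count every token, and sort the result in byte order.
local_wordcount() {
  awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
       END { for (w in count) printf "%s\t%d\n", w, count[w] }' "$@" | LC_ALL=C sort
}

# Feed it the sample line from step 1.
printf 'Hello world, Bye world!\n' | local_wordcount
```

Each output line is a word, a tab, and its count, in the same order as the cluster output.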

 

 

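Appendix: HDFS file commands

hadoop fs
Lists all commands supported by the Hadoop HDFS shell

hadoop fs -ls
Lists directories and file information

hadoop fs -lsr
Recursively lists directories, subdirectories, and file information (newer releases deprecate this in favor of -ls -R)

hadoop fs -put test.txt /user/sunlightcs
Copies test.txt from the local file system into the /user/sunlightcs directory on HDFS

hadoop fs -get /user/sunlightcs/test.txt .
Copies test.txt from HDFS to the local file system, the reverse of -put

hadoop fs -cat /user/sunlightcs/test.txt
Shows the content of test.txt on HDFS

hadoop fs -tail /user/sunlightcs/test.txt
Shows the last 1 KB of the file

hadoop fs -rm /user/sunlightcs/test.txt
Deletes test.txt from HDFS. Some older releases also let rm remove an empty directory; if yours refuses, use -rmr

hadoop fs -rmr /user/sunlightcs
Deletes the /user/sunlightcs directory and all of its subdirectories (newer releases deprecate this in favor of -rm -r)

hadoop fs -copyFromLocal test.txt /user/sunlightcs/test.txt
Copies a file from the local file system to HDFS, equivalent to put

hadoop fs -copyToLocal /user/sunlightcs/test.txt test.txt
Copies a file from HDFS to the local file system, equivalent to get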

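hadoop fs -chgrp [-R] GROUP /user/sunlightcs
Changes the group of the /user/sunlightcs directory on HDFS; the -R option applies the change recursively, as with the Linux command

hadoop fs -chown [-R] [OWNER][:[GROUP]] /user/sunlightcs
Changes the owner of the /user/sunlightcs directory on HDFS; the -R option applies the change recursively

hadoop fs -chmod [-R] MODE /user/sunlightcs
Changes the permissions of the /user/sunlightcs directory on HDFS; MODE can be a 3-digit permission number or +/-{rwx}; the -R option applies the change recursively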

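hadoop fs -count [-q] PATH
Shows, for PATH, the number of subdirectories, the number of files, the total file size, and the file or directory name

hadoop fs -cp SRC [SRC …] DST
Copies files from SRC to DST; if more than one SRC is given, DST must be a directory

hadoop fs -du PATH
Shows the size of each file or directory under PATH

hadoop fs -dus PATH
Like du, but when PATH is a directory it shows the directory's total size (newer releases deprecate this in favor of -du -s)

hadoop fs -expunge
Empties the trash. A deleted file is first moved to the temporary .Trash/ directory and is permanently removed only after the retention delay has passed

hadoop fs -getmerge SRC [SRC …] LOCALDST [addnl]
Fetches all files specified by SRC, merges them into a single file, and writes it to LOCALDST on the local file system; the addnl option appends a newline to the end of each file

hadoop fs -touchz PATH
Creates an empty file of length 0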

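hadoop fs -test -[ezd] PATH
Runs one of the following checks on PATH:
-e whether PATH exists; returns 0 if it does, otherwise 1
-z whether the file is empty; returns 0 if its length is 0, otherwise 1
-d whether PATH is a directory; returns 0 if it is, otherwise 1

hadoop fs -text PATH
Displays the file's content. For a plain text file this is equivalent to cat; for a compressed file (gzip, or Hadoop's binary sequence file format) it decompresses the content first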
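The exit codes of the -test entry above are what make it useful in scripts. Here is a small sketch that combines the three checks to classify a path. The function name hdfs_path_kind is hypothetical, and the sketch assumes hadoop is on PATH.

```shell
# Classify an HDFS path using the exit codes of `hadoop fs -test`.
hdfs_path_kind() {
  path=$1
  if ! hadoop fs -test -e "$path"; then
    echo "missing"        # -e returned non-zero: the path does not exist
  elif hadoop fs -test -d "$path"; then
    echo "directory"      # -d returned 0: the path is a directory
  elif hadoop fs -test -z "$path"; then
    echo "empty file"     # -z returned 0: the file has length 0
  else
    echo "file"
  fi
}
```

For example, `hdfs_path_kind /user/sunlightcs/test.txt` would print one of the four labels depending on what exists at that path.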

