A Hive problem: The ownership on the staging directory /tmp/hadoop-yarn is not as expected; the directory must be owned by the submitter

This post records a permission problem I hit while using Hive and how it was resolved. The core issue was that running a Hive query as the root user failed because the YARN staging directory was owned by a different user (hive, according to the error message); adjusting the directory's owner and permissions fixed it, and the query then completed successfully.

Having set up Hive, I found that the hadoop user could not start the Hive CLI due to insufficient permissions, so I switched to root and ran:

Hive history file=/tmp/hadoop/hive_job_log_hadoop_201407010908_503942368.txt
hive> select count(*) from test;
after which the following error appeared:

java.io.IOException: The ownership on the staging directory /tmp/hadoop-yarn/staging/root/.staging is not as expected. It is owned by hive. The directory must be owned by the submitter root or by root
	at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:120)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:146)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:421)
Part of the stack trace is omitted above. The error says the staging directory must be owned by the submitting user, root, so I exited Hive and ran:

hadoop fs -chown -R root:root /tmp
However, the change did not succeed:

chmod: changing permissions of '/tmp/hadoop-yarn': Permission denied: user=root, access=EXECUTE, inode="/tmp":hive:hive:drwxrwx--
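The inode shown in the error already tells the story: /tmp is owned by hive:hive, with no access for other users. As a quick sanity check (a minimal sketch; the paths are taken from the error message above), the owners can be inspected directly:

hadoop fs -ls -d /tmp
hadoop fs -ls /tmp/hadoop-yarn/staging/root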

Switching to the hadoop user and running the same ownership-change command succeeded. Why? The likely explanation is that HDFS permissions are separate from OS permissions: the HDFS superuser is the user that started the NameNode (here, hadoop), so the OS root account has no special privileges in HDFS and cannot change the ownership of directories it does not own, while hadoop, as the HDFS superuser, can.
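For reference, a narrower fix than recursively chowning all of /tmp is to change the owner of only the offending staging directory. This is a sketch assuming the path from the error message, run as the HDFS superuser (hadoop):

hadoop fs -chown -R root:root /tmp/hadoop-yarn/staging/root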

After the ownership change above, the query runs successfully:

hive> SELECT COUNT(*) FROM test;
Query ID = root_20180128162626_f875d94a-9ada-47c0-8af8-b9b37271c64a
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1517036563281_0005, Tracking URL = http://master:8088/proxy/application_1517036563281_0005/
Kill Command = /usr/local/hadoop/bin/hadoop job  -kill job_1517036563281_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-01-28 16:26:58,698 Stage-1 map = 0%,  reduce = 0%
2018-01-28 16:27:06,402 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.32 sec
2018-01-28 16:27:13,780 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2.05 sec
MapReduce Total cumulative CPU time: 2 seconds 50 msec
Ended Job = job_1517036563281_0005
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 2.05 sec   HDFS Read: 6436 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 50 msec
OK
0
Time taken: 33.905 seconds, Fetched: 1 row(s)
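As a final sanity check (again assuming the paths from the error message above), the staging directory should now list root as its owner:

hadoop fs -ls /tmp/hadoop-yarn/staging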





