Hadoop_Configuration

This article takes a close look at the role of the Configuration class in Hadoop and the meaning of the configuration parameters it holds. Configuration stores all of Hadoop's configuration parameters; other classes obtain their configuration through it in order to operate on Hadoop.

    When you first start working with the Hadoop API, Configuration is the very first class you encounter, yet it is not obvious why so many other methods require a Configuration object as a parameter. Having now read through the class, the key point is that it records every configuration setting the user has applied to Hadoop. Below is the XML I obtained by invoking the main method of the Configuration class (a sketch of the equivalent call follows).
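For reference, here is a minimal sketch of what such a dump amounts to, using the public `Configuration.writeXml` API (the class name `DumpConfig` is mine, purely for illustration):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

// Illustrative stand-in for Configuration.main: load the default
// resources (core-default.xml, core-site.xml, ...) and print every
// property as XML -- the kind of output shown below.
public class DumpConfig {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.writeXml(System.out);
    }
}
```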

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
<property><name>fs.s3n.impl</name><value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value></property>
<property><name>mapred.task.cache.levels</name><value>2</value></property>
<property><name>hadoop.tmp.dir</name><value>/tmp/hadoop-${user.name}</value></property>
<property><name>hadoop.native.lib</name><value>true</value></property>
<property><name>map.sort.class</name><value>org.apache.hadoop.util.QuickSort</value></property>
<property><name>ipc.client.idlethreshold</name><value>4000</value></property>
<property><name>mapred.system.dir</name><value>${hadoop.tmp.dir}/mapred/system</value></property>
<property><name>mapred.job.tracker.persist.jobstatus.hours</name><value>0</value></property>
<property><name>io.skip.checksum.errors</name><value>false</value></property>
<property><name>fs.default.name</name><value>hdfs://localhost:8020/</value></property>
<property><name>mapred.cluster.reduce.memory.mb</name><value>-1</value></property>
<property><name>mapred.child.tmp</name><value>./tmp</value></property>
<property><name>fs.har.impl.disable.cache</name><value>true</value></property>
<property><name>mapred.skip.reduce.max.skip.groups</name><value>0</value></property>
<property><name>mapred.heartbeats.in.second</name><value>100</value></property>
<property><name>mapred.tasktracker.dns.nameserver</name><value>default</value></property>
<property><name>io.sort.factor</name><value>10</value></property>
<property><name>mapred.task.timeout</name><value>600000</value></property>
<property><name>mapred.max.tracker.failures</name><value>4</value></property>
<property><name>hadoop.rpc.socket.factory.class.default</name><value>org.apache.hadoop.net.StandardSocketFactory</value></property>
<property><name>mapred.job.tracker.jobhistory.lru.cache.size</name><value>5</value></property>
<property><name>fs.hdfs.impl</name><value>org.apache.hadoop.hdfs.DistributedFileSystem</value></property>
<property><name>eclipse.plug-in.jobtracker.port</name><value>8021</value></property>
<property><name>mapred.skip.map.auto.incr.proc.count</name><value>true</value></property>
<property><name>mapreduce.job.complete.cancel.delegation.tokens</name><value>true</value></property>
<property><name>io.mapfile.bloom.size</name><value>1048576</value></property>
<property><name>mapreduce.reduce.shuffle.connect.timeout</name><value>180000</value></property>
<property><name>mapred.jobtracker.blacklist.fault-timeout-window</name><value>180</value></property>
<property><name>tasktracker.http.threads</name><value>40</value></property>
<property><name>mapred.job.shuffle.merge.percent</name><value>0.66</value></property>
<property><name>fs.ftp.impl</name><value>org.apache.hadoop.fs.ftp.FTPFileSystem</value></property>
<property><name>mapred.output.compress</name><value>false</value></property>
<property><name>io.bytes.per.checksum</name><value>512</value></property>
<property><name>mapred.healthChecker.script.timeout</name><value>600000</value></property>
<property><name>topology.node.switch.mapping.impl</name><value>org.apache.hadoop.net.ScriptBasedMapping</value></property>
<property><name>mapred.reduce.slowstart.completed.maps</name><value>0.05</value></property>
<property><name>mapred.reduce.max.attempts</name><value>4</value></property>
<property><name>fs.ramfs.impl</name><value>org.apache.hadoop.fs.InMemoryFileSystem</value></property>
<property><name>mapred.skip.map.max.skip.records</name><value>0</value></property>
<property><name>mapred.cluster.map.memory.mb</name><value>-1</value></property>
<property><name>hadoop.security.group.mapping</name><value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value></property>
<property><name>mapred.job.tracker.persist.jobstatus.dir</name><value>/jobtracker/jobsInfo</value></property>
<property><name>mapred.jar</name><value>/home/hadoop/hadoop/.metadata/.plugins/org.apache.hadoop.eclipse/firstApp_CatURL.java-7377097405794968176.jar</value></property>
<property><name>fs.s3.buffer.dir</name><value>${hadoop.tmp.dir}/s3</value></property>
<property><name>job.end.retry.attempts</name><value>0</value></property>
<property><name>fs.file.impl</name><value>org.apache.hadoop.fs.LocalFileSystem</value></property>
<property><name>mapred.local.dir.minspacestart</name><value>0</value></property>
<property><name>mapred.output.compression.type</name><value>RECORD</value></property>
<property><name>topology.script.number.args</name><value>100</value></property>
<property><name>io.mapfile.bloom.error.rate</name><value>0.005</value></property>
<property><name>mapred.cluster.max.reduce.memory.mb</name><value>-1</value></property>
<property><name>mapred.max.tracker.blacklists</name><value>4</value></property>
<property><name>mapred.task.profile.maps</name><value>0-2</value></property>
<property><name>mapred.userlog.retain.hours</name><value>24</value></property>
<property><name>mapred.job.tracker.persist.jobstatus.active</name><value>false</value></property>
<property><name>hadoop.security.authorization</name><value>false</value></property>
<property><name>local.cache.size</name><value>10737418240</value></property>
<property><name>eclipse.plug-in.jobtracker.host</name><value>localhost</value></property>
<property><name>mapred.min.split.size</name><value>0</value></property>
<property><name>mapred.map.tasks</name><value>2</value></property>
<property><name>mapred.child.java.opts</name><value>-Xmx200m</value></property>
<property><name>mapreduce.job.counters.limit</name><value>120</value></property>
<property><name>eclipse.plug-in.user.name</name><value>hadoop</value></property>
<property><name>mapred.job.queue.name</name><value>default</value></property>
<property><name>mapred.job.tracker.retiredjobs.cache.size</name><value>1000</value></property>
<property><name>ipc.server.listen.queue.size</name><value>128</value></property>
<property><name>mapred.inmem.merge.threshold</name><value>1000</value></property>
<property><name>job.end.retry.interval</name><value>30000</value></property>
<property><name>mapred.skip.attempts.to.start.skipping</name><value>2</value></property>
<property><name>fs.checkpoint.dir</name><value>${hadoop.tmp.dir}/dfs/namesecondary</value></property>
<property><name>mapred.reduce.tasks</name><value>1</value></property>
<property><name>mapred.merge.recordsBeforeProgress</name><value>10000</value></property>
<property><name>mapred.userlog.limit.kb</name><value>0</value></property>
<property><name>mapred.job.reduce.memory.mb</name><value>-1</value></property>
<property><name>webinterface.private.actions</name><value>false</value></property>
<property><name>io.sort.spill.percent</name><value>0.80</value></property>
<property><name>mapred.job.shuffle.input.buffer.percent</name><value>0.70</value></property>
<property><name>eclipse.plug-in.socks.proxy.port</name><value>1080</value></property>
<property><name>mapred.map.tasks.speculative.execution</name><value>true</value></property>
<property><name>hadoop.util.hash.type</name><value>murmur</value></property>
<property><name>mapred.map.max.attempts</name><value>4</value></property>
<property><name>mapreduce.job.acl-view-job</name><value> </value></property>
<property><name>mapred.job.tracker.handler.count</name><value>10</value></property>
<property><name>mapreduce.reduce.shuffle.read.timeout</name><value>180000</value></property>
<property><name>mapred.tasktracker.expiry.interval</name><value>600000</value></property>
<property><name>mapred.jobtracker.maxtasks.per.job</name><value>-1</value></property>
<property><name>mapred.jobtracker.job.history.block.size</name><value>3145728</value></property>
<property><name>keep.failed.task.files</name><value>false</value></property>
<property><name>ipc.client.tcpnodelay</name><value>false</value></property>
<property><name>mapred.task.profile.reduces</name><value>0-2</value></property>
<property><name>mapred.output.compression.codec</name><value>org.apache.hadoop.io.compress.DefaultCodec</value></property>
<property><name>io.map.index.skip</name><value>0</value></property>
<property><name>ipc.server.tcpnodelay</name><value>false</value></property>
<property><name>mapred.jobtracker.blacklist.fault-bucket-width</name><value>15</value></property>
<property><name>mapred.used.genericoptionsparser</name><value>true</value></property>
<property><name>mapred.job.map.memory.mb</name><value>-1</value></property>
<property><name>hadoop.logfile.size</name><value>10000000</value></property>
<property><name>mapred.reduce.tasks.speculative.execution</name><value>true</value></property>
<property><name>mapreduce.tasktracker.outofband.heartbeat</name><value>false</value></property>
<property><name>mapreduce.reduce.input.limit</name><value>-1</value></property>
<property><name>hadoop.security.authentication</name><value>simple</value></property>
<property><name>eclipse.plug-in.socks.proxy.host</name><value>host</value></property>
<property><name>fs.checkpoint.period</name><value>3600</value></property>
<property><name>mapred.job.reuse.jvm.num.tasks</name><value>1</value></property>
<property><name>mapred.jobtracker.completeuserjobs.maximum</name><value>100</value></property>
<property><name>mapred.task.tracker.task-controller</name><value>org.apache.hadoop.mapred.DefaultTaskController</value></property>
<property><name>fs.s3.maxRetries</name><value>4</value></property>
<property><name>mapred.cluster.max.map.memory.mb</name><value>-1</value></property>
<property><name>mapreduce.job.acl-modify-job</name><value> </value></property>
<property><name>mapred.local.dir</name><value>${hadoop.tmp.dir}/mapred/local</value></property>
<property><name>fs.hftp.impl</name><value>org.apache.hadoop.hdfs.HftpFileSystem</value></property>
<property><name>fs.trash.interval</name><value>0</value></property>
<property><name>fs.s3.sleepTimeSeconds</name><value>10</value></property>
<property><name>mapred.submit.replication</name><value>10</value></property>
<property><name>fs.har.impl</name><value>org.apache.hadoop.fs.HarFileSystem</value></property>
<property><name>mapred.map.output.compression.codec</name><value>org.apache.hadoop.io.compress.DefaultCodec</value></property>
<property><name>mapred.tasktracker.dns.interface</name><value>default</value></property>
<property><name>eclipse.plug-in.namenode.port</name><value>8020</value></property>
<property><name>mapred.job.tracker</name><value>localhost:8021</value></property>
<property><name>io.seqfile.sorter.recordlimit</name><value>1000000</value></property>
<property><name>mapred.line.input.format.linespermap</name><value>1</value></property>
<property><name>mapred.jobtracker.taskScheduler</name><value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value></property>
<property><name>eclipse.plug-in.masters.colocate</name><value>yes</value></property>
<property><name>mapred.local.dir.minspacekill</name><value>0</value></property>
<property><name>io.sort.record.percent</name><value>0.05</value></property>
<property><name>fs.kfs.impl</name><value>org.apache.hadoop.fs.kfs.KosmosFileSystem</value></property>
<property><name>mapred.temp.dir</name><value>${hadoop.tmp.dir}/mapred/temp</value></property>
<property><name>mapred.tasktracker.reduce.tasks.maximum</name><value>2</value></property>
<property><name>eclipse.plug-in.socks.proxy.enable</name><value>no</value></property>
<property><name>fs.checkpoint.edits.dir</name><value>${fs.checkpoint.dir}</value></property>
<property><name>mapred.tasktracker.tasks.sleeptime-before-sigkill</name><value>5000</value></property>
<property><name>eclipse.plug-in.location.name</name><value>localhost</value></property>
<property><name>mapred.job.reduce.input.buffer.percent</name><value>0.0</value></property>
<property><name>mapred.tasktracker.indexcache.mb</name><value>10</value></property>
<property><name>mapreduce.job.split.metainfo.maxsize</name><value>10000000</value></property>
<property><name>mapred.skip.reduce.auto.incr.proc.count</name><value>true</value></property>
<property><name>hadoop.logfile.count</name><value>10</value></property>
<property><name>io.seqfile.compress.blocksize</name><value>1000000</value></property>
<property><name>fs.s3.block.size</name><value>67108864</value></property>
<property><name>mapred.tasktracker.taskmemorymanager.monitoring-interval</name><value>5000</value></property>
<property><name>mapreduce.jobtracker.staging.root.dir</name><value>${hadoop.tmp.dir}/mapred/staging</value></property>
<property><name>mapred.acls.enabled</name><value>false</value></property>
<property><name>mapred.queue.default.state</name><value>RUNNING</value></property>
<property><name>mapred.queue.names</name><value>default</value></property>
<property><name>fs.hsftp.impl</name><value>org.apache.hadoop.hdfs.HsftpFileSystem</value></property>
<property><name>mapred.task.tracker.http.address</name><value>0.0.0.0:50060</value></property>
<property><name>mapred.reduce.parallel.copies</name><value>5</value></property>
<property><name>io.seqfile.lazydecompress</name><value>true</value></property>
<property><name>eclipse.plug-in.namenode.host</name><value>localhost</value></property>
<property><name>io.sort.mb</name><value>100</value></property>
<property><name>ipc.client.connection.maxidletime</name><value>10000</value></property>
<property><name>mapred.task.tracker.report.address</name><value>127.0.0.1:0</value></property>
<property><name>mapred.compress.map.output</name><value>false</value></property>
<property><name>hadoop.security.uid.cache.secs</name><value>14400</value></property>
<property><name>mapred.healthChecker.interval</name><value>60000</value></property>
<property><name>ipc.client.kill.max</name><value>10</value></property>
<property><name>ipc.client.connect.max.retries</name><value>10</value></property>
<property><name>fs.s3.impl</name><value>org.apache.hadoop.fs.s3.S3FileSystem</value></property>
<property><name>mapred.user.jobconf.limit</name><value>5242880</value></property>
<property><name>hadoop.socks.server</name><value>host:1080</value></property>
<property><name>mapred.job.tracker.http.address</name><value>0.0.0.0:50030</value></property>
<property><name>io.file.buffer.size</name><value>4096</value></property>
<property><name>mapred.jobtracker.restart.recover</name><value>false</value></property>
<property><name>io.serializations</name><value>org.apache.hadoop.io.serializer.WritableSerialization</value></property>
<property><name>mapred.task.profile</name><value>false</value></property>
<property><name>mapred.reduce.copy.backoff</name><value>300</value></property>
<property><name>jobclient.output.filter</name><value>FAILED</value></property>
<property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
<property><name>io.compression.codecs</name><value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value></property>
<property><name>fs.checkpoint.size</name><value>67108864</value></property>
</configuration>
```
 

    The root element is configuration, which corresponds to the Configuration class itself; the properties inside it are the parameters configured for Hadoop, many of them Hadoop defaults. Any class that wants to operate on your Hadoop setup therefore depends on these parameters. To add further resources or values to the configuration, call Configuration's set method; to read a parameter back, call the matching get method.
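A small sketch of that set/get round trip (the keys come from the dump above; the overridden value is just an example):

```java
import org.apache.hadoop.conf.Configuration;

public class SetGetDemo {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Override a parameter programmatically; this shadows whatever
        // core-site.xml / mapred-site.xml loaded for the same key.
        conf.set("mapred.reduce.tasks", "4");

        // Read parameters back; the second argument is the default
        // returned when the key is absent.
        String fsName  = conf.get("fs.default.name", "file:///");
        int   reducers = conf.getInt("mapred.reduce.tasks", 1);

        // get() expands variables such as ${user.name}, so this prints
        // e.g. /tmp/hadoop-hadoop rather than the raw template.
        String tmpDir = conf.get("hadoop.tmp.dir");

        System.out.println(fsName + " " + reducers + " " + tmpDir);
    }
}
```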

 

How should this hadoop-env.cmd (quoted in full below) be changed?

```batch
@echo off
@rem Licensed to the Apache Software Foundation (ASF) under one or more
@rem contributor license agreements. See the NOTICE file distributed with
@rem this work for additional information regarding copyright ownership.
@rem The ASF licenses this file to You under the Apache License, Version 2.0
@rem (the "License"); you may not use this file except in compliance with
@rem the License. You may obtain a copy of the License at
@rem
@rem     http://www.apache.org/licenses/LICENSE-2.0
@rem
@rem Unless required by applicable law or agreed to in writing, software
@rem distributed under the License is distributed on an "AS IS" BASIS,
@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@rem See the License for the specific language governing permissions and
@rem limitations under the License.

@rem Set Hadoop-specific environment variables here.

@rem The only required environment variable is JAVA_HOME. All others are
@rem optional. When running a distributed configuration it is best to
@rem set JAVA_HOME in this file, so that it is correctly defined on
@rem remote nodes.

@rem The java implementation to use. Required.
set JAVA_HOME=%JAVA_HOME%

@rem The jsvc implementation to use. Jsvc is required to run secure datanodes.
@rem set JSVC_HOME=%JSVC_HOME%

@rem set HADOOP_CONF_DIR=

@rem Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
if exist %HADOOP_HOME%\contrib\capacity-scheduler (
  if not defined HADOOP_CLASSPATH (
    set HADOOP_CLASSPATH=%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  ) else (
    set HADOOP_CLASSPATH=%HADOOP_CLASSPATH%;%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  )
)

@rem The maximum amount of heap to use, in MB. Default is 1000.
@rem set HADOOP_HEAPSIZE=
@rem set HADOOP_NAMENODE_INIT_HEAPSIZE=""

@rem Extra Java runtime options. Empty by default.
@rem set HADOOP_OPTS=%HADOOP_OPTS% -Djava.net.preferIPv4Stack=true

@rem Command specific options appended to HADOOP_OPTS when specified
if not defined HADOOP_SECURITY_LOGGER (
  set HADOOP_SECURITY_LOGGER=INFO,RFAS
)
if not defined HDFS_AUDIT_LOGGER (
  set HDFS_AUDIT_LOGGER=INFO,NullAppender
)
set HADOOP_NAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_NAMENODE_OPTS%
set HADOOP_DATANODE_OPTS=-Dhadoop.security.logger=ERROR,RFAS %HADOOP_DATANODE_OPTS%
set HADOOP_SECONDARYNAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_SECONDARYNAMENODE_OPTS%

@rem The following applies to multiple commands (fs, dfs, fsck, distcp etc)
set HADOOP_CLIENT_OPTS=-Xmx512m %HADOOP_CLIENT_OPTS%
@rem set HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData %HADOOP_JAVA_PLATFORM_OPTS%"

@rem On secure datanodes, user to run the datanode as after dropping privileges
set HADOOP_SECURE_DN_USER=%HADOOP_SECURE_DN_USER%

@rem Where log files are stored. %HADOOP_HOME%/logs by default.
@rem set HADOOP_LOG_DIR=%HADOOP_LOG_DIR%\%USERNAME%

@rem Where log files are stored in the secure data environment.
set HADOOP_SECURE_DN_LOG_DIR=%HADOOP_LOG_DIR%\%HADOOP_HDFS_USER%

@rem
@rem Router-based HDFS Federation specific parameters
@rem Specify the JVM options to be used when starting the RBF Routers.
@rem These options will be appended to the options specified as HADOOP_OPTS
@rem and therefore may override any similar flags set in HADOOP_OPTS
@rem
@rem set HADOOP_DFSROUTER_OPTS=""
@rem

@rem The directory where pid files are stored. /tmp by default.
@rem NOTE: this should be set to a directory that can only be written to by
@rem the user that will run the hadoop daemons. Otherwise there is the
@rem potential for a symlink attack.
set HADOOP_PID_DIR=%HADOOP_PID_DIR%
set HADOOP_SECURE_DN_PID_DIR=%HADOOP_PID_DIR%

@rem A string representing this instance of hadoop. %USERNAME% by default.
set HADOOP_IDENT_STRING=%USERNAME%
```
Key places to modify in Hadoop's `hadoop-env.cmd`, with example snippets:

### 1. Configure `JAVA_HOME` (required)
```batch
:: Point this at the actual JDK installation path
set JAVA_HOME=C:\Java\jdk1.8.0_301
```
- **Why**: Hadoop cannot start without a valid JDK path; the shipped `set JAVA_HOME=%JAVA_HOME%` merely inherits the system environment variable, which may be unset.

### 2. Configure the log directory
```batch
:: Store logs in a fixed directory
set HADOOP_LOG_DIR=C:\hadoop-logs
```
- **Why**: avoiding `%USERNAME%` in the path sidesteps per-user permission problems.

### 3. Tune heap sizes
```batch
:: NameNode initial heap in MB (4 GB is a common choice on larger clusters)
set HADOOP_NAMENODE_INIT_HEAPSIZE=4096
:: DataNode heap, set through the daemon-specific JVM options
set HADOOP_DATANODE_OPTS=-Xmx2048m %HADOOP_DATANODE_OPTS%
```

### 4. Enable JMX monitoring
```batch
:: Expose a JMX port on the NameNode JVM
set HADOOP_NAMENODE_OPTS=%HADOOP_NAMENODE_OPTS% -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010
```

### 5. Secure-mode settings (secure clusters only)
```batch
:: User the secure datanode runs as after dropping privileges
set HADOOP_SECURE_DN_USER=hdfsadmin
:: PID directory for the secure datanode
set HADOOP_SECURE_DN_PID_DIR=C:\hadoop\secure-pids
```

### Complete example
```batch
@echo off
:: ===== Core settings =====
set JAVA_HOME=C:\Java\jdk1.8.0_301
set HADOOP_LOG_DIR=C:\hadoop-logs
set HADOOP_PID_DIR=C:\hadoop-pids

:: ===== Memory settings =====
set HADOOP_NAMENODE_INIT_HEAPSIZE=4096
set HADOOP_DATANODE_OPTS=-Xmx2048m %HADOOP_DATANODE_OPTS%

:: ===== Extra JVM options =====
set HADOOP_OPTS=%HADOOP_OPTS% -Djava.net.preferIPv4Stack=true -XX:+UseG1GC

:: ===== NameNode JMX =====
set HADOOP_NAMENODE_OPTS=%HADOOP_NAMENODE_OPTS% -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010

:: ===== Secure mode =====
set HADOOP_SECURE_DN_USER=hdfsadmin
set HADOOP_SECURE_DN_PID_DIR=C:\hadoop\secure-pids

:: ... keep the remainder of the original file unchanged ...
```

### Notes
1. **Paths**:
   - Prefer paths without spaces (e.g. `C:\Java` rather than `C:\Program Files`); quote any path that does contain spaces
   - Create all directories in advance and give the service user write access
2. **Taking effect**:
   - Restart all Hadoop services after editing
   - `hadoop checknative` verifies the native libraries; `hadoop version` confirms the environment resolves
3. **Compatibility**:
   - JDK 1.8+ (older JDKs need different GC flags in `hadoop-env`)
   - Windows additionally requires [winutils](https://github.com/steveloughran/winutils)