Hadoop DataNode Code Analysis (1)

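This article introduces the StorageInfo class in HDFS, which holds the information common to the NameNode and DataNode directory structures: the layout version, the namespace ID, and the creation time. It then discusses how the Storage class manages local storage directories and their information, and explains how DataStorage implements this for the DataNode.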


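(1) StorageInfo is the class for common storage information, where "common" means the information shared by the file directory structures of both the NameNode and the DataNode. It holds layoutVersion (the version of the on-disk directory layout, which must match the layout version the software expects), namespaceID (the ID of the namespace the storage belongs to; it is assigned when the file system is formatted, and it lets a NameNode and its DataNodes check that they belong to the same file system), and cTime (the creation time of the file system state). This information is clearly non-volatile, so it is kept in the VERSION file of each storage directory.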
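The idea of keeping these three fields in a VERSION file can be illustrated with a minimal sketch. It is not Hadoop's actual code: the class name StorageInfoSketch and both method names are hypothetical, and storing the fields as java.util.Properties key=value pairs is an assumption made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical stand-in for StorageInfo: the fields shared by NameNode and DataNode. */
final class StorageInfoSketch {
    final int layoutVersion; // version of the on-disk directory layout
    final int namespaceID;   // ID of the namespace this storage belongs to
    final long cTime;        // creation time of the file system state

    StorageInfoSketch(int layoutVersion, int namespaceID, long cTime) {
        this.layoutVersion = layoutVersion;
        this.namespaceID = namespaceID;
        this.cTime = cTime;
    }

    /** Persist the fields as key=value pairs in <dir>/VERSION (assumed format). */
    void writeVersionFile(Path dir) {
        Properties props = new Properties();
        props.setProperty("layoutVersion", Integer.toString(layoutVersion));
        props.setProperty("namespaceID", Integer.toString(namespaceID));
        props.setProperty("cTime", Long.toString(cTime));
        try (OutputStream out = Files.newOutputStream(dir.resolve("VERSION"))) {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Load the fields back from <dir>/VERSION. */
    static StorageInfoSketch readVersionFile(Path dir) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(dir.resolve("VERSION"))) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new StorageInfoSketch(
                Integer.parseInt(props.getProperty("layoutVersion")),
                Integer.parseInt(props.getProperty("namespaceID")),
                Long.parseLong(props.getProperty("cTime")));
    }
}
```

Because the fields survive a write and read round trip, a restarted daemon could compare the stored layoutVersion and namespaceID with its own expectations before touching any data.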

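(2) According to the comments on the Storage class, local storage can span multiple directories. Each directory contains the same VERSION file, which holds the local storage information (type of the node, the storage layout version, the namespace id, the fs state creation time). When the server process starts (DataNode or NameNode, since Storage is common to both), it locks all of these directories. StorageDirectory is an inner class of Storage that represents one such directory.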
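A minimal sketch of the "lock every directory at startup" behaviour, under stated assumptions: StorageSketch, DirSketch and their methods are hypothetical names, and taking a java.nio FileLock on a file named in_use.lock inside each directory is an assumption of the example, not a transcription of Hadoop's implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical analogue of Storage: a set of directories locked together. */
final class StorageSketch {
    /** One storage directory (hypothetical analogue of Storage.StorageDirectory). */
    static final class DirSketch {
        final Path root;
        private FileChannel channel;
        private FileLock lock;

        DirSketch(Path root) { this.root = root; }

        /** Try to lock <root>/in_use.lock exclusively; false if it is already held. */
        boolean tryLock() {
            try {
                FileChannel ch = FileChannel.open(root.resolve("in_use.lock"),
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE);
                FileLock l;
                try {
                    l = ch.tryLock(); // null if another process holds the lock
                } catch (OverlappingFileLockException e) {
                    l = null;         // already held elsewhere in this JVM
                }
                if (l == null) {
                    ch.close();
                    return false;
                }
                channel = ch;
                lock = l;
                return true;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        /** Release the lock and close the channel, if held. */
        void unlock() {
            try {
                if (lock != null) { lock.release(); lock = null; }
                if (channel != null) { channel.close(); channel = null; }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    private final List<DirSketch> dirs = new ArrayList<>();

    void addDir(Path root) { dirs.add(new DirSketch(root)); }

    /** Lock every directory; roll back and return false if any is in use. */
    boolean lockAll() {
        List<DirSketch> locked = new ArrayList<>();
        for (DirSketch d : dirs) {
            if (!d.tryLock()) {
                for (DirSketch l : locked) l.unlock();
                return false;
            }
            locked.add(d);
        }
        return true;
    }

    void unlockAll() {
        for (DirSketch d : dirs) d.unlock();
    }
}
```

The rollback in lockAll is a design choice of the sketch: a process that cannot own all of its directories ends up owning none of them, so a second instance pointed at an overlapping directory set fails cleanly.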

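(3) DataStorage extends Storage for use by the DataNode. A Block consists of a BlockID, a generationStamp (the version number of the block) and numBytes (the size of the block). The DatanodeBlockInfo class exists to carry operations on a Block rather than only its abstract attributes; its FSVolume field represents the directory (Storage.StorageDirectory) in which the Block resides, and it also holds a File object pointing to the actual block file. detachBlock detaches the block held in DatanodeBlockInfo so that the File can be modified safely: because the file's hard-link count may be greater than one, the data has to be copied first. On Windows I could not work out how this function's code executes, and my simulation produced the wrong result, but on Linux it should behave correctly.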
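The copy-then-rename idea behind detaching a hard-linked file can be sketched as follows. This is an illustration under assumptions, not the DataNode's code: DetachSketch and detachFile are hypothetical names, and the sketch always copies instead of first checking the link count.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of detaching a file from its other hard links. */
final class DetachSketch {
    /**
     * Replace file with a private copy of its contents. Other hard links
     * keep pointing at the old data, so later writes to file no longer
     * show through them.
     */
    static void detachFile(Path file) {
        try {
            // Create the temporary file in the same directory so that the
            // final move stays on one file system.
            Path dir = file.toAbsolutePath().getParent();
            Path tmp = Files.createTempFile(dir, "detach", ".tmp");
            Files.copy(file, tmp, StandardCopyOption.REPLACE_EXISTING);
            // Moving the copy over the original name breaks the link.
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

As a usage example, if blk_1 and a snapshot entry are hard links to the same data, calling DetachSketch.detachFile on blk_1 and then appending to it leaves the snapshot entry unchanged. How replacing an existing file behaves differs between platforms, which may be related to the Windows difficulty described above.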
