Hadoop Source Code Reading - HDFS - Day 1

This post walks through the declaration of the Hdfs class and its constructor, covering the initialization process, parameter configuration, and exception handling, to help developers understand how an HDFS file system is correctly created and used.


The Hdfs class declaration and constructor

@InterfaceAudience.Private
@InterfaceStability.Evolving
public class Hdfs extends AbstractFileSystem {

  DFSClient dfs;
  final CryptoCodec factory;
  private boolean verifyChecksum = true;

  static {
    HdfsConfiguration.init();
  }

  /**
   * This constructor has the signature needed by
   * {@link AbstractFileSystem#createFileSystem(URI, Configuration)}
   * 
   * @param theUri which must be that of Hdfs
   * @param conf configuration
   * @throws IOException
   */
  Hdfs(final URI theUri, final Configuration conf) throws IOException, URISyntaxException {
    super(theUri, HdfsConstants.HDFS_URI_SCHEME, true, NameNode.DEFAULT_PORT);

    if (!theUri.getScheme().equalsIgnoreCase(HdfsConstants.HDFS_URI_SCHEME)) {
      throw new IllegalArgumentException("Passed URI's scheme is not for Hdfs");
    }
    String host = theUri.getHost();
    if (host == null) {
      throw new IOException("Incomplete HDFS URI, no host: " + theUri);
    }

    this.dfs = new DFSClient(theUri, conf, getStatistics());
    this.factory = CryptoCodec.getInstance(conf);
  }
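
The Javadoc notes that this constructor has exactly the signature required by AbstractFileSystem#createFileSystem(URI, Configuration). Application code normally never calls it directly: it goes through FileContext, which resolves the "hdfs" scheme to this implementation and invokes the constructor reflectively. A minimal sketch of that usage path (the NameNode address is a made-up example, and a Hadoop client classpath plus a reachable cluster are assumed):

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;

public class CreateHdfsViaFileContext {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // FileContext maps the hdfs:// scheme to the Hdfs AbstractFileSystem and,
    // through AbstractFileSystem.createFileSystem(URI, Configuration), ends up
    // calling the Hdfs(theUri, conf) constructor shown above.
    FileContext fc = FileContext.getFileContext(
        URI.create("hdfs://namenode.example.com:8020"), conf);
    System.out.println("working directory: " + fc.getWorkingDirectory());
  }
}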

Hdfs extends the abstract class AbstractFileSystem. It contains a static block that runs HdfsConfiguration's init method, so let's look at that method first:

/**
   * This method is here so that when invoked, HdfsConfiguration is class-loaded if
   * it hasn't already been previously loaded.  Upon loading the class, the static 
   * initializer block above will be executed to add the deprecated keys and to add
   * the default resources.   It is safe for this method to be called multiple times 
   * as the static initializer block will only get invoked once.
   * 
   * This replaces the previously, dangerous practice of other classes calling
   * Configuration.addDefaultResource("hdfs-default.xml") directly without loading 
   * HdfsConfiguration class first, thereby skipping the key deprecation
   */
  public static void init() {
  }

init is an empty method; its only purpose is to trigger class loading. When HdfsConfiguration is loaded, its static initializer block runs, registering the deprecated keys and adding the default resources. Because init itself does nothing, it is safe to call it any number of times: the static initializer block executes only once. This replaces the previously dangerous practice of other classes calling Configuration.addDefaultResource("hdfs-default.xml") directly without loading HdfsConfiguration first, thereby skipping the key deprecation. So the static block should contain exactly that call; let's check whether that is the case:

static {
    addDeprecatedKeys();

    // adds the default resources
    Configuration.addDefaultResource("hdfs-default.xml");
    Configuration.addDefaultResource("hdfs-site.xml");

  }
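
To see the idiom in isolation, here is a small self-contained sketch (the class name and messages are invented for illustration, not taken from Hadoop): calling the empty init() any number of times forces the class to load, and the static initializer runs exactly once.

public class LazyDefaults {

  // Runs exactly once, when the JVM loads this class.
  static {
    System.out.println("static initializer: registering default resources");
  }

  /** Intentionally empty: its only job is to trigger class loading. */
  public static void init() {
  }

  public static void main(String[] args) {
    LazyDefaults.init(); // first call loads the class; the static block prints once
    LazyDefaults.init(); // later calls are harmless no-ops
    LazyDefaults.init();
  }
}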

 

With that covered, let's return to the parent class of Hdfs, AbstractFileSystem:

/**
 * This class provides an interface for implementors of a Hadoop file system
 * (analogous to the VFS of Unix). Applications do not access this class;
 * instead they access files across all file systems using {@link FileContext}.
 * 
 * Pathnames passed to AbstractFileSystem can be fully qualified URI that
 * matches the "this" file system (ie same scheme and authority) 
 * or a Slash-relative name that is assumed to be relative
 * to the root of the "this" file system .
 */
@InterfaceAudience.Public
@InterfaceStability.Evolving /*Evolving for a release,to be changed to Stable */
public abstract class AbstractFileSystem {

AbstractFileSystem provides the interface that implementers of a Hadoop file system write against (analogous to Unix's VFS). Applications do not use this class directly; they access files across all file systems through FileContext. Pathnames passed to AbstractFileSystem may be either fully qualified URIs that match "this" file system (same scheme and authority) or slash-relative names, which are interpreted relative to the root of "this" file system.
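
A short illustration of those two path forms (the cluster address and file name are assumptions for the example, and a reachable cluster is needed to actually run it): once a default file system is set on the FileContext, a fully qualified hdfs:// URI and a slash-relative name refer to files in the same way.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

public class PathForms {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileContext fc = FileContext.getFileContext(
        URI.create("hdfs://namenode.example.com:8020"), conf);

    // Fully qualified URI: scheme and authority match "this" file system.
    FileStatus a = fc.getFileStatus(
        new Path("hdfs://namenode.example.com:8020/user/demo/data.txt"));

    // Slash-relative name: resolved against the root of the same file system.
    FileStatus b = fc.getFileStatus(new Path("/user/demo/data.txt"));

    System.out.println(a.getPath() + " and " + b.getPath() + " point to the same file");
  }
}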

Now let's look at its constructor:

/**
   * Constructor to be called by subclasses.
   * 
   * @param uri for this file system.
   * @param supportedScheme the scheme supported by the implementor
   * @param authorityNeeded if true then theURI must have authority, if false
   *          then the URI must have null authority.
   *
   * @throws URISyntaxException <code>uri</code> has syntax error
   */
  public AbstractFileSystem(final URI uri, final String supportedScheme,
      final boolean authorityNeeded, final int defaultPort)
      throws URISyntaxException {
    myUri = getUri(uri, supportedScheme, authorityNeeded, defaultPort);
    statistics = getStatistics(uri); 
  }

/**
   * Get the URI for the file system based on the given URI. The path, query
   * part of the given URI is stripped out and default file system port is used
   * to form the URI.
   * 
   * @param uri FileSystem URI.
   * @param authorityNeeded if true authority cannot be null in the URI. If
   *          false authority must be null.
   * @param defaultPort default port to use if port is not specified in the URI.
   * 
   * @return URI of the file system
   * 
   * @throws URISyntaxException <code>uri</code> has syntax error
   */
  private URI getUri(URI uri, String supportedScheme,
      boolean authorityNeeded, int defaultPort) throws URISyntaxException {
    checkScheme(uri, supportedScheme);
    // A file system implementation that requires authority must always
    // specify default port
    if (defaultPort < 0 && authorityNeeded) {
      throw new HadoopIllegalArgumentException(
          "FileSystem implementation error -  default port " + defaultPort
              + " is not valid");
    }
    String authority = uri.getAuthority();
    if (authority == null) {
       if (authorityNeeded) {
         throw new HadoopIllegalArgumentException("Uri without authority: " + uri);
       } else {
         return new URI(supportedScheme + ":///");
       }   
    }
    // authority is non null  - AuthorityNeeded may be true or false.
    int port = uri.getPort();
    port = (port == -1 ? defaultPort : port);
    if (port == -1) { // no port supplied and default port is not specified
      return new URI(supportedScheme, authority, "/", null);
    }
    return new URI(supportedScheme + "://" + uri.getHost() + ":" + port);
  }

getUri builds the file system's URI from the URI that was passed in: it checks the scheme, strips the path and query components, and falls back to the default port when the URI does not specify one.
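
To make the branches concrete, here is a standalone sketch that mirrors the getUri logic in simplified form (the class name, sample hosts, and ports are made up for illustration; it is not the Hadoop code itself) and prints the normalized file system URI for a couple of inputs:

import java.net.URI;
import java.net.URISyntaxException;

public class GetUriSketch {

  // Simplified copy of the normalization rules: keep scheme and authority,
  // drop the path/query parts, and fall back to the default port when the
  // URI does not carry one.
  static URI fsUri(URI uri, String scheme, boolean authorityNeeded, int defaultPort)
      throws URISyntaxException {
    String authority = uri.getAuthority();
    if (authority == null) {
      if (authorityNeeded) {
        throw new IllegalArgumentException("Uri without authority: " + uri);
      }
      return new URI(scheme + ":///");
    }
    int port = uri.getPort() == -1 ? defaultPort : uri.getPort();
    if (port == -1) { // no port supplied and no default port either
      return new URI(scheme, authority, "/", null);
    }
    return new URI(scheme + "://" + uri.getHost() + ":" + port);
  }

  public static void main(String[] args) throws Exception {
    // Path and query are stripped; the explicit port is kept.
    System.out.println(fsUri(URI.create("hdfs://nn:9000/user/demo?x=1"),
        "hdfs", true, 8020)); // prints hdfs://nn:9000
    // No port in the URI, so the default port (8020 here) is used.
    System.out.println(fsUri(URI.create("hdfs://nn/user/demo"),
        "hdfs", true, 8020)); // prints hdfs://nn:8020
  }
}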

Reposted from: https://www.cnblogs.com/nashiyue/p/5327302.html

"C:\Program Files\Java\jdk-17\bin\java.exe" -Didea.launcher.port=54222 "-Didea.launcher.bin.path=D:\hadoop\IntelliJ IDEA Community Edition 2018.3.6\bin" -Dfile.encoding=UTF-8 -classpath "D:\hadoop\Hadoop\target\classes;D:\hadoop\hadoop-3.1.4\share\hadoop\client\hadoop-client-api-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\client\hadoop-client-runtime-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\client\hadoop-client-minicluster-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\common\hadoop-kms-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\common\hadoop-nfs-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\common\hadoop-common-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\common\hadoop-common-3.1.4-tests.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-nfs-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-rbf-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-3.1.4-tests.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-client-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-httpfs-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-rbf-3.1.4-tests.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-client-3.1.4-tests.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-native-client-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\hdfs\hadoop-hdfs-native-client-3.1.4-tests.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\mapreduce\hadoop-mapreduce-examples-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\mapreduce\hadoop-mapreduce-client-hs-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\mapreduce\hadoop-mapreduce-client-app-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\mapreduce\hadoop-mapreduce-client-core-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\mapreduce\hadoop-mapreduce-client-common-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\mapreduce\hadoop-mapreduce-client-shuffle-3.1.4.jar;D:\hadoop\hadoop-3.1.4\share\hadoop\mapreduce\hadoop-mapreduce-c
05-07
<think>好的,我现在需要帮助用户解决Hadoop 3.1.4的启动命令配置问题,特别是关于JDK 17和IDEA Community Edition 2018.3.6的classpath路径分析。首先,我得回顾一下用户提供的引用内容和之前的问题。 用户提到了几个关键点:Hadoop运行配置问题、启动命令的构成,以及classpath路径分析。引用中提到了ClassNotFoundException,JAVA_HOME未设置的问题,还有修改hadoop-env.sh和yarn.application.classpath的配置。此外,用户还提到了完全分布式环境的搭建步骤,比如移动和解压安装包,配置workers文件和环境变量。 首先,ClassNotFoundException通常和类路径有关,可能是驱动类没找到,或者打包的jar文件没有正确包含所需的类。用户使用JDK 17,需要注意Hadoop版本是否兼容,因为Hadoop 3.1.4可能对JDK 17的支持有限,官方文档可能推荐的是JDK 8或11。这时候可能需要确认兼容性,或者调整JDK版本。 接下来是启动命令的配置。Hadoop启动通常通过start-dfs.sh和start-yarn.sh,这些脚本会读取Hadoop的配置文件,比如hadoop-env.sh,里面需要正确设置JAVA_HOME。用户提到使用JDK 17,所以必须确保hadoop-env.sh中的JAVA_HOME指向正确的JDK路径,例如export JAVA_HOME=/usr/lib/jvm/jdk-17。如果路径错误,会导致JAVA_HOME找不到,从而引发各种错误。 关于classpath,用户需要运行hadoop classpath命令获取路径,并将结果添加到yarn.application.classpath的配置中。这确保YARN能够找到所有的依赖库。同时,如果用户使用IDEA进行开发,需要确保项目中的classpath配置正确,包括Hadoop的库和JDK 17的版本。IDEA 2018.3.6可能较旧,可能需要检查是否支持JDK 17,或者是否需要升级插件。 另外,用户的环境变量配置在/etc/profile.d/my_env.sh中,需要确认Hadoop和JDK的路径是否正确设置,并且通过source命令使其生效。同时,workers文件中的节点名称是否正确,比如master、slave1、slave2是否与实际的主机名对应,SSH免密登录是否配置,这些都是启动分布式环境的关键步骤。 可能用户遇到的错误包括:ClassNotFoundException(类路径问题)、JAVA_HOME未设置(环境变量或配置文件错误)、或者节点通信问题(workers配置或SSH问题)。需要一步步检查这些方面,确认每个配置项的正确性。 最后,用户可能需要了解如何在IDEA中配置Hadoop项目,添加正确的依赖库,并设置运行参数,确保打包后的jar包含所有需要的类,并且Hadoop的classpath正确指向这些依赖。同时,考虑到JDK 17的新特性,比如模块化系统,可能需要调整Hadoop的启动脚本或依赖项,避免不兼容的问题。</think>### Hadoop 3.1.4 启动命令配置详解(JDK-17 + IDEA 2018.3.6) #### 一、核心配置项分析 1. **JAVA_HOME配置** 修改 `$HADOOP_HOME/etc/hadoop/hadoop-env.sh` 文件,明确指定JDK-17路径: ```bash export JAVA_HOME=/usr/lib/jvm/jdk-17 ``` 若未配置,会触发 `Error: JAVA_HOME is not set` 错误[^2]。 2. **Classpath动态获取** 运行 `hadoop classpath` 获取当前Hadoop类路径,输出结果需添加到 `yarn-site.xml` 的 `yarn.application.classpath` 配置项中: ```xml <property> <name>yarn.application.classpath</name> <value>/opt/module/hadoop-3.1.4/etc/hadoop:/opt/module/hadoop-3.1.4/share/hadoop/common/lib/*:...</value> </property> ``` 该路径包含Hadoop核心库和第三方依赖[^3]。 3. **IDEA项目配置要点** - 确保 `pom.xml` 中Hadoop依赖版本与集群版本一致(3.1.4) - 设置编译兼容性为JDK-17: ```xml <properties> <maven.compiler.source>17</maven.compiler.source> <maven.compiler.target>17</maven.compiler.target> </properties> ``` #### 二、典型启动命令结构 完整作业提交命令示例: ```bash hadoop jar your-job.jar com.example.MainClass \ -D mapreduce.job.queuename=default \ /input/path /output/path ``` - `hadoop jar`:Hadoop执行入口 - `-D` 参数:动态配置MapReduce参数 - 末尾参数:程序自定义输入输出路径 #### 三、JDK-17兼容性问题 需注意: 1. Hadoop 3.1.4 官方支持最高JDK11,使用JDK-17需重新编译Hadoop源码: ```bash mvn clean package -Pdist,native -DskipTests -Dmaven.javadoc.skip=true -Djava.version=17 ``` 2. 若出现 `UnsupportedClassVersionError`,需检查IDEA的编译输出目标版本设置 #### 四、调试技巧 1. **环境验证命令**: ```bash # 检查JAVA_HOME echo $JAVA_HOME # 验证Hadoop类路径 hadoop classpath ``` 2. **日志定位**: - NameNode日志:`$HADOOP_HOME/logs/hadoop-*-namenode-*.log` - YARN日志:通过 `yarn logs -applicationId <app_id>` 获取
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值