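Following up on the earlier upgrade operation, I dug into the underlying source code, mainly the Hadoop RPC code.
When we type hdfs dfsadmin -rollingUpgrade query on the command line, the class that actually runs is DFSAdmin. The mapping can be seen in the hdfs script under hadoop-2.7.2/bin.
In DFSAdmin, the upgrade logic lives in the inner class RollingUpgradeCommand, mainly in its run method. The code is as follows: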
static int run(DistributedFileSystem dfs, String[] argv, int idx) throws IOException {
  final RollingUpgradeAction action = RollingUpgradeAction.fromString(
      argv.length >= 2? argv[1]: "");
  if (action == null) {
    throw new IllegalArgumentException("Failed to covert \"" + argv[1]
        +"\" to " + RollingUpgradeAction.class.getSimpleName());
  }
  System.out.println(action + " rolling upgrade ...");
  final RollingUpgradeInfo info = dfs.rollingUpgrade(action);
  switch(action){
  case QUERY:
    break;
  case PREPARE:
    Preconditions.checkState(info.isStarted());
    break;
  case FINALIZE:
    Preconditions.checkState(info == null || info.isFinalized());
    break;
  }
  printMessage(info, System.out);
  return 0;
}
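To make the null check at the top of run easier to follow, here is a minimal sketch of how an action string could be mapped to an enum constant. UpgradeActionSketch and ParseActionDemo are hypothetical names, and the case-insensitive lookup is an assumption; this is not the actual RollingUpgradeAction implementation.

```java
import java.util.Locale;

// Hypothetical stand-in for RollingUpgradeAction; the real enum may differ.
enum UpgradeActionSketch {
    QUERY, PREPARE, FINALIZE;

    /** Returns the matching constant, or null when the text matches none. */
    static UpgradeActionSketch fromString(String s) {
        String upper = s.toUpperCase(Locale.ROOT);
        for (UpgradeActionSketch a : values()) {
            if (a.name().equals(upper)) {
                return a;
            }
        }
        return null;
    }
}

class ParseActionDemo {
    /** Mirrors the first few lines of run(): parse, reject unknown input, report. */
    static String describe(String[] argv) {
        String raw = argv.length >= 2 ? argv[1] : "";
        UpgradeActionSketch action = UpgradeActionSketch.fromString(raw);
        if (action == null) {
            throw new IllegalArgumentException("Cannot convert \"" + raw + "\" to "
                + UpgradeActionSketch.class.getSimpleName());
        }
        return action + " rolling upgrade ...";
    }

    public static void main(String[] args) {
        System.out.println(describe(new String[] {"-rollingUpgrade", "query"}));
    }
}
```

An unknown action never reaches the RPC layer in this sketch: it is rejected on the client before any call is made.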
Inside this method, rollingUpgrade(action) is called on dfs, an instance of DistributedFileSystem. The code is as follows:
public RollingUpgradeInfo rollingUpgrade(RollingUpgradeAction action)
    throws IOException {
  return dfs.rollingUpgrade(action);
}
Here dfs is an instance of DFSClient, so the call goes to the rollingUpgrade method of DFSClient, which is defined as follows:
RollingUpgradeInfo rollingUpgrade(RollingUpgradeAction action) throws IOException {
  TraceScope scope = Trace.startSpan("rollingUpgrade", traceSampler);
  try {
    return namenode.rollingUpgrade(action);
  } finally {
    scope.close();
  }
}
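The key call is namenode.rollingUpgrade(). The namenode field is declared as final ClientProtocol. ClientProtocol is the interface through which a client talks to the NameNode, and it contains every method used for that interaction.
The namenode field is initialized in the DFSClient constructor. The code is as follows: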
public DFSClient(URI nameNodeUri, ClientProtocol rpcNamenode,
    Configuration conf, FileSystem.Statistics stats)
    throws IOException {
  //...
  NameNodeProxies.ProxyAndInfo<ClientProtocol> proxyInfo = null;
  AtomicBoolean nnFallbackToSimpleAuth = new AtomicBoolean(false);
  if (numResponseToDrop > 0) {
    // This case is used for testing.
    LOG.warn(DFSConfigKeys.DFS_CLIENT_TEST_DROP_NAMENODE_RESPONSE_NUM_KEY
        + " is set to " + numResponseToDrop
        + ", this hacked client will proactively drop responses");
    proxyInfo = NameNodeProxies.createProxyWithLossyRetryHandler(conf,
        nameNodeUri, ClientProtocol.class, numResponseToDrop,
        nnFallbackToSimpleAuth);
  }
  if (proxyInfo != null) {
    this.dtService = proxyInfo.getDelegationTokenService();
    this.namenode = proxyInfo.getProxy();
  } else if (rpcNamenode != null) {
    // This case is used for testing.
    Preconditions.checkArgument(nameNodeUri == null);
    this.namenode = rpcNamenode;
    dtService = null;
  } else {
    Preconditions.checkArgument(nameNodeUri != null,
        "null URI");
    proxyInfo = NameNodeProxies.createProxy(conf, nameNodeUri,
        ClientProtocol.class, nnFallbackToSimpleAuth);
    this.dtService = proxyInfo.getDelegationTokenService();
    this.namenode = proxyInfo.getProxy();
  }
  //...
}
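In the normal (non-test) path, the constructor calls NameNodeProxies.createProxy() to obtain proxyInfo, an instance of ProxyAndInfo, and then assigns the result of getProxy() to namenode. The createProxy method is as follows: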
public static <T> ProxyAndInfo<T> createProxy(Configuration conf,
    URI nameNodeUri, Class<T> xface, AtomicBoolean fallbackToSimpleAuth)
    throws IOException {
  AbstractNNFailoverProxyProvider<T> failoverProxyProvider =
      createFailoverProxyProvider(conf, nameNodeUri, xface, true,
          fallbackToSimpleAuth);
  if (failoverProxyProvider == null) {
    // Non-HA case
    return createNonHAProxy(conf, NameNode.getAddress(nameNodeUri), xface,
        UserGroupInformation.getCurrentUser(), true, fallbackToSimpleAuth);
  } else {
    // HA case
    Conf config = new Conf(conf);
    T proxy = (T) RetryProxy.create(xface, failoverProxyProvider,
        RetryPolicies.failoverOnNetworkException(
            RetryPolicies.TRY_ONCE_THEN_FAIL, config.maxFailoverAttempts,
            config.maxRetryAttempts, config.failoverSleepBaseMillis,
            config.failoverSleepMaxMillis));
    Text dtService;
    if (failoverProxyProvider.useLogicalURI()) {
      dtService = HAUtil.buildTokenServiceForLogicalUri(nameNodeUri,
          HdfsConstants.HDFS_URI_SCHEME);
    } else {
      dtService = SecurityUtil.buildTokenService(
          NameNode.getAddress(nameNodeUri));
    }
    return new ProxyAndInfo<T>(proxy, dtService,
        NameNode.getAddress(nameNodeUri));
  }
}
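First, createFailoverProxyProvider is called to obtain an AbstractNNFailoverProxyProvider object. What it ultimately returns is an instance of ConfiguredFailoverProxyProvider, which is a subclass of AbstractNNFailoverProxyProvider. The method reads the value of dfs.client.failover.proxy.provider.mycluster from hdfs-site.xml (mycluster being the nameservice name of this cluster) and gets the ConfiguredFailoverProxyProvider class from it. The figure below shows that the failoverProxyProvider field is indeed an instance of ConfiguredFailoverProxyProvider.
If failoverProxyProvider is not null, the cluster is treated as HA-enabled; otherwise it is treated as non-HA.
For an HA cluster, RetryProxy.create() is executed. The code is as follows: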
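The HA check described above boils down to a configuration lookup keyed by the host part of the NameNode URI. The following is a minimal sketch of that idea, assuming a plain Map in place of Hadoop's Configuration; HaLookupSketch and its methods are hypothetical names and this is not the Hadoop implementation.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class HaLookupSketch {
    // Same key prefix as the hdfs-site.xml property mentioned in the text.
    static final String PREFIX = "dfs.client.failover.proxy.provider";

    /** Returns the configured provider class name, or null when no key exists for this host. */
    static String providerFor(Map<String, String> conf, URI nameNodeUri) {
        return conf.get(PREFIX + "." + nameNodeUri.getHost());
    }

    /** A configured provider means HA; a missing one means a plain single-NameNode proxy. */
    static boolean isHa(Map<String, String> conf, URI nameNodeUri) {
        return providerFor(conf, nameNodeUri) != null;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PREFIX + ".mycluster",
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        // Logical URI that has a provider configured: HA path.
        System.out.println(isHa(conf, URI.create("hdfs://mycluster")));
        // Physical host:port with no provider configured: non-HA path.
        System.out.println(isHa(conf, URI.create("hdfs://nn1.example.com:8020")));
    }
}
```

The address nn1.example.com:8020 is made up for the example. The point is only that the same client code serves both kinds of cluster, and the configuration decides which branch is taken.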
public static <T> Object create(Class<T> iface,
    FailoverProxyProvider<T> proxyProvider, RetryPolicy retryPolicy) {
  return Proxy.newProxyInstance(
      proxyProvider.getInterface().getClassLoader(),
      new Class<?>[] { iface },
      new RetryInvocationHandler<T>(proxyProvider, retryPolicy)
      );
}
This uses Java dynamic proxies. The invocation handler behind the proxy is RetryInvocationHandler, whose constructor is as follows:
protected RetryInvocationHandler(FailoverProxyProvider<T> proxyProvider,
    RetryPolicy defaultPolicy,
    Map<String, RetryPolicy> methodNameToPolicyMap) {
  this.proxyProvider = proxyProvider;
  this.defaultPolicy = defaultPolicy;
  this.methodNameToPolicyMap = methodNameToPolicyMap;
  this.currentProxy = proxyProvider.getProxy();
}
Its invoke() method in turn calls invokeMethod, shown below:
protected Object invokeMethod(Method method, Object[] args) throws Throwable {
  try {
    if (!method.isAccessible()) {
      method.setAccessible(true);
    }
    return method.invoke(currentProxy.proxy, args);
  } catch (InvocationTargetException e) {
    throw e.getCause();
  }
}
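The pattern used by RetryProxy.create and invokeMethod can be reproduced with nothing but java.lang.reflect.Proxy. The sketch below is a simplified illustration under stated assumptions: Greeter, FlakyGreeter and RetryHandlerSketch are hypothetical names, and a fixed attempt count stands in for Hadoop's RetryPolicy.

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical stand-in for an RPC interface such as ClientProtocol.
interface Greeter {
    String greet(String name) throws IOException;
}

// Fails a configurable number of times before succeeding.
class FlakyGreeter implements Greeter {
    private int failuresLeft;

    FlakyGreeter(int failures) {
        this.failuresLeft = failures;
    }

    @Override
    public String greet(String name) throws IOException {
        if (failuresLeft-- > 0) {
            throw new IOException("simulated network error");
        }
        return "hello " + name;
    }
}

class RetryHandlerSketch implements InvocationHandler {
    private final Object target;
    private final int maxAttempts;

    RetryHandlerSketch(Object target, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.target = target;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                // Forward to the real object, as invokeMethod does with currentProxy.proxy.
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                Throwable cause = e.getCause();
                if (!(cause instanceof IOException)) {
                    throw cause; // only I/O failures are retried in this sketch
                }
                last = (IOException) cause;
            }
        }
        throw last;
    }

    /** Wraps target in a dynamic proxy that implements iface and retries on IOException. */
    static <T> T create(Class<T> iface, T target, int maxAttempts) {
        Object p = Proxy.newProxyInstance(iface.getClassLoader(),
            new Class<?>[] { iface }, new RetryHandlerSketch(target, maxAttempts));
        return iface.cast(p);
    }

    public static void main(String[] args) throws Exception {
        Greeter g = create(Greeter.class, new FlakyGreeter(2), 3);
        System.out.println(g.greet("namenode"));
    }
}
```

The caller only ever sees the Greeter interface. Retrying happens inside the handler, which is why DFSClient can call namenode.rollingUpgrade(action) without any retry logic of its own.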
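Here we can see that the object being proxied is currentProxy.proxy. currentProxy was initialized in the constructor via proxyProvider.getProxy(), and proxyProvider is exactly the ConfiguredFailoverProxyProvider instance. Its getProxy() method is as follows: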
public synchronized ProxyInfo<T> getProxy() {
  AddressRpcProxyPair<T> current = proxies.get(currentProxyIndex);
  if (current.namenode == null) {
    try {
      current.namenode = NameNodeProxies.createNonHAProxy(conf,
          current.address, xface, ugi, false, fallbackToSimpleAuth).getProxy();
    } catch (IOException e) {
      LOG.error("Failed to create RPC proxy to NameNode", e);
      throw new RuntimeException(e);
    }
  }
  return new ProxyInfo<T>(current.namenode, current.address.toString());
}
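The getProxy() method above creates the per-NameNode proxy lazily and caches it. A minimal sketch of that pattern follows. LazyFailoverProviderSketch is a hypothetical name, a Function stands in for createNonHAProxy, and the round-robin performFailover is an assumption about how the current index advances rather than something shown in the excerpt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class LazyFailoverProviderSketch<T> {
    // One entry per configured NameNode address; the proxy is created on first use.
    private static final class Entry<P> {
        final String address;
        P proxy;

        Entry(String address) {
            this.address = address;
        }
    }

    private final List<Entry<T>> proxies = new ArrayList<>();
    private final Function<String, T> factory;
    private int currentIndex = 0;

    LazyFailoverProviderSketch(List<String> addresses, Function<String, T> factory) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("need at least one address");
        }
        for (String a : addresses) {
            proxies.add(new Entry<>(a));
        }
        this.factory = factory;
    }

    /** Returns the proxy for the current address, creating and caching it if needed. */
    synchronized T getProxy() {
        Entry<T> current = proxies.get(currentIndex);
        if (current.proxy == null) {
            current.proxy = factory.apply(current.address);
        }
        return current.proxy;
    }

    /** Assumed behavior: move to the next address, wrapping around at the end. */
    synchronized void performFailover() {
        currentIndex = (currentIndex + 1) % proxies.size();
    }

    public static void main(String[] args) {
        List<String> created = new ArrayList<>();
        LazyFailoverProviderSketch<String> provider = new LazyFailoverProviderSketch<>(
            Arrays.asList("nn1:8020", "nn2:8020"),
            addr -> {
                created.add(addr);
                return "proxy->" + addr;
            });

        System.out.println(provider.getProxy()); // created on first use
        provider.getProxy();                     // served from the cache
        provider.performFailover();
        System.out.println(provider.getProxy()); // proxy for the second address
        System.out.println(created);             // each address was created once
    }
}
```

Nothing is connected until a call actually needs it, so a standby NameNode that is never reached costs the client nothing.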
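Stepping through in the debugger shows that execution enters createNonHAProxy here, and that method ultimately yields an instance of ClientNamenodeProtocolTranslatorPB. Tracing back layer by layer, the namenode field in DFSClient is a proxy whose real target is a ClientNamenodeProtocolTranslatorPB instance.
The figure below shows the proxy value obtained while debugging the call from DFSClient into NameNodeProxies.createProxy(): the proxy's handler is RetryInvocationHandler, and the real object is ClientNamenodeProtocolTranslatorPB.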
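So the operation we run on the command line ends up in the corresponding method of ClientNamenodeProtocolTranslatorPB. This class implements ClientProtocol and holds a ClientNamenodeProtocolPB object internally. That object is itself obtained through dynamic proxying (via RPC.getProtocolProxy), and the proxy wraps a ProtobufRpcEngine.Invoker, so every call on the ClientNamenodeProtocolPB interface is handled by the invoke() method of the Invoker.
Putting it together: DFSClient calls a method on ClientNamenodeProtocolTranslatorPB through the namenode field, and that method calls the matching method on the ClientNamenodeProtocolPB object. Because that object is a proxy, the call is handled by the invoke() method of its handler, which calls Client.call() underneath to send the request to the Server.
This completes the client side of sending the request.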
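The two-layer chain described above (translator, then a proxied wire interface, then an invoker that hands the request to the RPC client) can be imitated in a few lines. In this sketch every name is hypothetical: UpgradeApi stands in for ClientProtocol, UpgradeWire for ClientNamenodeProtocolPB, TranslatorSketch for ClientNamenodeProtocolTranslatorPB, and FakeRpcClient for Client. Strings replace protobuf messages and nothing is sent over a network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Client-facing interface (role of ClientProtocol).
interface UpgradeApi {
    String rollingUpgrade(String action);
}

// Wire-level interface (role of ClientNamenodeProtocolPB); it has no hand-written implementation.
interface UpgradeWire {
    String rollingUpgrade(String requestMessage);
}

// Role of Client: here it only records what would be sent to the server.
class FakeRpcClient {
    final List<String> sent = new ArrayList<>();

    String call(String methodName, Object request) {
        sent.add(methodName + ":" + request);
        return "response-to-" + request;
    }
}

// Role of the translator: converts API arguments into a request message and delegates.
class TranslatorSketch implements UpgradeApi {
    private final UpgradeWire rpcProxy;

    TranslatorSketch(UpgradeWire rpcProxy) {
        this.rpcProxy = rpcProxy;
    }

    @Override
    public String rollingUpgrade(String action) {
        String request = "RollingUpgradeRequest{" + action + "}";
        return rpcProxy.rollingUpgrade(request);
    }
}

class RpcChainDemo {
    /** Builds a dynamic proxy whose handler plays the role of the Invoker. */
    static UpgradeWire wireProxy(FakeRpcClient client) {
        InvocationHandler invoker =
            (proxy, method, args) -> client.call(method.getName(), args[0]);
        return (UpgradeWire) Proxy.newProxyInstance(
            UpgradeWire.class.getClassLoader(),
            new Class<?>[] { UpgradeWire.class }, invoker);
    }

    public static void main(String[] args) {
        FakeRpcClient client = new FakeRpcClient();
        UpgradeApi namenode = new TranslatorSketch(wireProxy(client));

        System.out.println(namenode.rollingUpgrade("QUERY"));
        System.out.println(client.sent);
    }
}
```

The translator is ordinary hand-written code, while the wire interface exists only as a proxy. That split is what lets the API types and the wire format change independently.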