Flink 1.10 three-node cluster: yarn-session mode

Flink on YARN has two main execution modes: a session mode with centrally managed resources (the Flink yarn-session mode) and a per-job mode (a single Flink job on YARN).

Session mode (centralized resource management): a Flink cluster is initialized once in YARN and a fixed amount of resources is reserved for it. Every Flink job submitted afterwards runs inside this yarn-session, so no matter how many jobs are submitted, they all share the resources requested from YARN at startup. The Flink cluster stays resident in YARN until it is stopped manually.
Per-job mode [recommended]: each job submission creates a new Flink cluster in YARN. Jobs are independent of each other, do not interfere, and are easy to manage; when a job finishes, the cluster created for it disappears.

A more detailed introduction to the two modes can be found in this article.

This article covers setting up and testing the first mode, a yarn session started in client (attached) mode.

A Hadoop cluster needs to be set up in advance; see this article for reference.

Cluster environment used in this article:

Machine   IP
flink1    172.21.89.128
flink2    172.21.89.129
flink3    172.21.89.130

Installing and configuring Flink

For the installation itself, refer to this article.

Note that a Flink on YARN deployment does not actually require any changes to the Flink configuration; it is enough to unpack the distribution onto each node (if you also want to start a standalone cluster, the configuration is still needed, so it is best to set it up anyway). If you want a highly available setup, however, you do need to adjust the corresponding parameters in FLINK_HOME/conf/flink-conf.yaml.

For Flink on YARN there is no need to configure masters and slaves under conf: earlier versions specified the number of TaskManagers with the -n option when starting the session, and in Flink 1.10 TaskManagers are allocated on demand. After Flink on YARN has started, in detached mode you will see only a single YarnSessionClusterEntrypoint process across all nodes; in client mode there are two processes, a YarnSessionClusterEntrypoint and a FlinkYarnSessionCli.

Running in yarn-session mode takes two steps: first start the yarn session, which reserves the resources on YARN, then submit jobs with flink run.

Starting the yarn session

A yarn session can be started in two ways: client (attached) mode, and detached mode (selected with the -d option).
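For example (a minimal sketch, using the same memory settings as later in this article):

# client (attached) mode: the CLI stays in the foreground
./bin/yarn-session.sh -jm 1024m -tm 4096m
# detached mode: the CLI submits the session and exits
./bin/yarn-session.sh -jm 1024m -tm 4096m -d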

By default you can simply run bin/yarn-session.sh. The default cluster specification is:

{masterMemoryMB=1024, taskManagerMemoryMB=1024, numberTaskManagers=1, slotsPerTaskManager=1}

If you need a custom configuration, run ./bin/yarn-session.sh -help to see the available options:

Usage:
   Optional
     -at,--applicationType <arg>     Set a custom application type for the application on YARN
     -D <property=value>             use value for given property
     -d,--detached                   If present, runs the job in detached mode
     -h,--help                       Help for the Yarn session CLI.
     -id,--applicationId <arg>       Attach to running YARN session
     -j,--jar <arg>                  Path to Flink jar file
     -jm,--jobManagerMemory <arg>    Memory for JobManager Container with optional unit (default: MB)
     -m,--jobmanager <arg>           Address of the JobManager (master) to which to connect. Use this flag to connect to a different JobManager than the one specified in the configuration.
     -nl,--nodeLabel <arg>           Specify YARN node label for the YARN application
     -nm,--name <arg>                Set a custom name for the application on YARN
     -q,--query                      Display available YARN resources (memory, cores)
     -qu,--queue <arg>               Specify YARN queue.
     -s,--slots <arg>                Number of slots per TaskManager
     -t,--ship <arg>                 Ship files in the specified directory (t for transfer)
     -tm,--taskManagerMemory <arg>   Memory per TaskManager Container with optional unit (default: MB)
     -yd,--yarndetached              If present, runs the job in detached mode (deprecated; use non-YARN specific option instead)
     -z,--zookeeperNamespace <arg>   Namespace to create the Zookeeper sub-paths for high availability mode
Overview of the yarn-session options:
  -d: run in detached mode;
  -id: attach to a running YARN session by application ID;
  -j: path to the Flink jar file;
  -jm: memory for the JobManager container (default unit: MB);
  -nl: YARN node label for the application;
  -nm: custom name for the application on YARN;
  -q: display the available YARN resources (memory, cores);
  -qu: YARN queue to submit to;
  -s: number of slots per TaskManager;
  -tm: memory per TaskManager container (default unit: MB);
  -z: namespace used to create the ZooKeeper sub-paths in high-availability mode.

Older tutorials also mention -n (number of TaskManagers) and -st (start Flink in streaming mode); these options no longer appear in the Flink 1.10 help output above, since TaskManagers are now allocated on demand rather than fixed at startup.

Start a yarn session in client mode (1 GB for the JobManager and 4 GB for each TaskManager):

[root@flink1 flink-1.10.0]# ./bin/yarn-session.sh -jm 1024m -tm 4096m
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/exceptions/YarnException
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
        at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
        at java.lang.Class.getMethod0(Class.java:3018)
        at java.lang.Class.getMethod(Class.java:1784)
        at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
        at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.exceptions.YarnException
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 7 more

This fails with a NoClassDefFoundError for Hadoop's YarnException: the Flink 1.10 distribution no longer bundles the Hadoop libraries, so the Hadoop classpath has to be made available by adding an extra line to /etc/profile:

export HADOOP_CLASSPATH=`hadoop classpath`
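To make the setting take effect, append it to /etc/profile and reload the file (a minimal sketch, assuming the hadoop command is already on the PATH):

# append the entry to /etc/profile and reload it in the current shell
echo 'export HADOOP_CLASSPATH=`hadoop classpath`' >> /etc/profile
source /etc/profile
# verify that the Hadoop jars are now resolved
echo $HADOOP_CLASSPATH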

Run it again:

[root@flink1 flink-1.10.0]# ./bin/yarn-session.sh -jm 1024m -tm 4096m
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/flink-1.10.0/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-05-19 11:19:50,999 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, flink1
2020-05-19 11:19:51,001 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2020-05-19 11:19:51,001 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2020-05-19 11:19:51,001 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.process.size, 1568m
2020-05-19 11:19:51,001 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 2
2020-05-19 11:19:51,001 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 2
2020-05-19 11:19:51,007 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-05-19 11:19:51,008 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: io.tmp.dirs, /root/flink/tmp
2020-05-19 11:19:52,670 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-05-19 11:19:52,893 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to root (auth:SIMPLE)
2020-05-19 11:19:53,013 INFO  org.apache.flink.runtime.security.modules.JaasModule          - Jaas file will be created as /root/flink/tmp/jaas-114232065768094753.conf.
2020-05-19 11:19:53,049 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/opt/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-19 11:19:53,338 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-19 11:19:54,034 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, slotsPerTaskManager=2}
2020-05-19 11:19:54,447 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
2020-05-19 11:19:58,746 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1589855352499_0001
2020-05-19 11:19:59,203 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1589855352499_0001
2020-05-19 11:19:59,203 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2020-05-19 11:19:59,209 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2020-05-19 11:20:00,332 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Error while running the Flink session.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:380)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:548)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1589855352499_0001 failed 1 times (global limit =2; local limit is =1) due to Error launching appattempt_1589855352499_0001_000001. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1589887058435 found 1589858999379
Note: System times on machines may be out of sync. Check system time and time zones.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:250)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1589855352499_0001
        at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
        at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:373)
        ... 7 more

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:380)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:548)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1589855352499_0001 failed 1 times (global limit =2; local limit is =1) due to Error launching appattempt_1589855352499_0001_000001. Got exception: org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.
This token is expired. current time is 1589887058435 found 1589858999379
Note: System times on machines may be out of sync. Check system time and time zones.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
        at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:123)
        at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:250)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1589855352499_0001
        at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
        at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:373)
        ... 7 more
2020-05-19 11:20:00,340 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cancelling deployment from Deployment Failure Hook
2020-05-19 11:20:00,340 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-19 11:20:00,340 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Killing YARN application
2020-05-19 11:20:00,366 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Killed application application_1589855352499_0001
2020-05-19 11:20:00,467 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deleting files in file:/root/.flink/application_1589855352499_0001.

The log shows the cause: "This token is expired ... System times on machines may be out of sync", i.e. the machines' clocks are not synchronized. Checking the time on the three machines confirms that flink1's clock differs too much from the other two, so the next step is to synchronize time across the cluster.

Use flink1 as the time server and have flink2 and flink3 synchronize against it periodically; see method two in the referenced article.
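A minimal sketch of the periodic sync on flink2 and flink3 (this assumes ntpdate is installed and that flink1 serves NTP; the referenced article's exact method may differ):

# one-off sync against flink1
ntpdate flink1
# then sync every 10 minutes via cron
(crontab -l 2>/dev/null; echo "*/10 * * * * /usr/sbin/ntpdate flink1") | crontab -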

Run again:

[root@flink1 flink-1.10.0]# ./bin/yarn-session.sh -jm 1024m -tm 4096m
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/flink-1.10.0/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-05-19 11:52:58,214 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, flink1
2020-05-19 11:52:58,215 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2020-05-19 11:52:58,215 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2020-05-19 11:52:58,215 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.process.size, 1568m
2020-05-19 11:52:58,215 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 2
2020-05-19 11:52:58,215 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 2
2020-05-19 11:52:58,215 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-05-19 11:52:58,216 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: io.tmp.dirs, /root/flink/tmp
2020-05-19 11:52:58,940 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-05-19 11:52:59,059 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to root (auth:SIMPLE)
2020-05-19 11:52:59,086 INFO  org.apache.flink.runtime.security.modules.JaasModule          - Jaas file will be created as /root/flink/tmp/jaas-1783872971714302183.conf.
2020-05-19 11:52:59,106 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/opt/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-19 11:52:59,207 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-19 11:52:59,549 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, slotsPerTaskManager=2}
2020-05-19 11:52:59,734 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
2020-05-19 11:53:01,249 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1589855352499_0002
2020-05-19 11:53:01,578 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1589855352499_0002
2020-05-19 11:53:01,578 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2020-05-19 11:53:01,597 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2020-05-19 11:53:02,214 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Error while running the Flink session.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:380)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:548)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1589855352499_0002 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1589855352499_0002_000001 exited with  exitCode: -1000
Failing this attempt.Diagnostics: File file:/root/.flink/application_1589855352499_0002/lib/slf4j-log4j12-1.7.15.jar does not exist
java.io.FileNotFoundException: File file:/root/.flink/application_1589855352499_0002/lib/slf4j-log4j12-1.7.15.jar does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:635)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

For more detailed output, check the application tracking page: http://flink1:8088/cluster/app/application_1589855352499_0002 Then click on links to logs of each attempt.
. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1589855352499_0002
        at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
        at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:373)
        ... 7 more

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:380)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:548)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1589855352499_0002 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1589855352499_0002_000001 exited with  exitCode: -1000
Failing this attempt.Diagnostics: File file:/root/.flink/application_1589855352499_0002/lib/slf4j-log4j12-1.7.15.jar does not exist
java.io.FileNotFoundException: File file:/root/.flink/application_1589855352499_0002/lib/slf4j-log4j12-1.7.15.jar does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:635)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

For more detailed output, check the application tracking page: http://flink1:8088/cluster/app/application_1589855352499_0002 Then click on links to logs of each attempt.
. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1589855352499_0002
        at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
        at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:373)
        ... 7 more
2020-05-19 11:53:02,222 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cancelling deployment from Deployment Failure Hook
2020-05-19 11:53:02,222 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-19 11:53:02,223 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Killing YARN application
2020-05-19 11:53:02,232 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Killed application application_1589855352499_0002
2020-05-19 11:53:02,333 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deleting files in file:/root/.flink/application_1589855352499_0002.

The error says a file cannot be found. The file is indeed missing from that directory, and the reason was not obvious at first. Posts online suggest that HADOOP_HOME is not exported, but my profile already contains it. After several failed attempts I simply ran export HADOOP_HOME=xxx directly in the shell; the run below no longer reports the missing file, but fails with another, more familiar error.
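For reference, the warning "The file system scheme is 'file'" in the logs above points at the likely root cause: if the client cannot see the Hadoop configuration (HADOOP_CONF_DIR or YARN_CONF_DIR), Flink stages its jars on the local file system instead of HDFS, and the containers launched on the other nodes cannot find them. A minimal sketch of the relevant exports (the Hadoop path is the one appearing in this cluster's logs; adjust it to your own installation):

export HADOOP_HOME=/opt/hadoop/hadoop-2.8.5
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_CLASSPATH=`hadoop classpath`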

[root@flink1 flink-1.10.0]# ./bin/yarn-session.sh -jm 1024m -tm 4096m
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/flink-1.10.0/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-05-19 12:06:42,684 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, flink1
2020-05-19 12:06:42,691 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2020-05-19 12:06:42,691 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2020-05-19 12:06:42,691 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.process.size, 1568m
2020-05-19 12:06:42,691 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 2
2020-05-19 12:06:42,691 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 2
2020-05-19 12:06:42,692 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-05-19 12:06:42,692 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: io.tmp.dirs, /root/flink/tmp
2020-05-19 12:06:43,431 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-05-19 12:06:43,552 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to root (auth:SIMPLE)
2020-05-19 12:06:43,576 INFO  org.apache.flink.runtime.security.modules.JaasModule          - Jaas file will be created as /root/flink/tmp/jaas-3503208428267147958.conf.
2020-05-19 12:06:43,591 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/opt/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-19 12:06:43,697 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-19 12:06:44,062 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, slotsPerTaskManager=2}
2020-05-19 12:06:44,236 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
2020-05-19 12:06:44,640 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1589855352499_0003
2020-05-19 12:06:44,907 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1589855352499_0003
2020-05-19 12:06:44,907 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2020-05-19 12:06:44,910 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2020-05-19 12:06:54,455 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Error while running the Flink session.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:380)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:548)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1589855352499_0003 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1589855352499_0003_000001 exited with  exitCode: -103
Failing this attempt.Diagnostics: Container [pid=6955,containerID=container_e12_1589855352499_0003_01_000001] is running beyond virtual memory limits. Current usage: 206.6 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_e12_1589855352499_0003_01_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 6971 6955 6955 6955 (java) 237 112 2189639680 52598 /opt/jdk1.8.0_191/bin/java -Xms424m -Xmx424m -Dlog.file=/opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint
        |- 6955 6954 6955 6955 (bash) 0 0 115908608 304 /bin/bash -c /opt/jdk1.8.0_191/bin/java -Xms424m -Xmx424m -Dlog.file=/opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint 1> /opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.out 2> /opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.err

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
For more detailed output, check the application tracking page: http://flink1:8088/cluster/app/application_1589855352499_0003 Then click on links to logs of each attempt.
. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1589855352499_0003
        at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
        at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:373)
        ... 7 more

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:380)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:548)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$5(FlinkYarnSessionCli.java:785)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:785)
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1589855352499_0003 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1589855352499_0003_000001 exited with  exitCode: -103
Failing this attempt.Diagnostics: Container [pid=6955,containerID=container_e12_1589855352499_0003_01_000001] is running beyond virtual memory limits. Current usage: 206.6 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_e12_1589855352499_0003_01_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 6971 6955 6955 6955 (java) 237 112 2189639680 52598 /opt/jdk1.8.0_191/bin/java -Xms424m -Xmx424m -Dlog.file=/opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint
        |- 6955 6954 6955 6955 (bash) 0 0 115908608 304 /bin/bash -c /opt/jdk1.8.0_191/bin/java -Xms424m -Xmx424m -Dlog.file=/opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.log -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint 1> /opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.out 2> /opt/hadoop/hadoop-2.8.5/logs/userlogs/application_1589855352499_0003/container_e12_1589855352499_0003_01_000001/jobmanager.err

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
For more detailed output, check the application tracking page: http://flink1:8088/cluster/app/application_1589855352499_0003 Then click on links to logs of each attempt.
. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1589855352499_0003
        at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:999)
        at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:488)
        at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:373)
        ... 7 more
2020-05-19 12:06:54,507 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cancelling deployment from Deployment Failure Hook
2020-05-19 12:06:54,507 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
2020-05-19 12:06:54,508 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Killing YARN application
2020-05-19 12:06:54,516 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Killed application application_1589855352499_0003
2020-05-19 12:06:54,618 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deleting files in file:/root/.flink/application_1589855352499_0003.

The error message shows that the container was killed for exceeding its virtual memory limit. Two fixes for this circulate online; I used this one, and the other is described here (both are sketched below).
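The two fixes usually suggested are to disable YARN's virtual-memory check or to raise the virtual-to-physical memory ratio. A sketch of the corresponding yarn-site.xml properties (set them on every NodeManager and restart YARN; whether this matches the linked articles exactly is an assumption):

<!-- option 1: turn off the virtual-memory check -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>

<!-- option 2: raise the virtual/physical memory ratio (default 2.1) -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>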

After restarting the Hadoop cluster, run again:

[root@flink1 flink-1.10.0]# ./bin/yarn-session.sh -jm 1024m -tm 4096m
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/flink-1.10.0/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-05-19 12:49:40,088 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, flink1
2020-05-19 12:49:40,089 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2020-05-19 12:49:40,090 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2020-05-19 12:49:40,090 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.memory.process.size, 1568m
2020-05-19 12:49:40,090 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 2
2020-05-19 12:49:40,090 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 2
2020-05-19 12:49:40,090 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-05-19 12:49:40,090 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: io.tmp.dirs, /root/flink/tmp
2020-05-19 12:49:40,945 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-05-19 12:49:41,062 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to root (auth:SIMPLE)
2020-05-19 12:49:41,120 INFO  org.apache.flink.runtime.security.modules.JaasModule          - Jaas file will be created as /root/flink/tmp/jaas-3327652110637179066.conf.
2020-05-19 12:49:41,126 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/opt/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-19 12:49:41,651 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, slotsPerTaskManager=2}
2020-05-19 12:49:52,846 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1589863760097_0001
2020-05-19 12:49:53,165 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1589863760097_0001
2020-05-19 12:49:53,165 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2020-05-19 12:49:53,224 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2020-05-19 12:50:11,392 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - YARN application has been deployed successfully.
2020-05-19 12:50:11,393 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Found Web Interface flink1:46042 of application 'application_1589863760097_0001'.
JobManager Web Interface: http://flink1:46042

This time the session finally starts, and the JobManager happens to be on flink1 (the JobManager runs in the same container as the ApplicationMaster, so which machine it ends up on is not deterministic). Running jps on flink1 shows a YarnSessionClusterEntrypoint process, which is this JobManager. In client mode there is additionally a FlinkYarnSessionCli process, running on whichever machine the yarn session was started from.

Open the address that was printed, http://flink1:46042, to see the Flink web UI.

Next, run the official WordCount example:

[root@flink1 flink-1.10.0]# ./bin/flink run -m yarn-cluster -p 4 -yjm 1024m -ytm 4096m ./examples/batch/WordCount.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/flink-1.10.0/lib/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-05-19 12:57:56,772 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /tmp/.yarn-properties-root.
2020-05-19 12:57:56,772 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /tmp/.yarn-properties-root.
2020-05-19 12:57:57,059 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/opt/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2020-05-19 12:57:57,059 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/opt/flink-1.10.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
2020-05-19 12:57:57,704 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-05-19 12:57:57,817 WARN  org.apache.flink.yarn.YarnClusterDescriptor                   - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2020-05-19 12:57:57,861 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=4096, slotsPerTaskManager=2}
2020-05-19 12:58:21,961 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Submitting application master application_1589863760097_0002
2020-05-19 12:58:22,048 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1589863760097_0002
2020-05-19 12:58:22,048 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Waiting for the cluster to be allocated
2020-05-19 12:58:22,051 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deploying cluster, current state ACCEPTED
2020-05-19 12:58:45,306 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - YARN application has been deployed successfully.
2020-05-19 12:58:45,481 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Found Web Interface flink1:39042 of application 'application_1589863760097_0002'.
Job has been submitted with JobID 599ef03e3215ab0ecf60385c3320f2e6
Program execution finished
Job with JobID 599ef03e3215ab0ecf60385c3320f2e6 has finished.
Job Runtime: 64178 ms
Accumulator Results:
- 44a96fda0ca5ea318d6da45ed1d5e503 (java.util.ArrayList) [170 elements]


(after,1)
(and,12)

Looking at the YARN web UI, a new application named "Flink per-job cluster" has appeared: the -m yarn-cluster flag used above makes flink run start a dedicated per-job cluster rather than submit the job into the session that is already running.

Opening the address printed above, flink1:39042, shows this job's Flink web UI:

The job ran successfully. Once it finishes, this web UI is shut down as well.
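To submit the job into the yarn session started earlier instead of a per-job cluster, the flink run invocation would look roughly like this (a sketch; the application ID is the one printed when the session was created):

# uses the session recorded in /tmp/.yarn-properties-root by yarn-session.sh
./bin/flink run ./examples/batch/WordCount.jar
# or target the session explicitly by its YARN application ID
./bin/flink run -yid application_1589863760097_0001 ./examples/batch/WordCount.jar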

Since this mode essentially runs Flink as a YARN application, YARN commands can be used to manage it. For example, to shut down the yarn session opened earlier:

[root@flink1 flink-1.10.0]# yarn application -kill application_1589863760097_0001
Killing application application_1589863760097_0001
20/05/20 01:48:12 INFO impl.YarnClientImpl: Killed application application_1589863760097_0001

Looking at the YARN web UI again, application_1589863760097_0001 is now in the KILLED state.

Once the YarnSessionClusterEntrypoint process is gone, the FlinkYarnSessionCli process loses its connection to it and shuts down automatically.
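Besides yarn application -kill, the session can also be stopped from the Flink side by attaching to it and issuing a stop command (a sketch; not verified on this exact cluster):

# attach to the running session and send it "stop"
echo "stop" | ./bin/yarn-session.sh -id application_1589863760097_0001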

For a broader introduction to Flink on YARN, including the detached mode, this article is well worth reading.

The most authoritative reference remains the official documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/yarn_setup.html
