Problem Log
Problem 1
JAVA_HOME cannot be found when starting the Flink cluster
root@hadoop01:~# /home/hadoop/app/flink-1.9.1/bin/start-cluster.sh
Starting HA cluster with 2 masters.
Please specify JAVA_HOME. Either in Flink config ./conf/flink-conf.yaml or as system-wide JAVA_HOME.
Please specify JAVA_HOME. Either in Flink config ./conf/flink-conf.yaml or as system-wide JAVA_HOME.
Please specify JAVA_HOME. Either in Flink config ./conf/flink-conf.yaml or as system-wide JAVA_HOME.
Please specify JAVA_HOME. Either in Flink config ./conf/flink-conf.yaml or as system-wide JAVA_HOME.
Please specify JAVA_HOME. Either in Flink config ./conf/flink-conf.yaml or as system-wide JAVA_HOME.
Solution to Problem 1
Add your JAVA_HOME path in Flink's conf/flink-conf.yaml:
vim /home/hadoop/app/flink-1.9.1/conf/flink-conf.yaml
Add the JAVA_HOME location (do not delete the space after the colon):
env.java.home: /home/hadoop/app/jdk1.8.0_411
(Screenshots: the file as shown in FinalShell and in VMware. After a correct edit, the env.java.home: key is highlighted in a different color.)
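The startup error above also offers a second option: setting a system-wide JAVA_HOME instead of editing flink-conf.yaml. A minimal sketch, assuming the same JDK path as above and that /etc/profile is edited as root:
echo 'export JAVA_HOME=/home/hadoop/app/jdk1.8.0_411' >> /etc/profile
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
source /etc/profile
java -version    # should report version 1.8.0_411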
Problem 2
This problem comes from the high-availability setup: when Flink connects to HDFS it cannot resolve the mycluster nameservice.
2024-12-03 13:38:12,221 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Could not start cluster entrypoint StandaloneSessionClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:182)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:501)
at org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:65)
Caused by: java.io.IOException: Could not create FileSystem for highly available storage (high-availability.storageDir)
at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:119)
at org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:92)
at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:120)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:292)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:257)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:202)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:163)
... 2 more
Caused by: java.io.IOException: Cannot instantiate file system for URI: hdfs://mycluster/flink/ha
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:187)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:443)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:359)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:116)
... 13 more
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: mycluster
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:417)
at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:132)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:359)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:293)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:157)
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:159)
... 17 more
Caused by: java.net.UnknownHostException: mycluster
... 23 more
Solution to Problem 2
1.
Copy Hadoop's hdfs-site.xml and core-site.xml into Flink's conf directory:
cp /home/hadoop/app/hadoop-2.9.1/etc/hadoop/hdfs-site.xml /home/hadoop/app/flink-1.9.1/conf
cp /home/hadoop/app/hadoop-2.9.1/etc/hadoop/core-site.xml /home/hadoop/app/flink-1.9.1/conf
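To confirm that the copied hdfs-site.xml really defines the nameservice from the error above (assuming it is named mycluster, as in hdfs://mycluster/flink/ha), a quick check:
grep -A 1 dfs.nameservices /home/hadoop/app/flink-1.9.1/conf/hdfs-site.xml
# the <value> that follows should be mycluster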
2.
To use Flink together with Hadoop, we first need to download the flink-shaded-hadoop package matching the Hadoop version.
Mine is flink-shaded-hadoop-2-uber-2.8.3-7.0.jar.
Put flink-shaded-hadoop-2-uber-2.8.3-7.0.jar into Flink's lib directory.
Add flink-shaded-hadoop-2-uber-2.8.3-7.0.jar to the Hadoop environment variable (HADOOP_CLASSPATH):
vim /home/hadoop/app/hadoop-2.9.1/etc/hadoop/hadoop-env.sh
export HADOOP_CLASSPATH=/home/hadoop/app/flink-1.9.1/lib/flink-shaded-hadoop-2-uber-2.8.3-7.0.jar:$HADOOP_CLASSPATH
Re-source hadoop-env.sh so the Hadoop environment variable takes effect:
source /home/hadoop/app/hadoop-2.9.1/etc/hadoop/hadoop-env.sh
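As a quick sanity check, make sure the variable is visible in the current shell before restarting anything (using the jar name from above):
echo $HADOOP_CLASSPATH
# should include /home/hadoop/app/flink-1.9.1/lib/flink-shaded-hadoop-2-uber-2.8.3-7.0.jar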
3.
In Flink's conf/flink-conf.yaml, add the path to your Hadoop etc/hadoop directory:
env.hadoop.conf.dir: /home/hadoop/app/hadoop-2.9.1/etc/hadoop
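For reference, a minimal sketch of the HA-related entries in conf/flink-conf.yaml that this setup assumes. The storageDir value comes from the error above; the ZooKeeper quorum is an assumption based on the three-node cluster and must match your environment:
high-availability: zookeeper
high-availability.storageDir: hdfs://mycluster/flink/ha
high-availability.zookeeper.quorum: hadoop01:2181,hadoop02:2181,hadoop03:2181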
Logs
After applying the fixes above, take the hadoop01 node's log as an example.
You can see that instead of the earlier error the log now shows a successful startup. Remember to look at the new log file: restarting after the failure generates a new log.
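A quick way to open the newest JobManager log on hadoop01 (the exact file name depends on how many times the daemon has been restarted, so this simply picks the most recent one):
ls -t /home/hadoop/app/flink-1.9.1/log/flink-*-standalonesession-*.log | head -1 | xargs tail -n 50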
Flink cluster startup procedure
root@hadoop01:~# /home/hadoop/app/apache-zookeeper-3.9.1-bin/bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /home/hadoop/app/apache-zookeeper-3.9.1-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
root@hadoop01:~# /home/hadoop/app/hadoop-2.9.1/sbin/start-dfs.sh
Starting namenodes on [hadoop01 hadoop02]
hadoop01: starting namenode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-namenode-hadoop01.out
hadoop02: starting namenode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-namenode-hadoop02.out
hadoop02: starting datanode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-datanode-hadoop02.out
hadoop01: starting datanode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-datanode-hadoop01.out
hadoop03: starting datanode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-datanode-hadoop03.out
Starting journal nodes [hadoop01 hadoop02 hadoop03]
hadoop01: starting journalnode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-journalnode-hadoop01.out
hadoop02: starting journalnode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-journalnode-hadoop02.out
hadoop03: starting journalnode, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-journalnode-hadoop03.out
Starting ZK Failover Controllers on NN hosts [hadoop01 hadoop02]
hadoop02: starting zkfc, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-zkfc-hadoop02.out
hadoop01: starting zkfc, logging to /home/hadoop/app/hadoop-2.9.1/logs/hadoop-root-zkfc-hadoop01.out
root@hadoop01:~# /home/hadoop/app/flink-1.9.1/bin/start-cluster.sh
Starting HA cluster with 2 masters.
Starting standalonesession daemon on host hadoop01.
Starting standalonesession daemon on host hadoop02.
Starting taskexecutor daemon on host hadoop01.
Starting taskexecutor daemon on host hadoop02.
Starting taskexecutor daemon on host hadoop03.
root@hadoop01:~# jps
4960 Jps
1922 QuorumPeerMain
2515 DataNode
2868 JournalNode
2329 NameNode
3242 DFSZKFailoverController
4172 StandaloneSessionClusterEntrypoint
4700 TaskManagerRunner
root@hadoop01:~#
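The same check can be repeated on the other nodes (this assumes the passwordless SSH that start-cluster.sh itself relies on):
ssh hadoop02 jps
ssh hadoop03 jps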
Web UI access
http://192.168.88.147:8081
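If the page does not open in a browser, you can first probe the REST endpoint from a shell (8081 is Flink's default REST/Web UI port; the address is the one used above):
curl http://192.168.88.147:8081/overview
# returns a short JSON summary of TaskManagers and available slots when the cluster is up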