=============================== Cluster Plan ================================
hadoop011  NameNode  DataNode  Hive  HMaster  Master
hadoop012  ResourceManager  NodeManager  DataNode  HRegionServer  Worker
hadoop013  DataNode  NodeManager  ZooKeeper  HRegionServer  Worker
hadoop014  DataNode  NodeManager  ZooKeeper  HRegionServer  Worker
hadoop015  DataNode  NodeManager  MySQL  ZooKeeper  HRegionServer  Worker
===========================================================================
1. Move the downloaded package to the hadoop011 VM and extract it
Upload it over sftp (Alt+p in the terminal client):
sftp> put G:\08_spark\03_安装包\spark-2.1.1-bin-hadoop2.7.tgz
Move the package from the home directory to /opt/soft/:
[root@hadoop011 ~]# mv spark-2.1.1-bin-hadoop2.7.tgz /opt/soft/
Extract it to /opt/app/:
[root@hadoop011 soft]# tar zxvf spark-2.1.1-bin-hadoop2.7.tgz -C /opt/app/
Rename the directory to spark-2.1.1:
[root@hadoop011 app]# mv spark-2.1.1-bin-hadoop2.7 spark-2.1.1
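A quick sanity check that the package landed where expected (paths as used above; adjust if yours differ):
[root@hadoop011 app]# ls /opt/app/spark-2.1.1/bin /opt/app/spark-2.1.1/conf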
2. Edit the configuration files
1) Go to the /opt/app/spark-2.1.1/conf directory.
Rename slaves.template to slaves:
[root@hadoop011 conf]# mv slaves.template slaves
[root@hadoop011 conf]# vim slaves
List the worker hostnames in slaves (the cluster plan above also marks hadoop015 as a Worker; add it here as well if it should run one):
hadoop012
hadoop013
hadoop014
Rename spark-env.sh.template to spark-env.sh:
[root@hadoop011 conf]# mv spark-env.sh.template spark-env.sh
Add the following to spark-env.sh:
SPARK_MASTER_HOST=hadoop011
SPARK_MASTER_PORT=7077
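Optionally, spark-env.sh can also cap each Worker's resources. The values below are illustrative assumptions, not part of this setup; size them to your VMs if you add them:
SPARK_WORKER_CORES=2    # CPU cores each Worker offers to executors
SPARK_WORKER_MEMORY=1g  # memory each Worker offers to executors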
2) Add the following line to spark-config.sh under /opt/app/spark-2.1.1/sbin:
export JAVA_HOME=/opt/app/jdk1.8.0_131
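Before distributing anything, it is worth confirming that the JAVA_HOME path above actually points at a JDK (assuming it was installed to /opt/app/jdk1.8.0_131 as on this cluster):
[root@hadoop011 sbin]# /opt/app/jdk1.8.0_131/bin/java -version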
3. Configure the Job History Server
1) Rename spark-defaults.conf.template to spark-defaults.conf:
[root@hadoop011 conf]# mv spark-defaults.conf.template spark-defaults.conf
[root@hadoop011 conf]# vim spark-defaults.conf
Edit spark-defaults.conf and add the following:
# Example:
spark.master spark://hadoop011:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop011:9000/directory
# spark.serializer org.apache.spark.serializer.KryoSerializer
# spark.driver.memory 5g
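The event-log directory above assumes the NameNode RPC address is hadoop011:9000; if unsure, the cluster's actual fs.defaultFS can be checked with:
[root@hadoop011 conf]# hdfs getconf -confKey fs.defaultFS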
2) Edit spark-env.sh:
[root@hadoop011 conf]# vim spark-env.sh
Add the following (the three options set the history UI port, how many completed applications to keep in memory, and the event-log directory, which must match spark.eventLog.dir above):
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=4000 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://hadoop011:9000/directory"
Create the /directory directory on HDFS to match this configuration:
[root@hadoop011 sbin]# hdfs dfs -mkdir /directory
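Verify the directory was created:
[root@hadoop011 sbin]# hdfs dfs -ls /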
4. Distribute the configured Spark directory from hadoop011 to the other VMs
[root@hadoop011 app]# scp -r spark-2.1.1/ root@hadoop012:/opt/app/
[root@hadoop011 app]# scp -r spark-2.1.1/ root@hadoop013:/opt/app/
[root@hadoop011 app]# scp -r spark-2.1.1/ root@hadoop014:/opt/app/
[root@hadoop011 app]# scp -r spark-2.1.1/ root@hadoop015:/opt/app/
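The four copies can equivalently be done in one loop (assuming SSH access to each host as above):
[root@hadoop011 app]# for h in hadoop012 hadoop013 hadoop014 hadoop015; do scp -r spark-2.1.1/ root@$h:/opt/app/; done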
5. With the Hadoop cluster already running, start Spark from its sbin directory:
[root@hadoop011 sbin]# ./start-all.sh
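Once the Master is up, its web UI should be reachable on the standalone default port 8080 (assuming the port was not changed), e.g. from a browser or with:
[root@hadoop011 sbin]# curl -s http://hadoop011:8080 | head -n 5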
6. Check the processes with jps (Master and Worker are the Spark daemons; the others belong to Hadoop):
[root@hadoop011 sbin]# jps
3812 Jps
2808 DataNode
2810 NodeManager
3740 Master
2652 NameNode
[root@hadoop013 app]# jps
2544 NodeManager
2769 SecondaryNameNode
2546 DataNode
2980 Worker
3014 Jps
7. Start the Spark History Server:
[root@hadoop011 sbin]# ./start-history-server.sh
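The history UI then listens on port 4000, as set by spark.history.ui.port above; open http://hadoop011:4000 in a browser, or as a rough check:
[root@hadoop011 sbin]# curl -s http://hadoop011:4000 | head -n 5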
8. Try running a first job from the /opt/app/spark-2.1.1/bin directory:
./spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://hadoop011:7077 \
--executor-memory 1G \
--total-executor-cores 2 \
/opt/app/spark-2.1.1/examples/jars/spark-examples_2.11-2.1.1.jar \
100
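If the submission succeeds, the driver output contains a line like "Pi is roughly 3.14...", and once its event log is written to hdfs://hadoop011:9000/directory the run should also appear in the History Server UI on port 4000.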