There are three ways to submit a Spark job:
1. Standalone mode
2. Spark on YARN, client mode
3. Spark on YARN, cluster mode
Plenty of articles online already explain how these three modes work internally, so here I just collect the commands you need for each one, for reference.
1. Standalone:
bin/spark-submit \
--class day05.SparkWC2 \
--master spark://master:7077,slave1:7077,slave2:7077 \
--executor-memory 1G \
--total-executor-cores 2 \
examples/jars/SparkWC2.jar
Line by line:
1. spark-submit launches the driver process
2. fully qualified name of the main class (a sketch of such a class follows this list)
3. every node the master may be running on (standalone high-availability setup)
4. initial memory for each executor (has a default in the config, so it can be omitted)
5. total number of cores across all executors (also defaulted in the config, so it can be omitted)
6. path to the application jar
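For context, the class that --class points at is just an ordinary Scala object with a main method. The real day05.SparkWC2 is not shown in this post, so the following is only a sketch of what such a word-count driver typically looks like; the HDFS paths are placeholders, not the course's actual data:

package day05

import org.apache.spark.{SparkConf, SparkContext}

// Minimal word-count driver; master, memory and cores all come from
// spark-submit, so nothing resource-related is hard-coded here.
object SparkWC2 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkWC2")
    val sc = new SparkContext(conf)
    sc.textFile("hdfs://master:9000/data/words.txt")  // placeholder input path
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .saveAsTextFile("hdfs://master:9000/res/wc")    // placeholder output path
    sc.stop()
  }
}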
2. Spark on YARN - client mode (results are printed to the console, because the driver runs on the submitting machine)
Full version:
bin/spark-submit \
--class day07.Demo.Demo.LacTimeDemo \
--master yarn \
--deploy-mode client \
--executor-memory 1G \
--executor-cores 1 \
/home/hadoop/install/spark/examples/jars/maven_scala.jar \
hdfs://192.168.22.80:9000/data/jztl.txt hdfs://192.168.22.80:9000/res/jztl
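Note the two HDFS URIs at the end: everything after the jar path is passed straight to the application's main method as arguments. Assuming LacTimeDemo follows the usual input-path/output-path convention (its real logic is not shown in this post), it would pick them up like this:

package day07.Demo.Demo

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: the real per-record processing of LacTimeDemo is not shown
// here, so this just wires the two command-line paths through.
object LacTimeDemo {
  def main(args: Array[String]): Unit = {
    val inputPath = args(0)   // hdfs://192.168.22.80:9000/data/jztl.txt
    val outputPath = args(1)  // hdfs://192.168.22.80:9000/res/jztl
    val sc = new SparkContext(new SparkConf().setAppName("LacTimeDemo"))
    sc.textFile(inputPath)
      .saveAsTextFile(outputPath)  // real transformation logic would go here
    sc.stop()
  }
}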
Short version:
bin/spark-submit \
--class day07.Demo.Demo.LacTimeDemo \
--master yarn \
--deploy-mode client \
/home/hadoop/install/spark/examples/jars/maven_scala.jar
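The short version works because the omitted resource flags fall back to the defaults in conf/spark-defaults.conf. Entries like these would supply them (the values here are illustrative, not the cluster's actual settings):

spark.executor.memory  1g
spark.executor.cores   1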
3. Spark on YARN - cluster mode (results show up in the YARN web UI on port 8088, because the driver runs inside the cluster)
bin/spark-submit \
--class day07.Demo.Demo.LacTimeDemo \
--master yarn \
--deploy-mode cluster \
/home/hadoop/install/spark/examples/jars/maven_scala.jar
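In cluster mode the driver runs inside the YARN ApplicationMaster, so anything it prints goes to the container logs rather than your terminal. Besides browsing them from the web UI on port 8088, you can pull them from the command line after the job finishes, assuming YARN log aggregation is enabled (substitute your own application id for the placeholder):

yarn logs -applicationId application_1234567890123_0001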