Spark on YARN: differences between the two deploy modes and how each works
cluster mode: the Spark driver runs inside an application master process, which YARN manages on the cluster.
Once the Spark application has been launched, the client can exit.
For example, after developing the business code in IntelliJ IDEA, package it, upload it to the server, and submit it:
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode cluster \
--driver-memory 4g \
--executor-memory 2g \
--executor-cores 1 \
--queue thequeue \
examples/jars/spark-examples*.jar \
10
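Because the client can exit in cluster mode, the application is usually followed up afterwards through the YARN CLI, which needs the application ID. The following is a minimal sketch, assuming the output of spark-submit was saved to a hypothetical file named submit.log. extract_app_id is an illustrative helper, not part of Spark or YARN.

```shell
#!/usr/bin/env bash
# Read text on stdin and print the first YARN application ID found in it.
# YARN application IDs have the form application_<clusterTimestamp>_<sequence>.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Hypothetical usage on a machine with access to the cluster (not executed here):
#   ./bin/spark-submit --master yarn --deploy-mode cluster ... > submit.log 2>&1
#   app_id=$(extract_app_id < submit.log)
#   yarn application -status "$app_id"    # current state of the application
#   yarn logs -applicationId "$app_id"    # aggregated container logs, including the driver's
```

In cluster mode the driver's output ends up in the YARN container logs rather than on the client's console, which is why the last command is the usual way to read it.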
client mode: the driver runs in the client process, and the application master is only used to request resources from YARN.
./bin/spark-shell --master yarn --deploy-mode client
Adding other JARs
./bin/spark-submit --class my.main.Class \
--master yarn \
--deploy-mode cluster \
--jars my-other-jar.jar,my-other-other-jar.jar \
my-main-jar.jar \
app_arg1 app_arg2  # application arguments, passed in from outside
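The --jars option takes a single comma-separated list, which is tedious to write by hand when there are many dependencies. A minimal sketch of building that list follows. join_jars is an illustrative helper and the lib/ directory is a hypothetical location for the extra JARs.

```shell
#!/usr/bin/env bash
# Join all arguments into one comma-separated string, the format --jars expects.
join_jars() {
  local IFS=','
  echo "$*"
}

# Hypothetical usage (not executed here), assuming the extra JARs live in lib/:
#   ./bin/spark-submit --class my.main.Class \
#     --master yarn \
#     --deploy-mode cluster \
#     --jars "$(join_jars lib/*.jar)" \
#     my-main-jar.jar \
#     app_arg1 app_arg2
```

Setting IFS locally keeps the change confined to the function, so word splitting in the rest of the script is unaffected.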