Concepts
Spark cluster composition
A Spark application running on a cluster consists of one driver plus N executors:
- driver
- executor
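
The one-driver, N-executor split can be pictured with a toy model. This is only an analogy under loud assumptions: it uses a plain Python thread pool, not Spark, and the name run_job is hypothetical. The "driver" splits the data into partitions, hands one task per partition to the worker pool (the "executors"), and collects the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(data, num_executors, task):
    """Toy 'driver': partition the data, run `task` on each partition, collect results.

    Not Spark code; the thread pool merely stands in for the N executors.
    """
    # Round-robin split: one partition per executor.
    partitions = [data[i::num_executors] for i in range(num_executors)]
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        # pool.map preserves partition order in the returned results.
        return list(pool.map(task, partitions))


# The driver combines the partial sums computed by three workers.
partial_sums = run_job(list(range(10)), 3, sum)
print(sum(partial_sums))  # 45
```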
Launch modes (deploy modes)
Different launch modes place the driver and the executors differently:
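- local: the driver and the executors run together in a single process on one machine
- cluster:
  - cluster-client: spark-submit --deploy-mode client. The driver runs outside the physical cluster (in the spark-submit process on the submitting machine); the executors run inside the cluster and are managed by the resource manager
  - cluster-cluster: spark-submit --deploy-mode cluster. Both the driver and the executors run inside the physical cluster and are managed by the resource manager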
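
As a rough sketch of how these modes map onto a command line, the helper below assembles a spark-submit invocation. The helper name spark_submit_command and the file app.py are hypothetical; only the --master and --deploy-mode flags come from spark-submit itself. The check that rejects cluster deploy mode with a local master reflects my understanding of spark-submit's behavior and should be treated as an assumption.

```python
import shlex


def spark_submit_command(app, master, deploy_mode="client", app_args=()):
    """Build the argument list for a spark-submit call (illustrative helper, not a Spark API)."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode!r}")
    # Assumption: a local master has no cluster to host the driver,
    # so cluster deploy mode makes no sense there.
    if master.startswith("local") and deploy_mode == "cluster":
        raise ValueError("cluster deploy mode needs a real cluster manager")
    return ["spark-submit", "--master", master,
            "--deploy-mode", deploy_mode, app, *app_args]


# cluster-cluster on YARN: the driver is launched inside the cluster.
print(shlex.join(spark_submit_command("app.py", "yarn", "cluster")))
# spark-submit --master yarn --deploy-mode cluster app.py
```

Swapping "cluster" for "client" in the call keeps the driver in the submitting process while the executors still run inside the cluster.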
Cluster resource management
- Standalone – a simple cluster manager included with Spark that makes it easy to set up a cluster.
- Apache Mesos – a general cluster manager that can also run Hadoop MapReduce and service applications.
- Hadoop YARN – the resource manager in Hadoop 2.
- Kubernetes – an open-source system for automating deployment, scaling, and management of containerized applications.
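
Which manager is used is selected by the master URL passed to Spark. The sketch below classifies a master URL by its prefix. The function name cluster_manager_for and the host names are hypothetical; the URL shapes (local, local[K], spark://, mesos://, yarn, k8s://) follow the forms listed in the Spark documentation, and less common forms are deliberately left out.

```python
def cluster_manager_for(master):
    """Map a Spark master URL to the kind of cluster manager it selects (illustrative only)."""
    if master == "local" or master.startswith("local["):
        return "local"        # no cluster manager: everything runs on one machine
    if master.startswith("spark://"):
        return "standalone"   # Spark's built-in cluster manager
    if master.startswith("mesos://"):
        return "mesos"
    if master == "yarn":
        return "yarn"         # cluster location comes from the Hadoop configuration
    if master.startswith("k8s://"):
        return "kubernetes"
    raise ValueError(f"unrecognized master URL: {master!r}")


for url in ["local[*]", "spark://host:7077", "yarn", "k8s://https://host:6443"]:
    print(url, "->", cluster_manager_for(url))
```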