Spark 2.3.0 Standalone Mode

This post describes how to deploy Apache Spark in standalone mode on a cluster, covering key steps such as starting a cluster manually, using the provided launch scripts, and configuring resources. It also covers how to start the master and workers, and how to monitor cluster status through the web UI.

Reference: http://spark.apache.org/docs/latest/spark-standalone.html

Spark Standalone Mode

In addition to running on the Mesos or YARN cluster managers, Spark also provides a simple standalone deploy mode. You can launch a standalone cluster either manually, by starting a master and workers by hand, or use our provided launch scripts. It is also possible to run these daemons on a single machine for testing.

Contents

  • Installing Spark Standalone to a Cluster
  • Starting a Cluster Manually
  • Cluster Launch Scripts
  • Connecting an Application to the Cluster
  • Launching Spark Applications
  • Resource Scheduling
  • Executors Scheduling
  • Monitoring and Logging
  • Running Alongside Hadoop
  • Configuring Ports for Network Security
  • High Availability
  •     Standby Masters with ZooKeeper
  •     Single-Node Recovery with Local File System


Installing Spark Standalone to a Cluster

To install Spark Standalone mode, you simply place a compiled version of Spark on each node on the cluster. You can obtain pre-built versions of Spark with each release or build it yourself.

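As a concrete sketch of the step above, fetching and unpacking a pre-built release onto a node might look like the following. The release package name and the /opt install path are illustrative assumptions, not prescribed by the docs; repeat on every node of the cluster.

```shell
# Illustrative only: download a pre-built Spark 2.3.0 package and unpack it.
# The archive URL and the /opt install path are assumptions; pick your own.
wget https://archive.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
tar -xzf spark-2.3.0-bin-hadoop2.7.tgz -C /opt
export SPARK_HOME=/opt/spark-2.3.0-bin-hadoop2.7
```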

Starting a Cluster Manually

You can start a standalone master server by executing:

./sbin/start-master.sh

Once started, the master will print out a spark://HOST:PORT URL for itself, which you can use to connect workers to it, or pass as the “master” argument to SparkContext. You can also find this URL on the master’s web UI, which is http://localhost:8080 by default.
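To make the URL format concrete, here is a small shell sketch that splits a spark://HOST:PORT URL into the host and port a worker connects to. The URL value is an assumed example, not output from a real master:

```shell
# Assumed example value; a real master prints its own spark://HOST:PORT URL.
MASTER_URL="spark://spark-master.local:7077"

# Strip the scheme, then split on the last ':' into host and port.
hostport="${MASTER_URL#spark://}"
host="${hostport%:*}"
port="${hostport##*:}"
echo "$host $port"   # spark-master.local 7077
```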

Similarly, you can start one or more workers and connect them to the master via:

./sbin/start-slave.sh <master-spark-URL>

Once you have started a worker, look at the master’s web UI (http://localhost:8080 by default). You should see the new node listed there, along with its number of CPUs and memory (minus one gigabyte left for the OS).
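Besides the HTML page, the master UI can also be polled from a script. As far as I know the standalone master serves a JSON view of cluster state at a /json path, but this endpoint is not prominently documented, so treat the sketch below as an assumption and verify it against your deployment:

```shell
# Assumption: the standalone master web UI serves cluster state as JSON at /json.
# Requires a master already running on localhost; adjust host/port as needed.
curl -s http://localhost:8080/json
```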

Finally, the following configuration options can be passed to the master and worker:

Argument                   Meaning
-h HOST, --host HOST       Hostname to listen on
-i HOST, --ip HOST         Hostname to listen on (deprecated, use -h or --host)
-p PORT, --port PORT       Port for service to listen on (default: 7077 for master, random for worker)
--webui-port PORT          Port for web UI (default: 8080 for master, 8081 for worker)
-c CORES, --cores CORES    Total CPU cores to allow Spark applications to use on the machine (default: all available); only on worker
-m MEM, --memory MEM       Total amount of memory to allow Spark applications to use on the machine, in a format like 1000M or 2G (default: your machine's total RAM minus 1 GB); only on worker
-d DIR, --work-dir DIR     Directory to use for scratch space and job output logs (default: SPARK_HOME/work); only on worker
--properties-file FILE     Path to a custom Spark properties file to load (default: conf/spark-defaults.conf)
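Putting several of the worker-only options together, a launch command might look like the following sketch. The master URL, core and memory limits, and work directory are all illustrative values:

```shell
# Illustrative values only; substitute your own master URL and resource limits.
./sbin/start-slave.sh spark://spark-master.local:7077 \
  --cores 4 \
  --memory 4G \
  --work-dir /data/spark-work \
  --webui-port 8081
```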
