Spark Standalone Mode

This article describes how to quickly set up a Spark cluster on three machines running Ubuntu 12.04, covering the installation of Java, Scala, and Spark, SSH setup, hostname configuration, and how to start the cluster.

It is very easy to install a Spark cluster (Standalone mode). In my example, I used three machines.

All machines run Ubuntu 12.04 (32-bit). One machine is named "master", and the other two are

named "node01" and "node02" respectively. A machine's hostname can be set in /etc/hostname.

Furthermore, every node (machine) should have the same user name.

 

1. On every node: Install Java and set the Java environment variables in ~/.bashrc as:

  #set java environment

  export JAVA_HOME=/usr/local/jdk1.7.0_67

  export JRE_HOME=$JAVA_HOME/jre

  export PATH=$JAVA_HOME/bin:$PATH

  export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

   Note that in my example, I used Java jdk1.7.0_67 and put it under /usr/local.
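
    To verify the setup, you can reload the shell configuration and check the Java version (a minimal check, assuming the paths above):

  $ source ~/.bashrc

  $ java -version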

2. On every node: Install Scala and set corresponding environment variables in ~/.bashrc as:

       export SCALA_HOME=/usr/local/scala-2.10.4

       export PATH=$SCALA_HOME/bin:$PATH

    Note that in my example, I used Scala 2.10.4 and put it under /usr/local.
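
    Likewise, you can verify the Scala installation after reloading ~/.bashrc:

  $ source ~/.bashrc

  $ scala -version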

3. On every node: Install Spark.

    Download a prebuilt version of Spark from http://spark.apache.org/downloads.html. In my example, I chose spark-1.1.0-bin-hadoop2.4.tgz and extracted it to /usr/local.

    Set in ~/.bashrc:

        export SPARK_HOME=/usr/local/spark-1.1.0-bin-hadoop2.4
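
    For reference, the extraction might look like this (a sketch assuming the tarball was downloaded to ~/Downloads; adjust the path to wherever you saved it):

  $ cd /usr/local

  $ sudo tar -xzf ~/Downloads/spark-1.1.0-bin-hadoop2.4.tgz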

4. Set up SSH so that any two nodes in the cluster can ssh into each other without a password. This step

    is also needed when setting up a Hadoop cluster; there are abundant tutorials on the Internet, so

    the details are omitted here. A rough sketch follows.
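
    One common approach is to generate a key pair on each node and copy the public key to every node (this sketch assumes RSA keys and the same user name everywhere; the hostnames must resolve, so use IP addresses if step 5 has not been done yet):

  $ ssh-keygen -t rsa          # accept the defaults and leave the passphrase empty

  $ ssh-copy-id master

  $ ssh-copy-id node01

  $ ssh-copy-id node02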

5. On every node:

  $ sudo vim /etc/hosts

    and map each node's hostname to its IP address. For example, I set the hosts file on every node to:

  127.0.0.1        localhost

  223.3.86.xxx  master

  223.3.81.xxx  node01

  223.3.70.xxx  node02
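
    A quick way to check that the names resolve is to ping each node from the others, for example:

  $ ping -c 1 node01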

6. On master node: Enter the root folder of Spark and edit conf/slaves. In my example:

  $ cd /usr/local/spark-1.1.0-bin-hadoop2.4

  $ sudo vim conf/slaves

     Edit the slaves file to:

  master

  node01

  node02

7. On master node: Enter the root folder of Spark and start the Spark cluster.

  $ cd /usr/local/spark-1.1.0-bin-hadoop2.4

  $ sbin/start-all.sh
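
    If the cluster started correctly, the jps tool that ships with the JDK should show a Master process (plus a Worker, since master is also listed in conf/slaves) on the master node, and a Worker process on node01 and node02:

  $ jps

    To shut the cluster down later, run sbin/stop-all.sh from the same folder.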

8. Open http://master:8080/ in your web browser to monitor the cluster.

9. Run Spark examples:

    Locally:

    $ MASTER=local[4] $SPARK_HOME/bin/run-example SparkLR

    On the cluster:

    $ MASTER=spark://master:7077 $SPARK_HOME/bin/run-example SparkLR
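
    You can also attach an interactive shell to the running cluster in the same way (a usage sketch; the MASTER variable selects the cluster just as in the examples above):

  $ MASTER=spark://master:7077 $SPARK_HOME/bin/spark-shell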

 

If you have any questions, feel free to contact me.  Email: wuzimian2006@163.com  QQ: 726590906

Reposted from: https://www.cnblogs.com/wzm-xu/p/4040462.html
