1. Install Maven (specific installation steps are easy to find online).
Mind the Maven version; from the official Spark build docs:
The Maven-based build is the build of reference for Apache Spark. Building Spark using Maven requires Maven 3.3.9 or newer and Java 7+. Note that support for Java 7 is deprecated as of Spark 2.0.0 and may be removed in Spark 2.2.0.
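A quick sanity check of the toolchain before building (mvn -version also reports which JDK it runs on):
mvn -version    # should show Maven 3.3.9+ and Java 7+
java -version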
2. Set the maximum memory available to Maven:
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
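The export only lasts for the current shell session; to make it permanent, append it to your shell profile (assuming ~/.bashrc here):
echo 'export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"' >> ~/.bashrc
source ~/.bashrc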
3. Upload the Spark source tarball and extract it:
tar zxvf spark-2.1.1.tgz
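Then change into the extracted directory (the directory name is assumed to match the tarball):
cd spark-2.1.1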
4. In the source directory, compile with the following command (the Scala property is -Dscala-2.11, the binary version, not -Dscala-2.11.8):
./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Dscala-2.11 -Phive -Phive-thriftserver -DskipTests clean package
Alternatively, build and package in one step (in Spark 2.x the script lives under dev/, not at the top level):
./dev/make-distribution.sh --tgz -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver
The full build took about an hour and a half here; Maven reported: Total time: 01:23 h
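A minimal way to confirm the compile succeeded, assuming the standard Spark 2.1 layout with Scala 2.11 (paths may differ in other versions):
ls assembly/target/scala-2.11/jars | head    # the compiled Spark jars
./bin/spark-shell --version                  # should report version 2.1.1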
5. After compilation finishes, build a distribution package from the top-level source directory.
First edit dev/make-distribution.sh and hard-code the following variables, which skips the slow mvn help:evaluate version detection (see the sketch after this list). Note that SCALA_VERSION is the Scala binary version (2.11, not 2.11.8), since the script reads scala.binary.version:
VERSION=2.1.1
SCALA_VERSION=2.11
SPARK_HADOOP_VERSION=2.6.0-cdh5.10.0
SPARK_HIVE=1
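A minimal sketch of that edit, assuming the detection block that ships with Spark 2.1 (comment out the mvn help:evaluate lookups and put the hard-coded values right after them):
# VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)
# SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)
# SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)
VERSION=2.1.1
SCALA_VERSION=2.11
SPARK_HADOOP_VERSION=2.6.0-cdh5.10.0
SPARK_HIVE=1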
Then run the packaging command:
./dev/make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn
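If the run succeeds, make-distribution.sh writes the tarball to the top-level source directory as spark-<VERSION>-bin-<NAME>.tgz, so with the values above:
ls spark-2.1.1-bin-custom-spark.tgz
Note that this command selects the hadoop-2.6 profile but passes no -Dhadoop.version; if the bundled Hadoop jars should actually match the CDH version hard-coded above, you likely also need -Dhadoop.version=2.6.0-cdh5.10.0 here.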
If the build fails with an error that R_HOME cannot be found, either install R or edit dev/make-distribution.sh and set its R module flag to false. To install R on CentOS/RHEL (via the EPEL repository):
yum install epel-release
yum -y install R
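Afterwards, verify that R is visible to the build (R RHOME prints R's home directory; exporting it is only a fallback in case the build still cannot locate R on its own):
R --version
export R_HOME=$(R RHOME)    # fallback; normally having R on the PATH is enough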