Environment
Build OS: CentOS 7
Spark version: 2.3.0
IntelliJ IDEA version: 2019.1
Build
The build host needs a JDK installed beforehand; I already have JDK 8 installed.
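Before starting, it can help to confirm which JDK is actually on the PATH. The snippet below is a minimal sketch, not part of the Spark tooling: java_major is a hypothetical helper that maps a Java version string to its major version, handling both the legacy "1.8.0_201" scheme and the newer "11.0.2" scheme.

```shell
# Hypothetical helper: map a Java version string to its major version.
# "1.8.0_201" -> 8 (legacy 1.x scheme), "11.0.2" -> 11 (newer scheme).
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;      # drop the legacy "1." prefix
  esac
  echo "${v%%[._-]*}"        # keep what precedes the first '.', '_' or '-'
}

# Report the JDK found on PATH, if any.
if command -v java >/dev/null 2>&1; then
  installed="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
  echo "Detected Java major version: $(java_major "$installed")"
else
  echo "No java found on PATH; install a JDK before building."
fi
```

For the JDK 8 used here, the helper should report 8.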
I. Download the source package from https://github.com/apache/spark/releases/tag/v2.3.0
II. Extract the Spark source package and edit pom.xml
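The download and extraction can also be done from the command line. This is a rough sketch under a few assumptions: curl and tar are installed, the tarball is fetched from GitHub's tag archive URL rather than through the release page, and the helper names spark_source_url and fetch_spark_source are made up for illustration.

```shell
# Build the GitHub archive URL for a Spark release tag such as "2.3.0".
# (Assumes GitHub's usual /archive/<tag>.tar.gz layout.)
spark_source_url() {
  echo "https://github.com/apache/spark/archive/v$1.tar.gz"
}

# Download the tarball and unpack it into the current directory.
# The local file name is arbitrary.
fetch_spark_source() {
  version="$1"
  curl -fL -o "spark-$version.tar.gz" "$(spark_source_url "$version")" &&
    tar -xzf "spark-$version.tar.gz"
}

# Usage (needs network access):
#   fetch_spark_source 2.3.0
# The archive is expected to unpack into a spark-2.3.0 directory;
# check the actual name with ls before changing into it.
```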
1. Change Maven's default repository address. Using the Aliyun repository speeds up dependency downloads; its address is http://maven.aliyun.com/nexus/content/groups/public
<repositories>
  <repository>
    <id>central</id>
    <!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution -->
    <name>Maven Repository</name>
    <!--<url>https://repo.maven.apache.org/maven2</url>-->
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>
<pluginRepositories>
  <pluginRepository>
    <id>central</id>
    <!--<url>https://repo.maven.apache.org/maven2</url>-->
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </pluginRepository>
</pluginRepositories>
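The same change can be made without opening an editor. The following is a sketch only: use_aliyun_mirror is a hypothetical helper, and it assumes the Maven Central entries in pom.xml appear exactly as in the commented-out lines above. Unlike the manual edit, it replaces the original URL instead of leaving it behind in a comment, but it keeps a .bak copy of the file.

```shell
# Rewrite every Maven Central <url> entry in the given POM to the Aliyun mirror.
# sed keeps a backup of the original file next to it with a .bak suffix.
use_aliyun_mirror() {
  sed -i.bak \
    's#<url>https://repo.maven.apache.org/maven2</url>#<url>http://maven.aliyun.com/nexus/content/groups/public</url>#g' \
    "$1"
}

# Usage, from the root of the extracted source tree:
#   use_aliyun_mirror pom.xml
#   diff pom.xml.bak pom.xml    # review what changed
```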
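2. Adjust Maven's JVM memory options by editing the _COMPILE_JVM_OPTS variable (in this Spark release it is defined in the build/mvn script).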
_COMPILE_JVM_OPTS="-Xmx4g -Xms4g -XX:ReservedCodeCacheSize=1024m"
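As far as I can tell, build/mvn only falls back to _COMPILE_JVM_OPTS when MAVEN_OPTS is not already set, so exporting MAVEN_OPTS before the build should be an alternative to editing the script. The sketch below only demonstrates that fallback idiom; effective_maven_opts is a made-up name, not something the script defines.

```shell
# Default options, mirroring the value set above.
_COMPILE_JVM_OPTS="-Xmx4g -Xms4g -XX:ReservedCodeCacheSize=1024m"

# Hypothetical helper: use the caller's MAVEN_OPTS when it is set and
# non-empty, otherwise fall back to the default above.
effective_maven_opts() {
  echo "${MAVEN_OPTS:-$_COMPILE_JVM_OPTS}"
}

# Usage:
#   export MAVEN_OPTS="-Xmx4g -Xms4g -XX:ReservedCodeCacheSize=1024m"
#   ./build/mvn -DskipTests clean package
```

Make sure the host has more than 4 GB of free memory before using a 4g heap; otherwise lower -Xmx and -Xms.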
3. Build
# Simplest build
./build/mvn -DskipTests clean package
# With Hadoop, YARN, and Hive support
./build/mvn -Pyarn -Phadoop-2.7 -Phive-thriftserver -DskipTests clean package
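The two commands above differ only in the Maven profiles passed with -P. As an illustration, a small hypothetical helper (spark_build_cmd is not part of Spark) can assemble the command line from a list of profile names; it only prints the command, so the output can be checked before running it.

```shell
# Hypothetical helper: print the build command for the given Maven profiles.
# With no arguments it prints the simplest build command.
spark_build_cmd() {
  cmd="./build/mvn"
  for profile in "$@"; do
    cmd="$cmd -P$profile"      # each profile becomes one -P flag
  done
  echo "$cmd -DskipTests clean package"
}

# Usage:
#   spark_build_cmd yarn hadoop-2.7 hive-thriftserver
#   eval "$(spark_build_cmd yarn hadoop-2.7 hive-thriftserver)"   # run it
```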
When the build finishes, Maven prints a summary:
[INFO] Spark Project Parent POM ....................
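A full build takes a while, so it is convenient to save the output to a file and check the result afterwards. Maven ends a successful run with a "BUILD SUCCESS" line; the sketch below looks for it. The build_succeeded helper and the build.log file name are assumptions made for this example.

```shell
# Hypothetical helper: exit status 0 when the Maven log contains the
# success banner, non-zero otherwise.
build_succeeded() {
  grep -q 'BUILD SUCCESS' "$1"
}

# Usage:
#   ./build/mvn -DskipTests clean package 2>&1 | tee build.log
#   build_succeeded build.log && echo "Spark build OK"
```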
