SparkR

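SparkR consists of two main parts: the SparkR package and a JVM backend. The SparkR package is an R extension package; once installed into R, it exposes the RDD and DataFrame APIs inside the R runtime environment.
Installing SparkR: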

1. Download the SparkR source code

Download the source archive SparkR-pkg-master.zip from https://github.com/amplab-extras/SparkR-pkg
2. Build SparkR

1) Unzip SparkR-pkg-master.zip, then cd SparkR-pkg-master/

2) Specify the Hadoop and Spark versions when building:

SPARK_HADOOP_VERSION=2.4.1 SPARK_VERSION=1.2.0 ./install-dev.sh

At this point, the single-machine SparkR installation is complete.
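
The build step above can be wrapped in a small shell function so that both versions are always passed explicitly. This is only a sketch: `build_sparkr` is a hypothetical helper name, not something shipped with SparkR-pkg, and it assumes the archive has already been unzipped.

```shell
#!/usr/bin/env bash
# Sketch: build SparkR-pkg against explicit Hadoop and Spark versions.
# build_sparkr is a hypothetical helper name, not part of SparkR-pkg.
build_sparkr() {
  local src_dir="$1" hadoop_version="$2" spark_version="$3"

  # Refuse to build unless the directory and both versions are given.
  if [ -z "$src_dir" ] || [ -z "$hadoop_version" ] || [ -z "$spark_version" ]; then
    echo "usage: build_sparkr <src_dir> <hadoop_version> <spark_version>" >&2
    return 2
  fi
  if [ ! -x "$src_dir/install-dev.sh" ]; then
    echo "install-dev.sh not found or not executable in $src_dir" >&2
    return 1
  fi

  # Run in a subshell so the caller's working directory is unchanged.
  (
    cd "$src_dir" &&
      SPARK_HADOOP_VERSION="$hadoop_version" SPARK_VERSION="$spark_version" ./install-dev.sh
  )
}

# Example call (versions taken from the command above):
# build_sparkr ./SparkR-pkg-master 2.4.1 1.2.0
```

Passing the versions as arguments rather than relying on exported environment variables makes it harder to build against a stale value left over in the shell.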
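3. Deploy SparkR on a cluster
1) A successful build produces a lib folder. Go into lib and pack the SparkR directory into SparkR.tar.gz; this archive is the key to a distributed SparkR deployment.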

2) Use the packaged SparkR.tar.gz to install SparkR on every cluster node:

R CMD INSTALL SparkR.tar.gz

The distributed SparkR setup is now complete.
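
The packaging and per-node installation described in step 3 might look like the following sketch. `package_sparkr` and `deploy_sparkr` are hypothetical helper names, the host names are placeholders, and the script assumes passwordless SSH plus an existing R installation on every node; none of this comes from SparkR-pkg itself.

```shell
#!/usr/bin/env bash
# Sketch: package the built SparkR library and install it on each node.
# package_sparkr and deploy_sparkr are hypothetical helper names.

# Create SparkR.tar.gz from <lib_dir>/SparkR and print the tarball path.
package_sparkr() {
  local lib_dir="$1" out="${2:-$1/SparkR.tar.gz}"
  if [ ! -d "$lib_dir/SparkR" ]; then
    echo "no SparkR directory under $lib_dir; run the build first" >&2
    return 1
  fi
  # -C keeps paths in the archive relative, so it unpacks as SparkR/...
  tar -czf "$out" -C "$lib_dir" SparkR && echo "$out"
}

# Copy the tarball to each node and install it there.
# Assumes passwordless SSH and R already installed on every node.
deploy_sparkr() {
  local tarball="$1"; shift
  local node
  for node in "$@"; do
    scp "$tarball" "$node:/tmp/SparkR.tar.gz" &&
      ssh "$node" "R CMD INSTALL /tmp/SparkR.tar.gz" ||
      { echo "install failed on $node" >&2; return 1; }
  done
}

# Example (node1 and node2 are placeholder host names):
# tarball=$(package_sparkr ./SparkR-pkg-master/lib) && deploy_sparkr "$tarball" node1 node2
```

The loop stops at the first node that fails, so a partial rollout is reported rather than silently skipped.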

[root@hdp-gp-dk05 /mnt/mydisk/R/SparkR-pkg-master/lib]#cd ..
[root@hdp-gp-dk05 /mnt/mydisk/R/SparkR-pkg-master]#SPARK_HADOOP_VERSION=2.7.1 SPARK_VERSION=1.6.0 ./install-dev.sh
WARNING: ignoring environment value of R_HOME
* installing *source* package ‘SparkR’ ...
** libs
** arch - 
./sbt/sbt assembly
Launching sbt from sbt/sbt-launch-0.13.6.jar
Getting org.scala-sbt sbt 0.13.6 ...
downloading https://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/compiler-interface/0.13.6/jars/compiler-interface-bin.jar ...
    [SUCCESSFUL ] org.scala-sbt#compiler-interface;0.13.6!compiler-interface-bin.jar (14737ms)
downloading https://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/compiler-interface/0.13.6/jars/compiler-interface-src.jar ...


[warn] Strategy 'discard' was applied to 2 files
[warn] Strategy 'first' was applied to 254 files
[info] Checking every *.class/*.jar file's SHA-1.
[info] SHA-1: 64af5bea64a5864e45900c33072b932ae6b0dc9f
[info] Packaging /mnt/mydisk/R/SparkR-pkg-master/pkg/src/target/scala-2.10/sparkr-assembly-0.1.jar ...
[info] Done packaging.
[success] Total time: 1107 s, completed Nov 18, 2016 10:50:03 AM
cp -f target/scala-2.10/sparkr-assembly-0.1.jar ../inst/
R CMD SHLIB -o SparkR.so string_hash_code.c
make[1]: Entering directory `/mnt/mydisk/R/SparkR-pkg-master/pkg/src'
gcc -std=gnu99 -I/usr/local/lib64/R/include -DNDEBUG  -I/usr/local/include    -fpic  -g -O2  -c string_hash_code.c -o string_hash_code.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o SparkR.so string_hash_code.o
make[1]: Leaving directory `/mnt/mydisk/R/SparkR-pkg-master/pkg/src'
installing to /mnt/mydisk/R/SparkR-pkg-master/lib/SparkR/libs
** R
** inst
** preparing package for lazy loading
Creating a generic function for ‘lapply’ from package ‘base’ in package ‘SparkR’
Creating a generic function for ‘Filter’ from package ‘base’ in package ‘SparkR’
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (SparkR)
[root@hdp-gp-dk05 /mnt/mydisk/R]#ls
R-3.1.1  R-3.1.1.tar.gz  SparkR-pkg-master  SparkR-pkg-master.zip
[root@hdp-gp-dk05 /mnt/mydisk/R]#ls
R-3.1.1  R-3.1.1.tar.gz  SparkR-pkg-master  SparkR-pkg-master.zip  SparkR.tar.gz
[root@hdp-gp-dk05 /mnt/mydisk/R]#R CMD INSTALL SparkR.tar.gz
WARNING: ignoring environment value of R_HOME
* installing to library ‘/usr/local/lib64/R/library’
* installing *binary* package ‘SparkR’ ...
* DONE (SparkR)
[root@hdp-gp-dk05 /mnt/mydisk/R]#R
WARNING: ignoring environment value of R_HOME

R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(SparkR)
[SparkR] Initializing with classpath /usr/local/lib64/R/library/SparkR/sparkr-assembly-0.1.jar

> 