Step 1: Build a Spark assembly with Hive support
http://blog.youkuaiyun.com/xiao_jun_0820/article/details/44178169
Step 2: Make the Spark installed by Cloudera Manager support HQL
http://blog.youkuaiyun.com/xiao_jun_0820/article/details/44680925
It turns out that CDH 5.5 does not ship the spark-sql and sparkR command scripts at all, and the R directory is missing as well.
Step 3: Copy the missing files and set up the environment
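The missing pieces can be copied from the Spark distribution built in step 1 into the Spark directory inside the CDH parcel. A minimal sketch, assuming the compiled distribution was unpacked under /root/spark-1.5.0-bin-custom (that source path is hypothetical; point it at wherever your step 1 build actually lives):

# Hypothetical location of the Spark built in step 1; adjust to your build output
SPARK_SRC=/root/spark-1.5.0-bin-custom
# The Spark shipped inside the CDH parcel
SPARK_DEST=/opt/cloudera/parcels/CDH/lib/spark

# Copy the launcher scripts that CDH 5.5 leaves out
cp $SPARK_SRC/bin/spark-sql $SPARK_DEST/bin/
cp $SPARK_SRC/bin/sparkR $SPARK_DEST/bin/
# Copy the R support directory that is also missing
cp -r $SPARK_SRC/R $SPARK_DEST/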
Trying to launch SparkR on YARN:

sparkR --master yarn --executor-memory 1g

fails with a complaint that the Hadoop configuration cannot be found:
IllegalArgumentException: requirement failed: Cannot read Hadoop config dir /opt/cloudera/parcels/CDH/lib/spark/conf/yarn-conf.
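It is easy to confirm that the directory in the error message really is absent. The parcel's spark-env.sh appears to default HADOOP_CONF_DIR to that yarn-conf subdirectory when nothing else is set, which is why exporting HADOOP_CONF_DIR below makes the error go away:

# The path Spark is complaining about does not exist on this host
ls -ld /opt/cloudera/parcels/CDH/lib/spark/conf/yarn-conf
# The regular Hadoop client configuration does exist, and that is what we point Spark at
ls /etc/hadoop/conf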
The fix is to export the Hadoop, Hive, and Spark environment variables so the launch scripts pick up the real cluster configuration. Edit the profile:

vi /etc/profile

and append the following (note that PATH references $SCALA_HOME, which is not set here; export it too if Scala is installed outside the default location):

export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HADOOP_CMD=/opt/cloudera/parcels/CDH/bin/hadoop
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin:$SCALA_HOME/bin

Then reload it:

source /etc/profile
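With the profile sourced, a quick sanity check (the echo lines are just my own verification, not part of the original steps) before retrying the command that failed:

echo $HADOOP_CONF_DIR   # should print /etc/hadoop/conf
echo $SPARK_HOME        # should print /opt/cloudera/parcels/CDH/lib/spark
sparkR --master yarn --executor-memory 1g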