My environment:
Problem 1:
scala> val textFile = sc.textFile("README.md")
Error message: error: not found: value sc
sc is the SparkContext; spark-shell creates it at startup, so it should already exist by the time you build an RDD. Clearly its creation failed.
Possible causes:
HDFS problem (Hadoop not started); spark-shell itself failed to load; wrong permissions; environment variables not configured properly => wrong paths.
Note: a successful startup prints "Spark context available as sc."
sc is short for Spark context.
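For reference, a minimal sanity check once sc really is available (a sketch, assuming README.md sits in the directory spark-shell was launched from, or in HDFS if a default filesystem is configured):
scala> val textFile = sc.textFile("README.md")
scala> textFile.count()    // number of lines; forces the RDD to be evaluated
scala> textFile.first()    // first line of the file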
Problem 2:
spark.SparkContext: error initializing
Overall error message: sparkDriver could not bind on port 0
Excerpt:
starting remoting
java.net.BindException: Failed to bind to: /10.1.4.221:0: shutting down Netty transport
Service 'sparkDriver' failed after 16 retries!
The error appears on the master; slave2 does not report it.
Part of the output on slave2:
Successfully started service 'sparkDriver' on port 42887
Remoting started listening on addresses: [akka.tcp://sparkDriver@10.1.4.237:42887]
* The IP addresses above are the nodes' own IP addresses.
Both nodes report the error:
error not found: value sqlContext
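The bind failure usually means the driver is trying to bind to an address that no longer belongs to the master (for example a stale entry in /etc/hosts or an outdated SPARK_LOCAL_IP). As a rough diagnostic of my own (not from the original log), any Scala REPL can show what address the hostname currently resolves to:
scala> java.net.InetAddress.getLocalHost                   // hostname/address the JVM resolves for this machine
scala> java.net.InetAddress.getLocalHost.getHostAddress    // just the IP string; compare it with 10.1.4.221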
Solution:
export SPARK_LOCAL_IP=127.0.0.1
Note that export only takes effect in the current shell session (it just sets a temporary environment variable); it is better to add the line to $SPARK_HOME/conf/spark-env.sh.
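Once spark-shell starts again, a quick check of my own (not from the original notes) to confirm the setting actually reached the shell process:
scala> sys.env.get("SPARK_LOCAL_IP")              // should print Some(127.0.0.1) if the export took effect
scala> sc.getConf.getOption("spark.driver.host")  // the address the driver advertises to executors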
Guess at the cause: the network cable was re-plugged a few days ago, so the IP address changed?
Reference link for the above problem:
http://stackoverflow.com/questions/30085779/apache-spark-error-while-start
After that, only the "error: not found: value sqlContext" error remained.
Related information found via Google:
Looks like your Spark config may be trying to log to an HDFS path. Can you review your config settings?
While reading a local file which is not in HDFS through spark-shell, does HDFS need to be up and running?
The data may be spilled off to disk, hence HDFS is a necessity for Spark.
You can run Spark on a single machine and not use HDFS, but in distributed mode HDFS will be required.
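In other words, spark-shell can read a purely local file without HDFS by giving an explicit URI scheme (a sketch with hypothetical paths; the local file must exist at that path on every node that reads it):
scala> val localFile = sc.textFile("file:///usr/local/spark/README.md")   // local filesystem, no HDFS needed
scala> val hdfsFile = sc.textFile("hdfs:///user/hadoop/README.md")        // HDFS path, requires HDFS to be running
scala> localFile.count()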
So the cause of the problem should be:
Hadoop had not been started.
** I had forgotten that the machines were rebooted after a power outage.
So:
bin/hadoop namenode -format
(note: namenode -format re-initializes HDFS and wipes any existing data)
sbin/start-dfs.sh
sbin/start-yarn.sh
Then start Spark:
Go into the Spark directory and run sbin/start-all.sh
Then run bin/spark-shell and it starts successfully.
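As a final sanity check (my own sketch, not part of the original notes), a freshly started spark-shell on Spark 1.x should now have both sc and sqlContext pre-defined:
scala> sc.version                          // SparkContext is available again
scala> sqlContext                          // pre-created by the shell in Spark 1.x, so "not found: value sqlContext" is gone
scala> sc.textFile("README.md").count()    // the command from Problem 1 now works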