SBT-Scala
See the previous post: http://blog.youkuaiyun.com/baifanwudi/article/details/78354339
Configure the Hadoop environment variables.
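If the environment variable cannot be set globally (a common situation on Windows development machines), one frequently used workaround is to point Spark at the Hadoop installation from code, before the SparkSession is created. A minimal sketch, assuming a hypothetical install path C:\hadoop:

// Hypothetical Hadoop install path; adjust to your machine.
// Must run before any Spark/Hadoop classes are initialized.
System.setProperty("hadoop.home.dir", "C:\\hadoop")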
Create a new SBT project.
SBT configuration (build.sbt):
name := "SparkScalaTest"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.2.0"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.2.0"
libraryDependencies += "org.apache.spark" % "spark-sql-kafka-0-10_2.11" % "2.2.0"
libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.2.0"
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.1"
Test code:
import org.apache.spark.sql.SparkSession

object Test {
  def main(args: Array[String]): Unit = {
    // Local mode with 2 worker threads; no cluster required.
    val spark = SparkSession.builder().appName("LocalTest").master("local[2]").getOrCreate()

    // Read README.md as a Dataset[String], one element per line.
    val textFile = spark.read.textFile("README.md")
    println(textFile.count())
    println(textFile.first())

    // Count the lines that mention "Spark", reusing the filtered Dataset.
    val linesWithSpark = textFile.filter(line => line.contains("Spark"))
    println(linesWithSpark.count())

    spark.stop()
  }
}
Launch it locally to test: run the Test object from the IDE, or use sbt run from the project root.
Maven-Scala
For a Maven project, you also need to add the Scala library dependency (continuing from the Maven project in the previous post).
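A minimal sketch of the dependency entry in pom.xml, assuming the same Scala 2.11.8 as the sbt build above:

<!-- Scala standard library; the version must match the Scala version of your Spark artifacts -->
<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.8</version>
</dependency>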
Create a new scala directory and mark it as Source Root.
Then create a Scala test class; the test code is the same as above.