1. First install the Java SDK and configure the Java environment variables (JAVA_HOME, etc.).
2. Install Spark and Scala
They can be installed directly with Homebrew, or downloaded and installed manually:
$ brew update
$ brew info apache-spark
$ brew install apache-spark
$ brew install scala
$ brew install sbt
Scala can also be downloaded manually from http://www.scala-lang.org/download/
3. Install IntelliJ IDEA
Download from http://www.jetbrains.com/idea/download; the Community edition is free, so for personal use that is the one to get.
After installation, open IDEA and install the Scala and SBT plugins: Preferences --> Plugins --> Browse Repositories,
then search for "scala" and click Install.
4. Create a project
(1) File --> New Project, choose Scala with SBT.
Open File --> Project Structure --> Libraries, click the + button --> Scala SDK, and add the Scala SDK.
Edit the build.sbt file to add the Spark dependency; any libraries found missing at compile time can also be added here:
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.0"
The Scala version matching each Spark release is listed on the Spark website. Downloads can be slow, so a proxy or a local mirror repository helps; if a download is interrupted, trigger it again with the sbt command package.
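Putting the pieces together, a minimal build.sbt might look like the sketch below (the project name is a placeholder; pick a scalaVersion/Spark pair that matches per the Spark website):

```scala
// build.sbt -- minimal sketch; "spark-demo" is a placeholder project name
name := "spark-demo"
version := "0.1"
scalaVersion := "2.11.8"  // must match the _2.11 suffix in the artifact below

// Spark core; add spark-sql, spark-mllib, etc. as further lines here
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.0"
```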
(2) Create the project with Maven
Maven creates a Java project by default, so rename the java directory under src to scala, and mark that scala directory as the source root in the project settings.
Then edit the pom.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.wl</groupId>
    <artifactId>sparktest</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <spark.version>2.2.0</spark.version>
        <scala.version>2.11</scala.version>
        <hadoop.version>2.7.3</hadoop.version>
    </properties>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version}</artifactId>
            <version>${spark.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.39</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
        </dependency>
    </dependencies>

    <repositories>
        <repository>
            <id>central</id>
            <name>Maven Repository Switchboard</name>
            <layout>default</layout>
            <url>http://repo2.maven.org/maven2</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>

    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
        <plugins>
            <plugin>
                <!-- JDK version used by the Maven compiler -->
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.5</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
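To check that the project builds and the dependencies resolve, a small word-count program run against a local master is a useful smoke test. This is only a sketch; the object name and the sample lines are made up for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal setup smoke test: counts words in a tiny in-memory dataset.
// Uses a local master, so no cluster is required.
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val lines = sc.parallelize(Seq("hello spark", "hello scala"))
    val counts = lines
      .flatMap(_.split("\\s+"))   // split each line into words
      .map(word => (word, 1))     // pair each word with a count of 1
      .reduceByKey(_ + _)         // sum counts per word
      .collect()

    counts.foreach { case (word, n) => println(s"$word: $n") }
    sc.stop()
  }
}
```

Note that spark-core is marked provided in the pom above, which is correct for submitting via spark-submit; to run this directly from the IDE you may need to drop that scope (or configure the run configuration to include provided-scope dependencies).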