You can follow the Spark study notes here: https://github.com/MachineLP/Spark-
Word Count
Counting the number of occurrences of each word in a text is a popular first exercise using map-reduce.
The Task
Input: A text file consisting of words separated by spaces.
Output: A list of words and their counts, sorted from the most to the least common.
We will use the book "Moby Dick" as our input.
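Before turning to Spark, it may help to see the task in plain Python. The sketch below is only an illustration of the map and reduce steps, not the Spark code used in these notes; the function names (`map_words`, `reduce_counts`, `word_count`) and the sample lines are made up for this example.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every space-separated word."""
    return [(word, 1) for word in line.split(" ") if word]


def reduce_counts(pairs):
    """Reduce step: sum the 1s emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(lines):
    """Return (word, count) pairs sorted from most to least common."""
    pairs = [pair for line in lines for pair in map_words(line)]
    counts = reduce_counts(pairs)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample text, not taken from the book.
sample = ["the whale the sea", "the ship"]
result = word_count(sample)
print(result)
```

In Spark the same two steps are distributed across workers, which is why the task is usually phrased as a map followed by a reduce by key.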
# Start the SparkContext
from pyspark import SparkContext
sc = SparkContext(master="local[4]")  # run locally with 4 worker threads
# Add the Utils directory to the import path
import sys
sys.path.append('../../Utils/')
# from plan2table import plan2table