Basic Data Structures
Scala provides some nice collections.
See also: Effective Scala has opinions about how to use collections.
Lists
scala> val numbers = List(1, 2, 3, 4)
numbers: List[Int] = List(1, 2, 3, 4)
Sets
Sets have no duplicates.
scala> Set(1, 1, 2)
res0: scala.collection.immutable.Set[Int] = Set(1, 2)
Tuple
A tuple groups items into a simple logical collection without using a class.
scala> val hostPort = ("localhost", 80)
hostPort: (String, Int) = (localhost, 80)
Unlike case classes, tuples do not have named accessors. Instead, their accessors are named by position, and the position is 1-based rather than 0-based.
scala> hostPort._1
res0: String = localhost
scala> hostPort._2
res1: Int = 80
Tuples fit nicely with pattern matching.
hostPort match {
  case ("localhost", port) => ...
  case (host, port) => ...
}
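The match above leaves its case bodies elided. As a minimal runnable sketch of the same shape, the version below fills them in with placeholder strings; the helper name describe and the returned messages are assumptions made for illustration, not part of the lesson.

```scala
object TupleMatchSketch {
  // describe is a hypothetical helper, not part of the original lesson.
  def describe(hostPort: (String, Int)): String = hostPort match {
    // Matches only when the first element is the literal "localhost".
    case ("localhost", port) => s"local service on port $port"
    // Matches any other pair and binds both elements to names.
    case (host, port) => s"remote service at $host:$port"
  }
}
```

The first case is tried first, so a pair whose host is "localhost" never reaches the more general second case.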
There is special syntax for creating a tuple of two values: ->
scala> 1 -> 2
res0: (Int, Int) = (1,2)
See also: Effective Scala has opinions about destructuring bindings ("unpacking" a tuple).
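For readers who have not met a destructuring binding yet, here is a minimal sketch. It reuses the hostPort pair from the tuple example; the object name is only there to make the snippet self-contained.

```scala
object DestructuringSketch {
  val hostPort = ("localhost", 80)
  // A destructuring binding unpacks both elements in a single declaration,
  // so no positional accessors (_1, _2) are needed afterwards.
  val (host, port) = hostPort
}
```

After the binding, host and port are ordinary values of type String and Int.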
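Maps
A map can hold basic datatypes.
Map(1 -> 2)
Map("foo" -> "bar")
This looks like special syntax, but remember from the discussion above that -> simply creates a two-element tuple.
Map() also uses the variable-argument syntax we learned in Lesson 1: Map(1 -> "one", 2 -> "two") expands into Map((1, "one"), (2, "two")), where the first element of each tuple is the key and the second is the value.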
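A quick way to convince yourself of this equivalence is to build the same map both ways and compare. This is only a small sketch; the value names are made up for the example.

```scala
object MapPairsSketch {
  // Built with the -> syntax.
  val withArrows = Map(1 -> "one", 2 -> "two")
  // Built from explicit two-element tuples.
  val withTuples = Map((1, "one"), (2, "two"))
  // In both cases the first tuple element is the key used for lookup.
  val one = withArrows(1)
}
```

Both expressions produce equal maps, which is why -> is best read as tuple construction rather than as Map-specific syntax.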
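Maps can themselves contain maps, or even functions, as values. (The second example assumes a timesTwo function such as the one defined in the map section below.)
Map(1 -> Map("foo" -> "bar"))
Map("timesTwo" -> { timesTwo(_) })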
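To make the function-as-value case concrete, the sketch below stores a function in a map, looks it up by key, and applies it. It defines its own timesTwo so that it is self-contained, and it spells the function out as an explicit lambda with a type annotation on the map; both choices are assumptions made for clarity rather than the lesson's exact code.

```scala
object FunctionValuesSketch {
  // Same doubling function the lesson uses later; repeated here so the sketch stands alone.
  def timesTwo(i: Int): Int = i * 2

  // The map's values have type Int => Int.
  val operations: Map[String, Int => Int] =
    Map("timesTwo" -> ((i: Int) => timesTwo(i)))

  // Look the function up by key, then apply it to an argument.
  val result: Int = operations("timesTwo")(21)
}
```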
Option
Option is a container that may or may not hold a value.
The basic interface for Option looks like this:
trait Option[T] {
  def isDefined: Boolean
  def get: T
  def getOrElse(t: T): T
}
Option itself is generic and has two subclasses: Some[T] and None.
Let's look at an example of how Option is used:
Map.get uses Option as its return type. Option tells you that the method might not return what you asked for.
scala> val numbers = Map("one" -> 1, "two" -> 2)
numbers: scala.collection.immutable.Map[java.lang.String,Int] = Map(one -> 1, two -> 2)
scala> numbers.get("two")
res0: Option[Int] = Some(2)
scala> numbers.get("three")
res1: Option[Int] = None
Now our data appears to be trapped inside this Option. How do we get it out?
A first instinct might be to branch on the isDefined method.
// We want to multiply the number by two, otherwise return 0.
val result = if (res1.isDefined) {
  res1.get * 2
} else {
  0
}
We suggest using either getOrElse or pattern matching to work with this result.
getOrElse lets you easily define a default value.
val result = res1.getOrElse(0) * 2
Pattern matching fits naturally with Option.
val result = res1 match {
  case Some(n) => n * 2
  case None => 0
}
See also: Effective Scala has opinions about Options.
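Functional Combinators
List(1, 2, 3) map squared applies the function squared to each element of the list and returns a new list, List(1, 4, 9). We call operations like map combinators. (If you would like a better definition, you might like the explanation of combinators on Stack Overflow.) They are most often used on the standard data structures.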
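The text does not show a definition for squared. A minimal sketch, assuming it is a plain Int => Int function, could look like this:

```scala
object SquaredSketch {
  // squared is assumed to be a simple Int => Int function; the text does not define it.
  def squared(i: Int): Int = i * i

  // Infix form, as written in the text.
  val infix = List(1, 2, 3) map squared
  // Equivalent dot-call form.
  val dotted = List(1, 2, 3).map(squared)
}
```

Both forms call the same map method; the infix spelling is just a syntactic alternative.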
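map
map evaluates a function over each element in the list and returns a list of the results, with the same number of elements. (Here numbers is the List(1, 2, 3, 4) defined under Lists, not the Map from the Option example.)
scala> numbers.map((i: Int) => i * 2)
res0: List[Int] = List(2, 4, 6, 8)
or pass in a partially applied function
scala> def timesTwo(i: Int): Int = i * 2
timesTwo: (i: Int)Int
scala> numbers.map(timesTwo _)
res0: List[Int] = List(2, 4, 6, 8)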
foreach
foreach is like map but returns nothing. It is intended only for functions with side effects.
scala> numbers.foreach((i: Int) => i * 2)
It returns nothing.
You can try to store the return value, but it will be of type Unit (i.e. void).
scala> val doubled = numbers.foreach((i: Int) => i * 2)
doubled: Unit = ()
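The calls above compute i * 2 and then throw the result away, so nothing observable happens. A sketch of foreach doing something useful needs an actual side effect; here a mutable ListBuffer (an assumption chosen so the effect can be inspected) stands in for printing or writing to a log.

```scala
import scala.collection.mutable.ListBuffer

object ForeachSketch {
  val numbers = List(1, 2, 3, 4)
  // A mutable buffer makes the side effect observable.
  val seen = ListBuffer.empty[Int]
  // foreach runs the function for its effect on each element and returns Unit.
  val returned: Unit = numbers.foreach(i => seen += i * 2)
}
```

The doubled values end up in seen, while returned is still just ().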
filter
filter removes any elements for which the function you pass in evaluates to false. Functions that return a Boolean are often called predicate functions.
scala> numbers.filter((i: Int) => i % 2 == 0)
res0: List[Int] = List(2, 4)
scala> def isEven(i: Int): Boolean = i % 2 == 0
isEven: (i: Int)Boolean
scala> numbers.filter(isEven _)
res2: List[Int] = List(2, 4)
zip
zip aggregates the contents of two lists into a single list of pairs.
scala> List(1, 2, 3).zip(List("a", "b", "c"))
res0: List[(Int, String)] = List((1,a), (2,b), (3,c))
partition
partition splits a list in two according to a predicate function: the elements that satisfy it, and the elements that do not.
scala> val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> numbers.partition(_ % 2 == 0)
res0: (List[Int], List[Int]) = (List(2, 4, 6, 8, 10),List(1, 3, 5, 7, 9))
find
find returns the first element of a collection that matches a predicate function, wrapped in an Option.
scala> numbers.find((i: Int) => i > 5)
res0: Option[Int] = Some(6)
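Because find returns an Option, a search with no match yields None rather than an error. The sketch below shows both outcomes on the same ten-element list; the fallback value -1 is an arbitrary choice for the example.

```scala
object FindSketch {
  val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
  // At least one element is greater than 5, so the first match is wrapped in Some.
  val firstOverFive = numbers.find(_ > 5)
  // No element is greater than 10, so the result is None.
  val firstOverTen = numbers.find(_ > 10)
  // getOrElse supplies a default when nothing was found (-1 is arbitrary).
  val orDefault = firstOverTen.getOrElse(-1)
}
```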
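drop & dropWhile
drop drops the first i elements.
scala> numbers.drop(5)
res0: List[Int] = List(6, 7, 8, 9, 10)
dropWhile removes the leading elements that match a predicate function and stops at the first element that does not match. For example, if we dropWhile odd numbers from our numbers list, 1 gets dropped (but not 3, which is "shielded" by 2).
scala> numbers.dropWhile(_ % 2 != 0)
res0: List[Int] = List(2, 3, 4, 5, 6, 7, 8, 9, 10)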
foldLeft
scala> numbers.foldLeft(0)((m: Int, n: Int) => m + n)
res0: Int = 55
0 is the starting value (remember that numbers is a List[Int]), and m acts as an accumulator.
Seen visually:
scala> numbers.foldLeft(0) { (m: Int, n: Int) => println("m: " + m + " n: " + n); m + n }
m: 0 n: 1
m: 1 n: 2
m: 3 n: 3
m: 6 n: 4
m: 10 n: 5
m: 15 n: 6
m: 21 n: 7
m: 28 n: 8
m: 36 n: 9
m: 45 n: 10
res0: Int = 55
foldRight
foldRight is the same as foldLeft except that it runs in the opposite direction. Note that the parameter order is also flipped: here m is the current element and n is the accumulator.
scala> numbers.foldRight(0) { (m: Int, n: Int) => println("m: " + m + " n: " + n); m + n }
m: 10 n: 0
m: 9 n: 10
m: 8 n: 19
m: 7 n: 27
m: 6 n: 34
m: 5 n: 40
m: 4 n: 45
m: 3 n: 49
m: 2 n: 52
m: 1 n: 54
res0: Int = 55
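With addition both folds give the same total, which hides the difference in direction. A small sketch with subtraction, which is not associative, makes the grouping visible. The four-element list is chosen only to keep the arithmetic short.

```scala
object FoldDirectionSketch {
  val numbers = List(1, 2, 3, 4)
  // foldLeft groups from the left: (((0 - 1) - 2) - 3) - 4
  val left = numbers.foldLeft(0)((acc, n) => acc - n)
  // foldRight groups from the right: 1 - (2 - (3 - (4 - 0)))
  val right = numbers.foldRight(0)((n, acc) => n - acc)
}
```

Here left evaluates to -10 and right to -2, so the choice of fold matters whenever the operation is sensitive to grouping.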
flatten
flatten collapses one level of nested structure.
scala> List(List(1, 2), List(3, 4)).flatten
res0: List[Int] = List(1, 2, 3, 4)
flatMap
flatMap is a frequently used combinator that combines mapping and flattening. flatMap takes a function that works on the nested lists and then concatenates the results back together.
scala> val nestedNumbers = List(List(1, 2), List(3, 4))
nestedNumbers: List[List[Int]] = List(List(1, 2), List(3, 4))
scala> nestedNumbers.flatMap(x => x.map(_ * 2))
res0: List[Int] = List(2, 4, 6, 8)
Think of it as shorthand for mapping and then flattening:
scala> nestedNumbers.map((x: List[Int]) => x.map(_ * 2)).flatten
res1: List[Int] = List(2, 4, 6, 8)
Calling map and then immediately calling flatten, as in this example, shows the "combinator"-like nature of these functions.
See also: Effective Scala has opinions about flatMap.
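Generalized functional combinators
Now we have learned a grab-bag of functions for working with collections.
What we would like is to be able to write our own functional combinators.
Interestingly, every functional combinator shown above can be written on top of fold. Let's see some examples.
def ourMap(numbers: List[Int], fn: Int => Int): List[Int] = {
  numbers.foldRight(List[Int]()) { (x: Int, xs: List[Int]) =>
    fn(x) :: xs
  }
}
scala> ourMap(numbers, timesTwo(_))
res0: List[Int] = List(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
Why List[Int]()? Scala was not smart enough to realize that you wanted an empty list of Ints to accumulate into.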
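The text promises several examples but shows only ourMap. As a second sketch in the same style, filter can also be built on foldRight. The name ourFilter is hypothetical and simply mirrors ourMap; it is not from the original lesson.

```scala
object OurFilterSketch {
  // ourFilter is a hypothetical companion to ourMap, not part of the original lesson.
  def ourFilter(numbers: List[Int], pred: Int => Boolean): List[Int] =
    numbers.foldRight(List[Int]()) { (x: Int, xs: List[Int]) =>
      // Keep x only when the predicate holds; otherwise pass the accumulator through.
      if (pred(x)) x :: xs else xs
    }
}
```

Using foldRight with :: keeps the surviving elements in their original order, for the same reason it does in ourMap.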
Map?
All of the functional combinators shown work on Maps, too. A Map can be thought of as a list of pairs, so the functions you write operate on a pair of the key and the value.
scala> val extensions = Map("steve" -> 100, "bob" -> 101, "joe" -> 201)
extensions: scala.collection.immutable.Map[String,Int] = Map((steve,100), (bob,101), (joe,201))
Now keep only the entries whose phone extension is lower than 200.
scala> extensions.filter((namePhone: (String, Int)) => namePhone._2 < 200)
res0: scala.collection.immutable.Map[String,Int] = Map((steve,100), (bob,101))
Because the function is handed a tuple, you have to pull out the key and value with their positional accessors. Yuck!
Luckily, we can use a pattern match to extract the key and value more elegantly.
scala> extensions.filter({case (name, extension) => extension < 200})
res0: scala.collection.immutable.Map[String,Int] = Map((steve,100), (bob,101))
map works by applying a function to each element in the list.
scala> val l = List(1,2,3,4,5)
scala> l.map( x => x*2 )
res60: List[Int] = List(2, 4, 6, 8, 10)
There are some occasions where you want to return a sequence or list from the function, for example an Option:
scala> def f(x: Int) = if (x > 2) Some(x) else None
scala> l.map(x => f(x))
res63: List[Option[Int]] = List(None, None, Some(3), Some(4), Some(5))
flatMap works by applying a function that returns a sequence for each element in the list, and flattening the results into a single list. This is easier to show than to explain:
scala> def g(v:Int) = List(v-1, v, v+1)
g: (v: Int)List[Int]
scala> l.map(x => g(x))
res64: List[List[Int]] = List(List(0, 1, 2), List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 6))
scala> l.flatMap(x => g(x))
res65: List[Int] = List(0, 1, 2, 1, 2, 3, 2, 3, 4, 3, 4, 5, 4, 5, 6)
This comes in really useful with the built-in Option class, because an Option can be considered a sequence that is either empty or has one item.
scala> l.map(x => f(x))
res66: List[Option[Int]] = List(None, None, Some(3), Some(4), Some(5))
scala> l.flatMap(x => f(x))
res67: List[Int] = List(3, 4, 5)
So with that all covered, let's look at how you can apply those concepts to a Map. A map can be implemented in a number of different ways, but regardless of how it is implemented it can be thought of as a sequence of tuples, where each tuple is a pair of items: the key and the value.
scala> val m = Map(1 -> 2, 2 -> 4, 3 -> 6)
m: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 4, 3 -> 6)
scala> m.toList
res69: List[(Int, Int)] = List((1,2), (2,4), (3,6))
We can get at the elements of a tuple through its accessors _1 and _2:
scala> val t = (1,2)
t: (Int, Int) = (1,2)
scala> t._1
res70: Int = 1
scala> t._2
res71: Int = 2
So we want to think about using map and flatMap on our Map, but because of the way a map works it often doesn't make quite the same sense. We probably don't want to apply a function to the whole tuple, but only to the value side, leaving the key as is; for example, we might want to double all the values. Map provides us with a function to do exactly that, mapValues. (On Scala 2.13 and later, calling mapValues directly on a Map is deprecated and returns a lazy view rather than a Map, so the output there will not match the transcript below.)
scala> m.mapValues(v => v*2)
res73: scala.collection.immutable.Map[Int,Int] = Map(1 -> 4, 2 -> 8, 3 -> 12)
scala> m.mapValues(v => f(v))
res74: scala.collection.immutable.Map[Int,Option[Int]] = Map(1 -> None, 2 -> Some(4), 3 -> Some(6))
But in my case I wanted something more like flatMap: I want a map to come out that leaves out the key 1 because its value is None. flatMap doesn't work on maps the way mapValues does. It gets passed the tuple, and if the function returns a List of single items you'll get a list back, but if it returns tuples you'll get a Map back.
scala> m.flatMap(e => List(e._2))
res85: scala.collection.immutable.Iterable[Int] = List(2, 4, 6)
scala> m.flatMap(e => List(e))
res86: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 4, 3 -> 6)
OK, so we are pretty close to using Options with flatMap. We need to filter out our Nones. We could do that with just e => f(e._2), which gives us the list of values without the Nones, but that isn't really what I want. What I need to do is return an Option containing a tuple. So here's our updated function:
scala> def h(k:Int, v:Int) = if (v > 2) Some(k->v) else None
h: (k: Int, v: Int)Option[(Int, Int)]
and here's how we might call it:
scala> m.flatMap ( e => h(e._1,e._2) )
res109: scala.collection.immutable.Map[Int,Int] = Map(2 -> 4, 3 -> 6)
But this is pretty ugly; all those _1s and _2s make me sad. If only there was a nice way of unapplying the tuple into variables. Given that this works in Python and in a number of places in Scala, I thought this code should work:
scala> m.flatMap ( (k,v) => h(k,v) )
<console>:10: error: wrong number of parameters; expected = 1
I spent way too long today looking at this (in 5 minute chunks broken by meetings, to be fair), before I gave in and asked a coworker what the hell I was missing. The answer, it seems, is that an unapply is normally only executed in a pattern match, and the easiest way to pass a pattern match as a function in Scala is a block of case statements. So this is the code that works as expected:
scala> m.flatMap { case (k,v) => h(k,v) }
res108: scala.collection.immutable.Map[Int,Int] = Map(2 -> 4, 3 -> 6)
Note that we switch to using curly braces, indicating a function block rather than parameters, and the function is a case statement. The block we pass to flatMap is a pattern-matching anonymous function: each entry is matched against the case, and the tuple's contents are extracted into the variables k and v. Since the pattern (k,v) matches every entry of the map, the function runs for all of them. This form of variable extraction is very common, and you'll see it used a lot.
There is of course another way of writing that code that doesn't use flatMap. Since what we are doing is removing all members of the map that don't match a predicate, this is a job for the filter method:
scala> m.filter( e => f(e._2) != None )
res114: scala.collection.immutable.Map[Int,Int] = Map(2 -> 4, 3 -> 6)
scala> m.filter { case (k,v) => f(v) != None }
res115: scala.collection.immutable.Map[Int,Int] = Map(2 -> 4, 3 -> 6)
scala> m.filter { case (k,v) => f(v).isDefined }
res116: scala.collection.immutable.Map[Int,Int] = Map(2 -> 4, 3 -> 6)