In Scala, a for expression is typically used in one of the following forms:
for (p <- e) e'
for (p <- e if g) e'
for (p <- e; p' <- e' ...) e''
and the corresponding yield forms:
for (p <- e) yield e'
for (p <- e if g) yield e'
for (p <- e; p' <- e' ...) yield e''
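Here p and p' are Scala patterns; e, e' and e'' are expressions; g is a Boolean expression.
According to The Scala Language Specification, Version 2.7, these for expressions are expanded at compile time into the following forms (ignoring the case where p is a more complex pattern):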
for (p <- e) e' => e.foreach { case p => e' }
for (p <- e if g) e' => for (p <- e.filter{ (x1,...,xn) => g }) e' => ..
for (p <- e; p' <- e' ...) e'' => e.foreach{ case p => for (p' <- e' ...) e'' }
and the corresponding yield forms:
for (p <- e) yield e' => e.map { case p => e' }
for (p <- e if g) yield e' => for (p <- e.filter{ (x1,...,xn) => g }) yield e'
for (p <- e; p' <- e' ...) yield e'' => e.flatMap { case p => for (p' <- e' ...) yield e'' }
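As a quick check of the expansion rules, the sketch below puts one for expression next to a hand-expanded version of it. The names DesugarDemo, xs and ys are made up for illustration. The expansion is written with filter, as in the 2.7 rules quoted above; for a side-effect-free guard like this one the result should be the same on later versions.

```scala
object DesugarDemo {
  val xs = List(1, 2, 3)
  val ys = List(10, 20)

  // Sugared form: a guarded generator, a second generator, and a yield.
  val sugared: List[Int] =
    for (x <- xs if x % 2 == 1; y <- ys) yield x * y

  // Hand-expanded form: the guard becomes filter, the outer generator
  // becomes flatMap, and the innermost generator becomes map.
  val desugared: List[Int] =
    xs.filter(x => x % 2 == 1).flatMap(x => ys.map(y => x * y))

  def main(args: Array[String]): Unit = {
    println(sugared)
    println(desugared)
  }
}
```

Both values should hold the same list, which is the point of the rules: the for syntax adds nothing beyond these method calls.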
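Note that this translation happens before type checking. In other words, there are no further constraints on the signatures of map, filter, flatMap and foreach; the expanded code only has to pass type checking (personally I find this somewhat similar to macro expansion in C). Once we understand how the Scala compiler translates for expressions, we can give them a custom meaning.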
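To illustrate that the compiler only needs methods with the right names, here is a minimal sketch of a user-defined type that works in for expressions. Box is a hypothetical class invented for this example, not part of the Scala library, and it implements only the methods this particular example needs.

```scala
// Hypothetical single-element container; it extends no collection trait.
final class Box[A](val value: A) {
  def map[B](f: A => B): Box[B] = new Box(f(value))
  def flatMap[B](f: A => Box[B]): Box[B] = f(value)
  def foreach[U](f: A => U): Unit = { f(value); () }
}

object BoxDemo {
  // Expands to: new Box(1).flatMap(a => new Box(2).map(b => a + b))
  val sum: Box[Int] =
    for (a <- new Box(1); b <- new Box(2)) yield a + b

  def main(args: Array[String]): Unit = {
    // Expands to: sum.foreach(v => println(v))
    for (v <- sum) println(v)
  }
}
```

Because no guard is used, Box does not need filter or withFilter; adding an if to one of the generators would require one of them.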
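Note that in versions before Scala 2.8, for (p <- e if g) and for (p <- e) { if (g) ... } are not the same. The former makes two passes (one to build the filtered collection, one to run the body over it), while the latter makes only one. They usually produce the same result, but on large collections the difference in performance becomes noticeable. For example, summing the even numbers below 1000000: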
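// Guard inside the generator: Scala 2.7 expands this to set.filter(...).foreach(...).
def innerif(m: Int): Unit = {
  val set = 1 until m
  var sum = 0L // Long, because the total exceeds Int.MaxValue for m = 1000000
  for (num <- set; if (num % 2 == 0)) sum += num
}
// Test inside the loop body: a single pass over the range.
def outerif(m: Int): Unit = {
  val set = 1 until m
  var sum = 0L
  for (num <- set) { if (num % 2 == 0) sum += num }
}
// Runs body(m) n times and prints the average wall-clock time in milliseconds.
def testMethod(n: Int)(m: Int)(body: Int => Unit): Unit = {
  var avgMilliSec = 0.0
  for (i <- 1 to n) {
    val start = System.currentTimeMillis()
    body(m)
    val end = System.currentTimeMillis()
    val time = end - start
    // Incremental mean over the first i runs.
    avgMilliSec = ((i - 1) * avgMilliSec + time) / i
  }
  println("avg time: " + avgMilliSec)
}
testMethod(10)(1000000)(innerif)
testMethod(10)(1000000)(outerif)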
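Measured in the Scala 2.7.7 final REPL, innerif averaged about 142.2 ms and outerif about 15.7 ms.
In addition, when g and e' both refer to a variable v and v is modified while the loop runs (in the example below, by e'), the actual result may differ from what we expect. Consider the following example: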
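// Guard in the generator; reads r, which the body modifies.
def compress1[T](l: List[T]): List[T] = {
  var r = List(l.head) // head instead of the 2.7-era first, which newer Scala releases no longer provide
  for (x <- l; if (x != r.last)) r = r ::: List(x)
  r
}
// Same test, but inside the loop body.
def compress2[T](l: List[T]): List[T] = {
  var r = List(l.head)
  for (x <- l) if (x != r.last) r = r ::: List(x)
  r
}
val cl = List('a, 'a, 'a, 'a, 'b, 'c, 'c, 'a, 'a, 'd, 'e, 'e, 'e, 'e)
compress1(cl)
compress2(cl)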
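The purpose of this example is to remove adjacent duplicates from a List. For List('a, 'a, 'a, 'a, 'b, 'c, 'c, 'a, 'a, 'd, 'e, 'e, 'e, 'e), the result should be List('a, 'b, 'c, 'a, 'd, 'e). In compress1 the guard is evaluated against one and the same instance of r: the whole list is filtered first, while r is still List('a), and only then does the body run. Under the 2.7 expansion this removes every 'a and keeps the other duplicates, giving List('a, 'b, 'c, 'c, 'd, 'e, 'e, 'e, 'e). compress2 filters as it goes, so r may be different each time the test runs, and the result is the expected one.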
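The difference can be reproduced on a current compiler by writing the two expansions out by hand. This is only a sketch: compressStrict and compressLazy are names made up here, strings stand in for the symbol literals, and the first method imitates the 2.7 translation rather than being the code the 2.7 compiler actually emitted.

```scala
object CompressDemo {
  // Imitates the 2.7 expansion: filter runs to completion before the body,
  // so the guard only ever sees the initial value of r.
  def compressStrict[T](l: List[T]): List[T] = {
    var r = List(l.head)
    l.filter(x => x != r.last).foreach(x => r = r ::: List(x))
    r
  }

  // Imitates the 2.8+ expansion: withFilter tests each element right before
  // the body runs on it, so the guard sees the updated r.
  def compressLazy[T](l: List[T]): List[T] = {
    var r = List(l.head)
    l.withFilter(x => x != r.last).foreach(x => r = r ::: List(x))
    r
  }

  val input: List[String] =
    List("a", "a", "a", "a", "b", "c", "c", "a", "a", "d", "e", "e", "e", "e")

  def main(args: Array[String]): Unit = {
    println(compressStrict(input))
    println(compressLazy(input))
  }
}
```

Only the second variant removes all adjacent duplicates; the first drops every "a" after the head and leaves the repeated "c" and "e" in place.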
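The way Scala 2.7 treats for (p <- e; if g) clearly goes against the habits of C and Java programmers, so Scala 2.8 changed it: for (p <- e; if g) now has the same effect as for (p <- e) { if (g) ... }. Why only "the same effect"? Because Scala 2.8 translates it to for (p <- e.withFilter(...)). Like filter, withFilter is defined in scala.collection.TraversableLike and takes a function of type A => Boolean as its argument. The difference is that it does not build a new collection from the elements that pass the test; it returns an instance of the WithFilter class instead. WithFilter acts as a proxy for the original collection: its map, flatMap and foreach implementations apply the predicate to each element of the original collection as they traverse it. This gives the effect of for (p <- e) { if (g) ... }. If you are interested, see the post Martin Odersky wrote about this issue on scala-lang.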
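The proxy behaviour can be made visible by logging when the predicate and the body run. The sketch below is illustrative only; FilterOrderDemo and trace are names invented for this example.

```scala
import scala.collection.mutable.ListBuffer

object FilterOrderDemo {
  // Returns the order in which the predicate ("test") and the body ("body") ran.
  def trace(useWithFilter: Boolean): List[String] = {
    val log = ListBuffer.empty[String]
    val xs = List(1, 2, 3)
    val p = (x: Int) => { log += s"test $x"; x != 2 }
    val body = (x: Int) => { log += s"body $x"; () }
    if (useWithFilter) xs.withFilter(p).foreach(body) // test and body interleaved
    else xs.filter(p).foreach(body)                   // all tests first, then all bodies
    log.toList
  }

  def main(args: Array[String]): Unit = {
    println(trace(useWithFilter = false))
    println(trace(useWithFilter = true))
  }
}
```

With filter, every test is logged before the first body call; with withFilter, each element is tested immediately before the body runs on it, and no intermediate list is built.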
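scala.collection.TraversableLike is one of the base traits of the Scala collections framework. It defines many of the basic collection operations, such as flatMap, map, foreach, filter and withFilter. WithFilter is defined inside TraversableLike and extends scala.collection.generic.FilterMonadic, which declares four methods: foreach, map, flatMap and withFilter.
Appendix: source code of the WithFilter class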
/** A class supporting filtered operations. Instances of this class are
 *  returned by method `withFilter`.
 */
class WithFilter(p: A => Boolean) extends FilterMonadic[A, Repr] {

  /** Builds a new collection by applying a function to all elements of the
   *  outer $coll containing this `WithFilter` instance that satisfy predicate `p`.
   *
   *  @param f      the function to apply to each element.
   *  @tparam B     the element type of the returned collection.
   *  @tparam That  $thatinfo
   *  @param bf     $bfinfo
   *  @return       a new collection of type `That` resulting from applying
   *                the given function `f` to each element of the outer $coll
   *                that satisfies predicate `p` and collecting the results.
   *
   *  @usecase def map[B](f: A => B): $Coll[B]
   *
   *  @return       a new $coll resulting from applying the given function
   *                `f` to each element of the outer $coll that satisfies
   *                predicate `p` and collecting the results.
   */
  def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    val b = bf(repr)
    for (x <- self)
      if (p(x)) b += f(x)
    b.result
  }

  /** Builds a new collection by applying a function to all elements of the
   *  outer $coll containing this `WithFilter` instance that satisfy
   *  predicate `p` and concatenating the results.
   *
   *  @param f      the function to apply to each element.
   *  @tparam B     the element type of the returned collection.
   *  @tparam That  $thatinfo
   *  @param bf     $bfinfo
   *  @return       a new collection of type `That` resulting from applying
   *                the given collection-valued function `f` to each element
   *                of the outer $coll that satisfies predicate `p` and
   *                concatenating the results.
   *
   *  @usecase def flatMap[B](f: A => TraversableOnce[B]): $Coll[B]
   *
   *  @return       a new $coll resulting from applying the given collection-valued function
   *                `f` to each element of the outer $coll that satisfies predicate `p` and concatenating the results.
   */
  def flatMap[B, That](f: A => GenTraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    val b = bf(repr)
    for (x <- self)
      if (p(x)) b ++= f(x).seq
    b.result
  }

  /** Applies a function `f` to all elements of the outer $coll containing
   *  this `WithFilter` instance that satisfy predicate `p`.
   *
   *  @param f      the function that is applied for its side-effect to every element.
   *                The result of function `f` is discarded.
   *
   *  @tparam U     the type parameter describing the result of function `f`.
   *                This result will always be ignored. Typically `U` is `Unit`,
   *                but this is not necessary.
   *
   *  @usecase def foreach(f: A => Unit): Unit
   */
  def foreach[U](f: A => U): Unit =
    for (x <- self)
      if (p(x)) f(x)

  /** Further refines the filter for this $coll.
   *
   *  @param q      the predicate used to test elements.
   *  @return       an object of class `WithFilter`, which supports
   *                `map`, `flatMap`, `foreach`, and `withFilter` operations.
   *                All these operations apply to those elements of this $coll which
   *                satisfy the predicate `q` in addition to the predicate `p`.
   */
  def withFilter(q: A => Boolean): WithFilter =
    new WithFilter(x => p(x) && q(x))
}
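The withFilter method at the end of the listing is what lets several guards on one generator be combined. As a small sketch of how that looks from the caller's side (ChainedGuardDemo and its fields are hypothetical names, and the explicit chain is a hand-written approximation of the expansion):

```scala
object ChainedGuardDemo {
  val xs: List[Int] = (1 to 20).toList

  // Two guards on the same generator.
  val viaFor: List[Int] =
    for (x <- xs if x % 2 == 0 if x % 3 == 0) yield x

  // Hand-written equivalent: each guard becomes one withFilter call, and the
  // second call refines the first, so an element must pass both predicates.
  val viaCalls: List[Int] =
    xs.withFilter(_ % 2 == 0).withFilter(_ % 3 == 0).map(x => x)

  def main(args: Array[String]): Unit = {
    println(viaFor)
    println(viaCalls)
  }
}
```

Both values should contain only the multiples of 6 up to 20.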