Scala: Filtering a Map

This article shows how to use the retain and filter methods in Scala to filter and modify the elements of a Map, including how to express the filtering condition as an anonymous function or as a custom function.

       Problem: You want to filter the elements contained in a map, either by directly modifying a mutable map, or by applying a filtering algorithm to an immutable map to create a new map.

       Solution: Use the retain method to define the elements to retain when using a mutable map, and use filterKeys or filter to filter the elements in a mutable or immutable map, remembering to assign the result to a new variable.

       You can filter the elements in a mutable Map using the retain method to specify which elements should be retained.

scala> var x = collection.mutable.Map(1 -> "a", 2 -> "b", 3 -> "c")
x: scala.collection.mutable.Map[Int,String] = Map(2 -> b, 1 -> a, 3 -> c)

scala> x.retain((k,v) => k > 1)
res0: scala.collection.mutable.Map[Int,String] = Map(2 -> b, 3 -> c)

scala> x
res1: scala.collection.mutable.Map[Int,String] = Map(2 -> b, 3 -> c)

        As shown, retain modifies a mutable map in place. As implied by the anonymous function signature used in that example:

(k,v) => ...
     

        Your algorithm can test both the key and value of each element to decide which elements to retain in the map.
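
        For example, the predicate can test the key and the value together. Here is a minimal sketch (assuming the same pre-2.13 Scala REPL as above; in Scala 2.13 and later, retain on mutable maps was renamed filterInPlace):

scala> val y = collection.mutable.Map(1 -> "a", 2 -> "b", 3 -> "c")
y: scala.collection.mutable.Map[Int,String] = Map(2 -> b, 1 -> a, 3 -> c)

// keep elements whose key is greater than 1 and whose value is not "c"
scala> y.retain((k,v) => k > 1 && v != "c")
res2: scala.collection.mutable.Map[Int,String] = Map(2 -> b)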

        In a related note, the transform method doesn’t filter a map, but it lets you transform the elements in a mutable map:


scala> x.transform((k,v) => v.toUpperCase)
res0: scala.collection.mutable.Map[Int,String] = Map(2 -> B, 3 -> C)

scala> x
res1: scala.collection.mutable.Map[Int,String] = Map(2 -> B, 3 -> C)

     Depending on your definition of “filter,” you can also remove elements from a map using methods like remove and clear.
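
     For example, on a mutable map, remove deletes the element with a given key (returning the old value as an Option), and clear removes all elements. A small sketch:

scala> val m = collection.mutable.Map(1 -> "a", 2 -> "b", 3 -> "c")
m: scala.collection.mutable.Map[Int,String] = Map(2 -> b, 1 -> a, 3 -> c)

scala> m.remove(1)
res0: Option[String] = Some(a)

scala> m.clear()

scala> m
res1: scala.collection.mutable.Map[Int,String] = Map()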

    

Mutable and immutable maps

      When working with a mutable or immutable map, you can use a predicate with the filterKeys method to define which map elements to retain. When using this method, remember to assign the filtered result to a new variable:

scala> val x = Map(1 -> "a", 2 -> "b", 3 -> "c")
x: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)

scala> val y = x.filterKeys(_ > 2)
y: scala.collection.immutable.Map[Int,String] = Map(3 -> c)
   

   

       The predicate you supply should return true for the elements you want to keep in the new collection and false for the elements you don’t want.

        If your algorithm is longer, you can define a function (or method) and then use it in the filterKeys call, rather than using an anonymous function. First define your method, such as this one, which returns true when the key it is given is 1:

      
scala> def only1(i: Int) = if (i == 1) true else false
only1: (i: Int)Boolean

      

         Then pass the method to the filterKeys method:


scala> val x = Map(1 -> "a", 2 -> "b", 3 -> "c")
x: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)

scala> val y = x.filterKeys(only1)
y: scala.collection.immutable.Map[Int,String] = Map(1 -> a)

       

          In an interesting use, you can also use a Set with filterKeys to define the elements to retain (this works because a Set[Int] can itself be used as an Int => Boolean predicate):


scala> var m = Map(1 -> "a", 2 -> "b", 3 -> "c")
m: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)

scala> val newMap = m.filterKeys(Set(2,3))
newMap: scala.collection.immutable.Map[Int,String] = Map(2 -> b, 3 -> c)


             Beyond filterKeys, the Map version of the filter method lets you filter the map elements by key, value, or both. The filter method provides your predicate a Tuple2, so you can access the key and value as shown in these examples:

            

scala> var m = Map(1 -> "a", 2 -> "b", 3 -> "c")
m: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)

// access the key
scala> m.filter((t) => t._1 > 1)
res0: scala.collection.immutable.Map[Int,String] = Map(2 -> b, 3 -> c)

// access the value
scala> m.filter((t) => t._2 == "c")
res1: scala.collection.immutable.Map[Int,String] = Map(3 -> c)
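
// access both the key and the value (an illustrative line in the same style as the session above)
scala> m.filter((t) => t._1 > 1 && t._2 != "c")
res2: scala.collection.immutable.Map[Int,String] = Map(2 -> b)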

            You can also define your own method, say fx(), and pass it the key and value from the Tuple2:

val xxx = x.filter { case (k, v) => fx(k, v) }
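
           For instance, fx might look like the following. This is just a hypothetical predicate (any method of type (Int, String) => Boolean would do), shown against the immutable map x defined earlier:

// a hypothetical predicate: keep entries whose key is odd and whose value is non-empty
scala> def fx(k: Int, v: String) = k % 2 == 1 && v.nonEmpty
fx: (k: Int, v: String)Boolean

scala> val xxx = x.filter { case (k, v) => fx(k, v) }
xxx: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 3 -> c)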


           The take method lets you “take” (keep) the first N elements from the map:

scala> m.take(2)
res2: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b)


