1. Find the three most frequent words (plain list)
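object Main extends App {
  val list1 = List("hello scala","hello cindy","hello alice","scala cindy","scala alice","hello")
  // Split every string on spaces and flatten into a single list of words
  val list2 = list1.flatMap(_.split(" "))
  println(list2)
  // Group identical words together
  val list3 = list2.groupBy(i => i)
  println(list3)
  // Replace each group with its size, giving word -> count
  val list4 = list3.map(i => (i._1,i._2.length))
  println(list4)
  // Sort by count in ascending order
  val list5 = list4.toList.sortBy(_._2)
  println(list5)
  // The last three entries are the three most frequent words
  println(list5.takeRight(3))
}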
Output:
List(hello, scala, hello, cindy, hello, alice, scala, cindy, scala, alice, hello)
Map(alice -> List(alice, alice), scala -> List(scala, scala, scala), cindy -> List(cindy, cindy), hello -> List(hello, hello, hello, hello))
Map(alice -> 2, scala -> 3, cindy -> 2, hello -> 4)
List((alice,2), (cindy,2), (scala,3), (hello,4))
List((cindy,2), (scala,3), (hello,4))
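The same pipeline can be folded into one reusable function. The following is only a sketch: the object name `TopWords` and the helper `topWords` are hypothetical and do not appear in the notes above. It also sorts in descending order and breaks ties alphabetically, which is an assumption of this sketch. The original code keeps ties in whatever order the Map happens to produce, so it returns `cindy` rather than `alice` for the third place.

```scala
object TopWords {
  // Hypothetical helper (not part of the original notes): returns the n most
  // frequent words, most frequent first.
  def topWords(lines: List[String], n: Int): List[(String, Int)] =
    lines
      .flatMap(_.split(" "))   // split each line into words
      .groupBy(identity)       // word -> list of its occurrences
      .map { case (word, occurrences) => (word, occurrences.length) }
      .toList
      // Assumption: count descending, ties broken alphabetically
      .sortBy { case (word, count) => (-count, word) }
      .take(n)

  def main(args: Array[String]): Unit = {
    val lines = List("hello scala", "hello cindy", "hello alice", "scala cindy", "scala alice", "hello")
    println(topWords(lines, 3)) // prints List((hello,4), (scala,3), (alice,2))
  }
}
```

Sorting on the pair `(-count, word)` makes the result deterministic, which is convenient when several words share the same count.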
2. Find the three most frequent words (list of tuples)
Method 1: expand the tuples into a plain list of words first, then group and count the words, then sort
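object Main extends App {
  val list1 = List(("hello scala",1),("hello cindy",2),("hello alice",1),("scala cindy",2),("scala alice",3),("hello",1))
  println(list1)
  // Repeat each string as many times as its count, with a space after every copy
  val list2 = list1.map(kv => (kv._1.trim+" ")*kv._2)
  println(list2)
  // Split the repeated strings on spaces and flatten into a single list of words
  val list3 = list2.flatMap(i => i.split(" "))
  println(list3)
  // Group identical words together
  val list4 = list3.groupBy(i => i)
  println(list4)
  // Replace each group with its size, giving (word, count)
  val list5 = list4.toList.map(kv => (kv._1,kv._2.length))
  println(list5)
  // Sort by count in descending order; list6.take(3) would keep only the top three
  val list6 = list5.sortBy(-_._2)
  println(list6)
}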
Output:
List((hello scala,1), (hello cindy,2), (hello alice,1), (scala cindy,2), (scala alice,3), (hello,1))
List(hello scala , hello cindy hello cindy , hello alice , scala cindy scala cindy , scala alice scala alice scala alice , hello )
List(hello, scala, hello, cindy, hello, cindy, hello, alice, scala, cindy, scala, cindy, scala, alice, scala, alice, scala, alice, hello)
Map(alice -> List(alice, alice, alice, alice), scala -> List(scala, scala, scala, scala, scala, scala), cindy -> List(cindy, cindy, cindy, cindy), hello -> List(hello, hello, hello, hello, hello))
List((alice,4), (scala,6), (cindy,4), (hello,5))
List((scala,6), (hello,5), (alice,4), (cindy,4))
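Method 1 depends on string repetition: in Scala, `*` on a String repeats it, and Java's `split` drops trailing empty strings, so the extra space at the end does no harm. The sketch below isolates that one step. The names `RepeatDemo` and `expand` are made up for illustration, and it assumes the count is at least 1.

```scala
object RepeatDemo {
  // Hypothetical helper: repeat `line` `times` times, then split into words.
  // Assumes times >= 1.
  def expand(line: String, times: Int): List[String] =
    ((line.trim + " ") * times) // e.g. "hello cindy hello cindy "
      .split(" ")               // trailing empty strings are dropped by split
      .toList

  def main(args: Array[String]): Unit = {
    println(expand("hello cindy", 2)) // prints List(hello, cindy, hello, cindy)
    println(expand("hello", 1))       // prints List(hello)
  }
}
```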
Method 2: flatMap into a List of (word, count) tuples first, then group and count the words, then sort and print
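object Main extends App {
  val list1 = List(("hello scala",1),("hello cindy",2),("hello alice",1),("scala cindy",2),("scala alice",3),("hello",1))
  println(list1)
  // Process each tuple in list1, then flatten the results
  val list2:List[(String,Int)] = list1.flatMap(
    tuple => {
      // Split the first element of the tuple on spaces, giving an array of words
      val strings = tuple._1.split(" ")
      println(strings.toList)
      // Pair every word with the tuple's count, giving an array of (word, count) tuples
      strings.map(i => (i,tuple._2))
    }
  )
  println(list2)
  // Repeat each word as many times as its count, with a space after every copy
  val list3 = list2.map(kv => (kv._1+" ")*kv._2)
  println(list3)
  // Split on spaces and flatten into a single list of words
  val list4 = list3.flatMap(i => i.split(" "))
  println(list4)
  // Group identical words together
  val list5 = list4.groupBy(i => i)
  println(list5)
  // Replace each group with its size, giving word -> count
  val list6 = list5.map(kv => (kv._1,kv._2.length))
  println(list6)
  // Sort by count in descending order
  val list7 = list6.toList.sortWith((a,b) => a._2>b._2)
  println(list7)
}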
Output:
List((hello scala,1), (hello cindy,2), (hello alice,1), (scala cindy,2), (scala alice,3), (hello,1))
List(hello, scala)
List(hello, cindy)
List(hello, alice)
List(scala, cindy)
List(scala, alice)
List(hello)
List((hello,1), (scala,1), (hello,2), (cindy,2), (hello,1), (alice,1), (scala,2), (cindy,2), (scala,3), (alice,3), (hello,1))
List(hello , scala , hello hello , cindy cindy , hello , alice , scala scala , cindy cindy , scala scala scala , alice alice alice , hello )
List(hello, scala, hello, hello, cindy, cindy, hello, alice, scala, scala, cindy, cindy, scala, scala, scala, alice, alice, alice, hello)
Map(alice -> List(alice, alice, alice, alice), scala -> List(scala, scala, scala, scala, scala, scala), cindy -> List(cindy, cindy, cindy, cindy), hello -> List(hello, hello, hello, hello, hello))
Map(alice -> 4, scala -> 6, cindy -> 4, hello -> 5)
List((scala,6), (hello,5), (alice,4), (cindy,4))
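Once the data is a list of (word, count) tuples, each word can be repeated without building a string and splitting it again. This is a minimal sketch of that alternative using `List.fill`; the names `FillDemo` and `expandPairs` are hypothetical and not part of the notes above.

```scala
object FillDemo {
  // Hypothetical helper: emit each word `count` times, with no string round trip.
  def expandPairs(pairs: List[(String, Int)]): List[String] =
    pairs.flatMap { case (word, count) => List.fill(count)(word) }

  def main(args: Array[String]): Unit = {
    println(expandPairs(List(("hello", 1), ("cindy", 2), ("scala", 3))))
    // prints List(hello, cindy, cindy, scala, scala, scala)
  }
}
```

Unlike the string-based version, this form also copes with a count of 0: `List.fill(0)(word)` is simply empty.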
Method 3: flatMap into a List of (word, count) tuples first, then group by word, sum the counts for each word, and finally sort and print
object Main extends App {
  val list1 = List(("hello scala",1),("hello cindy",2),("hello alice",1),("scala cindy",2),("scala alice",3),("hello",1))
  println(list1)
  // Process each tuple in list1, then flatten the results
  val list2:List[(String,Int)] = list1.flatMap(
    tuple => {
      // Split the first element of the tuple on spaces, giving an array of words
      val strings = tuple._1.split(" ")
      println(strings.toList)
      // Pair every word with the tuple's count, giving an array of (word, count) tuples
      strings.map(i => (i,tuple._2))
    }
  )
  println(list2)
  // Group list2 by word
  val list3 = list2.groupBy(kv => kv._1)
  println(list3)
  // For each word, sum the counts of all tuples in its group (a per-word subtotal).
  // On Scala 2.13+, mapValues on a Map is deprecated in favor of .view.mapValues(...)
  val list4 = list3.mapValues(list => list.map(_._2).sum)
  // Sort by total in descending order
  println(list4.toList.sortWith((a,b) => a._2 > b._2))
}
Output:
List((hello scala,1), (hello cindy,2), (hello alice,1), (scala cindy,2), (scala alice,3), (hello,1))
List(hello, scala)
List(hello, cindy)
List(hello, alice)
List(scala, cindy)
List(scala, alice)
List(hello)
List((hello,1), (scala,1), (hello,2), (cindy,2), (hello,1), (alice,1), (scala,2), (cindy,2), (scala,3), (alice,3), (hello,1))
Map(alice -> List((alice,1), (alice,3)), scala -> List((scala,1), (scala,2), (scala,3)), cindy -> List((cindy,2), (cindy,2)), hello -> List((hello,1), (hello,2), (hello,1), (hello,1)))
List((scala,6), (hello,5), (alice,4), (cindy,4))
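The group-then-sum idea of Method 3 can also be written as a single fold that accumulates the totals in a Map as it goes. This is only a sketch of that variation, not the approach used above. The names `WeightedWordCount` and `weightedTop` are hypothetical, and the alphabetical tie-break is an assumption added to make the result deterministic.

```scala
object WeightedWordCount {
  // Hypothetical helper: add each line's weight to every word on that line,
  // then return the n words with the largest totals.
  def weightedTop(pairs: List[(String, Int)], n: Int): List[(String, Int)] = {
    val totals = pairs.foldLeft(Map.empty[String, Int]) { case (acc, (line, weight)) =>
      line.split(" ").foldLeft(acc) { (m, word) =>
        m.updated(word, m.getOrElse(word, 0) + weight)
      }
    }
    // Assumption: total descending, ties broken alphabetically
    totals.toList.sortBy { case (word, total) => (-total, word) }.take(n)
  }

  def main(args: Array[String]): Unit = {
    val data = List(("hello scala", 1), ("hello cindy", 2), ("hello alice", 1),
                    ("scala cindy", 2), ("scala alice", 3), ("hello", 1))
    println(weightedTop(data, 3)) // prints List((scala,6), (hello,5), (alice,4))
  }
}
```

The fold avoids the intermediate list of tuples and the grouped Map, and the final `take(n)` returns exactly the top three that the heading asks for.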