flatten expands a nested collection by one level, concatenating the inner collections into a single one.
List(List(1,2),List(3,4)).flatten
Result: List[Int] = List(1, 2, 3, 4)
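A few more cases can make the "one level" behaviour concrete. This is only a minimal sketch in plain Scala; the object name FlattenSketch and the sample values are made up for illustration.

```scala
object FlattenSketch {
  // flatten removes exactly one level of nesting per call.
  val twoLevels: List[List[List[Int]]] = List(List(List(1, 2), List(3)), List(List(4)))
  val oneLevelRemoved: List[List[Int]] = twoLevels.flatten
  val fullyFlat: List[Int] = oneLevelRemoved.flatten

  // An Option acts like a collection of zero or one elements,
  // so flattening a list of Options drops the None entries.
  val present: List[Int] = List(Some(1), None, Some(3)).flatten

  // flatMap(f) gives the same result as map(f).flatten.
  val viaFlatMap: List[String] = List("a b", "c").flatMap(_.split(" "))

  def main(args: Array[String]): Unit = {
    println(oneLevelRemoved) // List(List(1, 2), List(3), List(4))
    println(fullyFlat)       // List(1, 2, 3, 4)
    println(present)         // List(1, 3)
    println(viaFlatMap)      // List(a, b, c)
  }
}
```

If deeper nesting has to be removed, flatten needs to be called once per level.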
Example (Spark):
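import org.apache.spark.sql.functions.{collect_list, udf}

// Flatten each group's list of lists and drop duplicate entries.
val flatten_distinct = udf(
  (xs: Seq[Seq[String]]) => xs.flatten.distinct)
// Alias the aggregate directly instead of renaming the auto-generated column name.
df.groupBy("id")
  .agg(flatten_distinct(collect_list("users")).as("users"))
  .select("users")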
Here I group the rows by user ID and flatten each group's nested list of users into a single de-duplicated list.
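The same group, flatten and de-duplicate logic can be tried without a Spark session. The sketch below uses plain Scala collections; the rows value and the user names are hypothetical stand-ins for the DataFrame's "id" and "users" columns, not data from the original job.

```scala
object GroupFlattenSketch {
  // Hypothetical rows: (id, users), mimicking the "id" and "users" columns.
  val rows: Seq[(Int, Seq[String])] = Seq(
    (1, Seq("alice", "bob")),
    (1, Seq("bob", "carol")),
    (2, Seq("dave"))
  )

  // Group by id, collect each group's lists (like collect_list),
  // then flatten and de-duplicate (like the flatten_distinct UDF).
  val usersById: Map[Int, Seq[String]] =
    rows.groupBy(_._1).map { case (id, group) =>
      id -> group.map(_._2).flatten.distinct
    }

  def main(args: Array[String]): Unit =
    usersById.toSeq.sortBy(_._1).foreach(println)
}
```

Note that distinct keeps the first occurrence of each element, so the order here is deterministic. In Spark, the order produced by collect_list may vary after a shuffle. On Spark 2.4 or later, the built-in flatten and array_distinct functions should also be able to replace the UDF.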
The zip method combines two collections element by element into a collection of pairs.
List('a,'b,'c).zip(List(1,2,3))
Result: List[(Symbol, Int)] = List(('a,1), ('b,2), ('c,3))
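It may help to see what zip does when the two collections differ in length, along with a few related methods. This is a small sketch with made-up values; it uses strings instead of symbol literals because the 'a syntax is deprecated in Scala 2.13 and dropped in Scala 3.

```scala
object ZipSketch {
  // zip stops at the end of the shorter collection; the extra element 4 is dropped.
  val truncated: List[(String, Int)] = List("a", "b", "c").zip(List(1, 2, 3, 4))

  // unzip is the inverse: it splits a list of pairs back into two lists.
  val (letters, numbers) = truncated.unzip

  // zipWithIndex pairs each element with its 0-based position.
  val indexed: List[(String, Int)] = List("a", "b", "c").zipWithIndex

  // zipAll pads the shorter side with the given defaults instead of truncating.
  val padded: List[(String, Int)] = List("a", "b").zipAll(List(1, 2, 3), "?", 0)

  def main(args: Array[String]): Unit = {
    println(truncated) // List((a,1), (b,2), (c,3))
    println(letters)   // List(a, b, c)
    println(numbers)   // List(1, 2, 3)
    println(indexed)   // List((a,0), (b,1), (c,2))
    println(padded)    // List((a,1), (b,2), (?,3))
  }
}
```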