Array(20) and Array.apply(null, {length: 20})


1. Array(20)


The result: an array of length 20 whose slots are all empty (holes).
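
A quick console sketch in place of the screenshot (hole notation as Chrome DevTools renders it):

var arr1 = Array(20)
console.log(arr1)        // (20) [empty × 20], i.e. twenty holes and no actual elements
console.log(arr1.length) // 20
console.log(arr1[0])     // undefined (reading a hole still yields undefined)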

2. Array.apply(null, { length: 20 })

The result: an array of length 20 whose elements are all undefined.
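
A matching console sketch:

var arr2 = Array.apply(null, { length: 20 })
console.log(arr2)        // (20) [undefined, undefined, ..., undefined]
console.log(arr2.length) // 20

This works because apply accepts any array-like object: { length: 20 } is spread into 20 arguments, each missing and therefore undefined, and Array called with 20 arguments builds a 20-element array out of them.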

3. Similarities and differences

3.1 Similarities
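
The comparisons below assume the two arrays defined above:

var arr1 = Array(20)
var arr2 = Array.apply(null, { length: 20 })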

console.log(arr1[0] == arr2[0])  // true
console.log(arr1[0] === arr2[0]) // true (a hole reads as undefined, so both sides are undefined)

3.2 Differences

An array created with Array(20) cannot be traversed by the array iteration methods
An array created with Array.apply(null, { length: 20 }) can, as the sketch below shows
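
A minimal sketch of the difference with map and forEach, using the same arr1 and arr2:

console.log(arr1.map(function (v, i) { return i })) // [empty × 20]: the callback never runs, holes stay holes
console.log(arr2.map(function (v, i) { return i })) // [0, 1, 2, ..., 19]

var visited = 0
arr1.forEach(function () { visited++ })
console.log(visited) // 0: forEach skips holes entirely

arr2.forEach(function () { visited++ })
console.log(visited) // 20: every undefined element is visited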

The hole-filled array is never actually traversed, while the undefined-filled one is. If all you do is compare individual elements, there is no visible difference; but as soon as the array iteration methods are involved, the behavior is completely different.

An empty slot is just a placeholder in the array: it pads the length out without the index ever being created, so traversal cannot visit it. undefined, by contrast, means each item has been initialized, only without an explicit value assigned, so traversal still gets to it.
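
You can confirm this with the in operator, which tests whether an index actually exists on the array:

console.log(0 in arr1) // false: index 0 was never created; it is a hole
console.log(0 in arr2) // true:  index 0 exists, its value is just undefined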

