Why is the test set said to give an unbiased estimate?
#UnbiasedEstmator #DeepLearning #Test-set
For deep learning applications, we split the dataset into training, validation, and test sets. The test set is used for an unbiased estimate of model skill (the generalization error).
A natural question then is whether or not these estimators are “good” in any sense. One measure of “good” is “unbiasedness.”
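As a reminder of the setup, here is a minimal sketch of a train / validation / test split. The helper name and the split fractions are illustrative assumptions, not from any particular library:

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle and split a dataset into train / validation / test parts.

    Illustrative helper: the test set is held out and only touched once,
    for the final estimate of model skill.
    """
    items = list(data)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```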
A case from time series forecasting with DL models, with MAE as the estimator
Assume the MAE (Mean Absolute Error) estimator is
$$
\begin{aligned}
\widehat{\theta} &= \mathbb{L}_{MAE}(X_1,X_2,\cdots,X_n) \\
&= \frac{1}{n}\sum^{n}_{i=1}{|X_i-\widehat{X_i}|}
\end{aligned}
$$
where $(X_1,X_2,\cdots,X_n)$ is the test set and $\widehat{X_i}$ is the model's prediction for $X_i$. If the expectation of the estimator satisfies:
$$
E(\widehat{\theta}) = \theta
$$
where $\theta$ is the true (population) MAE,
then we call $\widehat{\theta}$ an unbiased estimator of $\theta$.
To prove that the sample MAE is an unbiased estimator, assume the per-sample absolute errors are identically distributed with $E(|X_i-\widehat{X_i}|) = \theta$:
$$
\begin{aligned}
E(\widehat{\theta}) &= E\left(\frac{1}{n}\sum^{n}_{i=1}{|X_i-\widehat{X_i}|}\right) \\
&= \frac{1}{n}\sum^{n}_{i=1}{E(|X_i-\widehat{X_i}|)} \\
&= \frac{1}{n}\times n \times \theta \\
&= \theta
\end{aligned}
$$
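The argument above can be sanity-checked by simulation. The sketch below assumes the per-sample errors $X_i-\widehat{X_i}$ are standard Gaussian, so the true MAE is $\theta = \sqrt{2/\pi} \approx 0.798$; averaging the sample MAE over many independent test sets should recover $\theta$:

```python
import math
import random
import statistics

rng = random.Random(42)

def sample_mae(n):
    """Sample MAE on one test set of size n, assuming errors ~ N(0, 1)."""
    return statistics.fmean(abs(rng.gauss(0.0, 1.0)) for _ in range(n))

theta = math.sqrt(2 / math.pi)  # true MAE of N(0, 1) errors, ~0.798

# Each test-set MAE fluctuates, but its average over many independent
# test sets converges to theta -- the meaning of "unbiased".
mean_of_estimates = statistics.fmean(sample_mae(50) for _ in range(2000))
print(round(theta, 3), round(mean_of_estimates, 3))
```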
Cases with MSE
The sample MSE is also an unbiased estimator, this time of the true MSE $\theta = E\left((X_i-\widehat{X_i})^2\right)$.
$$
\begin{aligned}
\widehat{\theta} &= \mathbb{L}_{MSE}(X_1,X_2,\cdots,X_n) \\
&= \frac{1}{n}\sum^{n}_{i=1}{(X_i-\widehat{X_i})^2}
\end{aligned}
$$
$$
\begin{aligned}
E(\widehat{\theta}) &= E\left(\frac{1}{n}\sum^{n}_{i=1}(X_i-\widehat{X_i})^2\right) \\
&= \frac{1}{n}\sum^{n}_{i=1}E\left((X_i-\widehat{X_i})^2\right) \\
&= \frac{1}{n}\times n \times \theta \\
&= \theta
\end{aligned}
$$
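The same simulation check works for MSE. Assuming standard Gaussian errors, the true MSE is $\theta = E\left((X_i-\widehat{X_i})^2\right) = 1$, and averaging the sample MSE over many independent test sets should recover it:

```python
import random
import statistics

rng = random.Random(7)

def sample_mse(n):
    """Sample MSE on one test set of size n, assuming errors ~ N(0, 1)."""
    return statistics.fmean(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

# By linearity of expectation, the average of sample MSEs over many
# independent test sets approaches the true MSE, theta = 1.
mean_of_estimates = statistics.fmean(sample_mse(50) for _ in range(2000))
print(round(mean_of_estimates, 3))
```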
Conclusion
Saying that the test set is used for unbiased estimation is not a statement about the dataset itself, but about the estimator and the quantity it estimates, e.g., the sample mean and $\mu$, respectively.
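The textbook instance of that pairing can be sketched in a few lines: the sample mean is an unbiased estimator of the population mean $\mu$, under the assumption here of Gaussian data with a chosen $\mu = 3$:

```python
import random
import statistics

rng = random.Random(0)
mu = 3.0  # true population mean (chosen for this illustration)

# Each sample mean of 20 points varies, but averaging the estimator over
# many independent samples converges to mu -- i.e., it is unbiased.
estimates = [statistics.fmean(rng.gauss(mu, 2.0) for _ in range(20))
             for _ in range(5000)]
print(round(statistics.fmean(estimates), 2))
```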

Summary: this article discusses why, in deep learning, the test set is used for unbiased estimation, explained through the unbiasedness of MAE and MSE as examples. It emphasizes what an unbiased estimator means, e.g., that the expectation of the sample MAE equals the true MAE, and clarifies the role of the test set in estimating model performance.