Point Estimation


Assignment 2 (Deadline: 2024/10/20)
 
1. Point Estimation (15 pts) 
The Poisson distribution is a useful discrete distribution which can be used to model the
number of occurrences of something per unit time. For example, in networking, packet arrival
density is often modeled with the Poisson distribution. If $X$ is Poisson distributed, i.e., its probability mass function takes the following form:
$$P(X \mid \lambda) = \frac{\lambda^{X} e^{-\lambda}}{X!}$$
 
2. Source of Error: Part 1 (15 pts) 
Suppose that we are given an independent and identically distributed sample of $n$ points $\{x_i\}$
where each point $x_i \sim N(\mu, 1)$ is distributed according to a normal distribution with mean $\mu$
and variance 1. You are going to analyze different estimators of the mean $\mu$.
(a) Suppose that we use the estimator $\hat{\mu} = 1$ for the mean of the sample, ignoring the
observed data when making our estimate. Give the bias and variance of this estimator $\hat{\mu}$.
Explain in a sentence whether this is a good estimator in general, and give an example of
when this is a good estimator.
(b) Now suppose that we use $\hat{\mu} = x_1$ as an estimator of the mean. That is, we use the first
data point in our sample to estimate the mean of the sample. Give the bias and variance
of this estimator $\hat{\mu}$. Explain in a sentence or two whether this is a good estimator or not.
(c) In the class you have seen the relationship between the MLE estimator and the least
squares problem. Sometimes it is useful to use the following estimate
$$\hat{\mu} = \arg\min_{\mu} \left[ \sum_{i=1}^{n} (x_i - \mu)^2 + \lambda \mu^2 \right]$$
for the mean, where the parameter $\lambda > 0$ is a known number. The estimator $\hat{\mu}$ is biased,
but has lower variance than the sample mean $\bar{x} = n^{-1} \sum_i x_i$, which is an unbiased
estimator for $\mu$. Give the bias and variance of the estimator $\hat{\mu}$.
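
One way to sanity-check an analytical bias and variance is a small Monte Carlo simulation. The sketch below is not part of the assignment: the true mean `mu=3.0`, the sample size `n=10`, the number of trials, and the helper name `simulate` are arbitrary assumptions chosen for illustration. It draws many samples from a normal distribution with variance 1, applies an estimator to each sample, and reports the empirical bias and variance of the resulting estimates.

```python
import random
import statistics


def simulate(estimator, mu, n, trials=20000, seed=0):
    """Return the empirical (bias, variance) of `estimator` over repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # One i.i.d. sample of n points from N(mu, 1)
        sample = [rng.gauss(mu, 1.0) for _ in range(n)]
        estimates.append(estimator(sample))
    bias = statistics.fmean(estimates) - mu
    variance = statistics.pvariance(estimates)
    return bias, variance


# Estimators from parts (a) and (b), plus the sample mean for comparison
estimators = {
    "constant 1": lambda xs: 1.0,
    "first point": lambda xs: xs[0],
    "sample mean": lambda xs: statistics.fmean(xs),
}

for name, estimator in estimators.items():
    bias, variance = simulate(estimator, mu=3.0, n=10)
    print(f"{name:12s} bias={bias:+.3f} variance={variance:.3f}")
```

Changing `mu` shows how the bias of an estimator that ignores the data depends on the unknown true mean, while changing `n` shows which estimators actually benefit from more data.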
 
3. Source of Error: Part 2 (15 pts) 
In class we discussed the fact that machine learning algorithms for function approximation 
are also a kind of estimator (of the unknown target function), and that errors in function 
approximation arise from three sources: bias, variance, and unavoidable error. In this part of 
the question you are going to analyze error when training Bayesian classifiers.
Suppose that $Y$ is boolean, $X$ is real valued, $P(Y = 1) = 1/2$ and that the class conditional
distributions $P(X \mid Y)$ are uniform distributions with $P(X \mid Y = 1) = \mathrm{uniform}[1, 4]$ and
$P(X \mid Y = 0) = \mathrm{uniform}[-4, -1]$. (We use $\mathrm{uniform}[a, b]$ to denote a uniform probability
distribution between $a$ and $b$, with zero probability outside the interval $[a, b]$.)
(a) Plot the two class conditional probability distributions $P(X \mid Y = 0)$ and $P(X \mid Y = 1)$.
(b) What is the error of the optimal classifier? Note that the optimal classifier knows
$P(Y = 1)$, $P(X \mid Y = 0)$ and $P(X \mid Y = 1)$ perfectly, and applies Bayes rule to classify new
examples. Recall that the error of a classifier is the probability that it will misclassify a new
$x$ drawn at random from $P(X)$. The error of this optimal Bayes classifier is the unavoidable
error for this learning task.
(c) Suppose instead that $P(Y = 1) = 1/2$ and that the class conditional distributions are
uniform distributions with $P(X \mid Y = 1) = \mathrm{uniform}[0, 4]$ and
$P(X \mid Y = 0) = \mathrm{uniform}[-3, 1]$. What is the unavoidable error in this case? Justify your answer.
(d) Consider again the learning task from part (a) above. Suppose we train a Gaussian Naive 
Bayes (GNB) classifier using $n$ training examples for this task, where $n \to \infty$. Of course our
classifier will now (incorrectly) model $P(X \mid Y)$ as a Gaussian distribution, so it will be
biased: it cannot even represent the correct form of $P(X \mid Y)$ or $P(Y \mid X)$.
Draw again the plot you created in part (a), and add to it a sketch of the learned/estimated
class conditional probability distributions the classifier will derive from the infinite training
data. Write down an expression for the error of the GNB. (Hint: your expression will
involve integrals; please don't bother solving them.)
(e) So far we have assumed infinite training data, so the only two sources of error are bias
and unavoidable error. Explain in one sentence how your answer to part (d) above would
change if the number of training examples were finite. Will the error increase or decrease?
Which of the three possible sources of error would be present in this situation? 
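
The unavoidable error can also be approximated numerically. For two classes, the error of the optimal Bayes classifier equals the integral over x of the smaller of the two joint densities, P(Y=0)p(x|Y=0) and P(Y=1)p(x|Y=1). The sketch below evaluates that integral with a midpoint rule. The helper names `uniform_pdf` and `bayes_error` are hypothetical, and the intervals uniform[0, 2] and uniform[1, 3] are illustrative values chosen so that they do not coincide with the intervals in the questions above.

```python
def uniform_pdf(a, b):
    """Density of a uniform distribution on [a, b], zero outside the interval."""
    def pdf(x):
        return 1.0 / (b - a) if a <= x <= b else 0.0
    return pdf


def bayes_error(pdf0, pdf1, prior1, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the two-class Bayes error on [lo, hi].

    At each x the optimal classifier picks the class with the larger joint
    density, so the probability mass it gets wrong is the smaller one.
    """
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += min((1.0 - prior1) * pdf0(x), prior1 * pdf1(x))
    return total * dx


# Illustrative class conditionals (assumed values, not those of the assignment)
error = bayes_error(uniform_pdf(0.0, 2.0), uniform_pdf(1.0, 3.0),
                    prior1=0.5, lo=-1.0, hi=4.0)
print(f"approximate Bayes error: {error:.4f}")
```

Swapping in the intervals from parts (b) and (c) gives a numerical check on a hand-derived answer; the integration range only needs to cover the support of both class conditionals.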
 
4. Gaussian (Naïve) Bayes and Logistic Regression (15 pts) 
Recall that a generative classifier estimates $P(\mathbf{X}, Y) = P(Y) P(\mathbf{X} \mid Y)$, while a discriminative
classifier directly estimates $P(Y \mid \mathbf{X})$. (Note that certain discriminative classifiers are
non-probabilistic: they directly estimate a function $f : \mathbf{X} \to Y$ instead of $P(Y \mid \mathbf{X})$.) For clarity, we
highlight $\mathbf{X}$ in bold to emphasize that it usually represents a vector of multiple attributes, i.e.,
$\mathbf{X} = \{X_1, X_2, \dots, X_n\}$. However, this question does not require students to derive the answer
in vector/matrix notation.
In class we have observed an interesting relationship between a discriminative classifier 
(logistic regression) and a generative classifier (Gaussian naive Bayes): the form of
$P(Y \mid \mathbf{X})$ derived from the assumptions of a specific class of Gaussian naive Bayes classifiers is
precisely the form used by logistic regression. The derivation can be found in the required
reading: http://www.cs.cmu.edu/~tom/mlbook/NBayesLogReg.pdf. We made the following
assumptions for Gaussian naive Bayes classifiers to model $P(\mathbf{X}, Y) = P(Y) P(\mathbf{X} \mid Y)$:
(1) $Y$ is a boolean variable following a Bernoulli distribution, with parameter $\pi = P(Y = 1)$
and thus $P(Y = 0) = 1 - \pi$.
(2) $\mathbf{X} = \{X_1, X_2, \dots, X_n\}$, where each attribute $X_i$ is a continuous random variable. For each
$X_i$, $P(X_i \mid Y = k)$ is a Gaussian distribution $N(\mu_{ik}, \sigma_i)$. Note that $\sigma_i$ is the standard
deviation of the Gaussian distribution (and thus $\sigma_i^2$ is the variance), which does not
depend on $k$.
(3) For all $i \neq j$, $X_i$ and $X_j$ are conditionally independent given $Y$. This is why this type of
classifier is called “naive”.
We say this is a specific class of Gaussian naive Bayes classifiers because we have made an
assumption that the standard deviation $\sigma_i$ of $P(X_i \mid Y = k)$ does not depend on the value $k$ of
$Y$. This is not a general assumption for Gaussian naive Bayes classifiers.
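
For reference, the logistic form mentioned above can be written out explicitly. The following is a summary sketch of the result under assumptions (1) to (3), not a derivation; the weight names $w_0$ and $w_i$ are notation introduced here, and the expressions follow from substituting the Gaussian densities into Bayes rule with a class-independent standard deviation $\sigma_i$.

```latex
P(Y = 1 \mid \mathbf{X}) = \frac{1}{1 + \exp\left(w_0 + \sum_{i=1}^{n} w_i X_i\right)},
\qquad
w_i = \frac{\mu_{i0} - \mu_{i1}}{\sigma_i^2},
\qquad
w_0 = \ln\frac{1 - \pi}{\pi} + \sum_{i=1}^{n} \frac{\mu_{i1}^2 - \mu_{i0}^2}{2\sigma_i^2}
```

Here the exponent is linear in the attributes $X_i$, which is what makes it match the form used by logistic regression.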
Let's make our Gaussian naive Bayes classifiers a little more general by removing the 
assumption that the standard deviation $\sigma_i$ of $P(X_i \mid Y = k)$ does not depend on $k$. As a result,
for each $X_i$, $P(X_i \mid Y = k)$ is a Gaussian distribution $N(\mu_{ik}, \sigma_{ik})$, where $i = 1, 2, \dots, n$ and
$k = 0, 1$. Note that now the standard deviation $\sigma_{ik}$ of $P(X_i \mid Y = k)$ depends on both the attribute
index $i$ and the value $k$ of $Y$.
Question: is the new form of $P(Y \mid \mathbf{X})$ implied by this more general Gaussian naive Bayes
classifier still the form used by logistic regression? Derive the new form of $P(Y \mid \mathbf{X})$ to prove
your answer. 
 
5. Programming (40 pts) 
In this lab, please submit your code according to the following guidelines: 
(a) Cross-Validation: https://qffc.uic.edu.cn/home/content/index/pid/276/cid/6530.html 
Please try these three approaches (holdout, K-fold and leave-p-out) with the data file
2.1-Exercise.csv.
Submit ‘Exercise-handout.py’, ‘Exercise-k-fold.py’, and ‘Exercise-leave-p-out.py’ 
(b) Linear regression: https://qffc.uic.edu.cn/home/content/index/pid/276/cid/6541.html 
Please modify linear_regression_lobf.py to use the data file 2.2-Exercise.csv. For this task,
use the High column as the input variable and the Target column as the value to predict.
Submit ‘Exercise-linear_regression_lobf.py’ 
(c) Naïve Bayes: https://qffc.uic.edu.cn/home/content/index/pid/276/cid/6557.html 
Here the dataset ‘basketball.csv’ describes basketball games and weather conditions,
and the target is whether a basketball game is played under the given conditions. The
dataset is very small, containing just 14 rows and 5 columns.
Submit ‘Exercise-NB.py’ 
(d) Logistic regression: https://qffc.uic.edu.cn/home/content/index/pid/276/cid/6556.html 
Use the breast cancer dataset from sklearn, loaded with the following code:
from sklearn.datasets import load_breast_cancer
Submit ‘Exercise-Logistic-Regression.py’ 
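
For part (a), the three splitting strategies differ only in how sample indices are divided into training and test sets. The following is a minimal, dependency-free sketch of that index logic. The function names are hypothetical, and it does not read 2.1-Exercise.csv, whose columns are not described here. scikit-learn provides equivalents (`train_test_split`, `KFold`, `LeavePOut`) in `sklearn.model_selection`.

```python
import random
from itertools import combinations


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle the indices once and hold out a fixed fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k):
    """Yield k (train, test) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_p_out_splits(n_samples, p):
    """Yield every (train, test) pair whose test set has exactly p samples."""
    for test in combinations(range(n_samples), p):
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, list(test)


train, test = holdout_split(10)
print("holdout:", len(train), "train /", len(test), "test")
print("3-fold test sizes:", [len(t) for _, t in k_fold_splits(10, 3)])
print("leave-2-out splits for 5 samples:", sum(1 for _ in leave_p_out_splits(5, 2)))
```

Leave-p-out enumerates every subset of size p, so the number of splits grows combinatorially with the dataset size; it is usually practical only for small datasets or small p.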

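Pose estimation is an important research direction in computer vision. Its main goal is to detect and estimate the pose of human bodies, hands, or objects from images or video, in 2D or in 3D. Depending on the application and the input data, pose estimation can be divided into human pose estimation, hand pose estimation, and object pose estimation. Some common techniques and algorithms are described below.

### 2D Pose Estimation

2D pose estimation focuses on estimating keypoint locations in the 2D image plane from a monocular image. The most common approach is heatmap-based: each keypoint has its own heatmap that represents its likely position in the image. For example, Stacked Hourglass Networks stack multiple hourglass modules to progressively refine keypoint locations and improve accuracy[^4].

Another popular approach is regression-based, which regresses keypoint coordinates directly from image features. These methods typically rely on a convolutional neural network (CNN) to extract high-level features and output the keypoint coordinates through fully connected layers or other structures. Because they are simple and efficient, they are popular in real-time applications.

### 3D Pose Estimation

3D pose estimation aims to recover the pose of a target in 3D space from a monocular image or from multi-view images. For 3D human pose estimation, common methods are either model-based or learning-based.

- **Model-based methods**: These methods usually assume a known 3D human body model and align it with the 2D observations through an optimization process. For example, SMPL (Skinned Multi-Person Linear) is a widely used parametric body model that describes complex variations in body shape and pose with a small number of parameters.
- **Learning-based methods**: With the development of deep learning, learning-based 3D pose estimation has made significant progress in recent years. These methods predict 3D keypoint coordinates directly from the image, or use intermediate representations (such as 2D keypoints or depth maps) to assist 3D estimation. For example, some work uses graph convolutional networks (GCNs) to model the spatial relationships between keypoints and improve 3D accuracy.

### Hand Pose Estimation

Hand pose estimation is a special case of pose estimation because hands are highly flexible and have complex motion patterns. Depending on the input data, methods are based either on RGB images or on RGB-D images.

- **RGB-based methods**: These methods usually use a multi-stage deep network that first detects the hand region and then estimates the 2D or 3D positions of the hand keypoints. For example, one study proposes a five-layer ensemble convolutional neural network that decomposes hand pose estimation into five single-finger subtasks and then fuses their results to estimate the full 3D hand pose[^3].
- **RGB-D-based methods**: These methods use depth information to improve accuracy. For example, volumetric methods convert a single depth map into voxels, extract multi-scale 3D features with 3D convolution, 3D pooling, and 3D deconvolution, and finally regress the spatial coordinates of the hand joints[^3].

### Object Pose Estimation

Object pose estimation is mainly used in scenarios such as robotic grasping and augmented reality, and usually requires estimating an object's position and orientation in 3D space (its 6D pose). Common approaches include point pair feature methods, template matching methods, learning-based methods, and methods based on 3D local features. According to the cited study, point pair feature methods performed best among these approaches[^2].

Some methods also combine deep learning with traditional geometry, for example by using a neural network to predict an object's keypoints or surface normals and then solving for the 6D pose with the PnP (Perspective-n-Point) algorithm.

### Lightweight Pose Estimation Models

In practice, especially on mobile devices and embedded systems, the computational cost and memory footprint of a pose estimation model are important considerations. For this reason, researchers have proposed a variety of lightweight pose estimation models that pair lightweight backbones such as MobileNet or ShuffleNet with efficient feature extraction and keypoint prediction modules, achieving fast inference while keeping accuracy relatively high. One study reports that algorithmic optimizations to a classic lightweight model reach 12 ms single-threaded inference on a CPU while exceeding the accuracy of traditional heatmap methods[^4].

### Example Code

The following is a simple PyTorch example of a 2D pose estimation model that uses the heatmap-based approach:

```python
import torch
import torch.nn as nn
import torchvision.models as models


class SimplePoseNet(nn.Module):
    def __init__(self, num_keypoints):
        super().__init__()
        # ImageNet-pretrained ResNet-18; `weights=` replaces the deprecated
        # `pretrained=True` (requires torchvision 0.13 or newer)
        resnet = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        # Drop the final average-pooling and fully connected layers,
        # keeping the 512-channel feature map
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        # Three transposed convolutions, each doubling the spatial resolution
        self.deconv_layers = nn.Sequential(
            nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
        )
        # 1x1 convolution producing one heatmap per keypoint
        self.final_layer = nn.Conv2d(64, num_keypoints, kernel_size=1)

    def forward(self, x):
        x = self.backbone(x)
        x = self.deconv_layers(x)
        x = self.final_layer(x)
        return x


# Create a model instance
model = SimplePoseNet(num_keypoints=17)
input_tensor = torch.randn(1, 3, 256, 256)  # one 256x256 RGB image
output = model(input_tensor)
print(output.shape)  # torch.Size([1, 17, 64, 64])
```

This code defines a lightweight pose estimation network based on ResNet-18. It uses transposed-convolution layers to upsample the feature map and a final 1x1 convolution to produce the keypoint heatmaps.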
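
Heatmaps like those produced by the network above still have to be converted into keypoint coordinates. A common, simple way is to take the location of the maximum in each heatmap and scale it back to input-image pixels. The sketch below illustrates this with NumPy. The function name `decode_heatmaps` is hypothetical, and `stride=4` is an assumption that matches a 256x256 input and 64x64 heatmaps. It ignores sub-pixel refinement, so the coordinates are only accurate to within one stride.

```python
import numpy as np


def decode_heatmaps(heatmaps, stride=4):
    """Convert (K, H, W) heatmaps into a (K, 3) array of (x, y, score).

    x and y are expressed in input-image pixels, assuming each heatmap
    cell covers `stride` x `stride` input pixels.
    """
    num_keypoints, height, width = heatmaps.shape
    flat = heatmaps.reshape(num_keypoints, -1)
    flat_idx = flat.argmax(axis=1)                 # peak position per keypoint
    ys, xs = np.unravel_index(flat_idx, (height, width))
    scores = flat[np.arange(num_keypoints), flat_idx]
    return np.stack([xs * stride, ys * stride, scores], axis=1).astype(np.float32)


# Two synthetic heatmaps with known peaks, for illustration only
demo = np.zeros((2, 64, 64), dtype=np.float32)
demo[0, 10, 20] = 1.0   # row (y) = 10, column (x) = 20
demo[1, 5, 7] = 0.5
keypoints = decode_heatmaps(demo)
print(keypoints)
```

With a PyTorch model, a single image's output can be passed in as `output[0].detach().cpu().numpy()`.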