What is an estimator?

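This article introduces the concept of parameter estimation in statistics, including estimators and their properties, and uses an example to explain how the sample mean is used to estimate the population mean.
  • An “estimator” is a statistic (that is, a function of the data) that serves as a rule for inferring the value of an unknown parameter in a statistical model; the value it produces for a given sample is a “point estimate”.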

  • Any statistic whose values are used to estimate θ is defined to be an estimator of θ. If a parameter θ is estimated by an estimator, we usually write it as θ̂, where the hat indicates that we are dealing with an estimator of θ.

  • Estimators are random variables with a fixed component (the mean) and a random component (the disturbance).

  • The formula or rule used to calculate a characteristic (such as the mean or variance) from a sample is called an estimator; the resulting value is called an estimate.

  • An estimator is a statistic that estimates some fact about the population. You can also think of an estimator as the rule that creates an estimate.

Example: the sample mean (x̄) is an estimator for the population mean, μ:

x̄ = (x₁ + x₂ + … + xₙ) / n
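To make the difference between the rule and its value concrete, here is a minimal Python sketch. The function name `sample_mean` and the two small samples are made up for illustration; they do not come from any real data set.

```python
from typing import Sequence


def sample_mean(data: Sequence[float]) -> float:
    """Estimator: a rule that maps any sample to a single number."""
    return sum(data) / len(data)


# Two hypothetical samples drawn from the same (unknown) population.
sample_a = [52.0, 58.0, 61.0, 55.0]
sample_b = [49.0, 60.0, 57.0, 54.0]

# Estimates: the values the rule produces for these particular samples.
estimate_a = sample_mean(sample_a)
estimate_b = sample_mean(sample_b)

print(estimate_a, estimate_b)
```

The same estimator gives a different estimate for each sample, which is why the estimator itself is treated as a random variable.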

  • The quantity that is being estimated (i.e. the one you want to know) is called the estimand (in a regression setting, you can think of it as the dependent variable). For example, let’s say you wanted to know the average weight of students in a university with a population of 1000 students. You take a sample of 30 students and find that the mean weight is 56 kg. The sample mean is your estimator, and 56 kg is the estimate it produces. You can use this value as an estimate of the population mean (your estimand), namely, about 56 kg. Sometimes the words “estimator” and “estimate” are used interchangeably.
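The student-weight example can be simulated as a rough sketch. Everything numeric here is an assumption made for illustration: the population is synthetic (1000 weights drawn from a normal distribution centred on 56 kg with a spread of 8 kg), and the number of repeated samples (2000) is arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic population: 1000 student weights in kg (assumed, not real data).
population = [random.gauss(56.0, 8.0) for _ in range(1000)]
mu = statistics.mean(population)  # the estimand; unknown in practice

# One sample of 30 students gives one estimate of the population mean.
sample = random.sample(population, 30)
estimate = statistics.mean(sample)

# Repeating the sampling shows how the estimates vary from sample to sample.
estimates = [
    statistics.mean(random.sample(population, 30)) for _ in range(2000)
]
center = statistics.mean(estimates)   # average of the estimates
spread = statistics.stdev(estimates)  # sample-to-sample variability

print(f"population mean (estimand): {mu:.2f} kg")
print(f"estimate from one sample:   {estimate:.2f} kg")
print(f"average of 2000 estimates:  {center:.2f} kg")
print(f"std. dev. of the estimates: {spread:.2f} kg")
```

In this sketch a single estimate can miss the population mean by a kilogram or two, while the average over many samples sits close to it. That is the fixed component (mean) plus random component (disturbance) view of an estimator.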

It is worth noting that “regressor” refers to the input (right-hand-side) variable in a regression; other names for it include explanatory variable and independent variable.
