Purpose:
Why: what the tool can do
What it is: Monte Carlo simulation. A computer simulation that generates a large number of simulated samples of data based on an assumed Data Generating Process (DGP) that characterizes the population from which the simulated samples are drawn.
- Patterns in those simulated samples are then summarized and described.
- Such patterns can be evaluated in terms of substantive theory or in terms of the statistical properties of some estimator.
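As a rough illustration of evaluating the statistical properties of an estimator, the sketch below compares the sample mean and the sample median as estimators of the center of a distribution. The DGP (standard normal), the sample size, and the number of simulations are assumptions chosen only for the example.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
sims <- 2000                       # number of simulated samples (assumed)
n <- 50                            # observations per sample (assumed)

means <- numeric(sims)             # storage for the sample means
medians <- numeric(sims)           # storage for the sample medians

for (i in 1:sims) {
  y <- rnorm(n, mean = 0, sd = 1)  # assumed DGP: Y ~ N(0, 1)
  means[i] <- mean(y)
  medians[i] <- median(y)
}

# Summarize the patterns across simulated samples
c(mean_of_means = mean(means), mean_of_medians = mean(medians))
c(sd_of_means = sd(means), sd_of_medians = sd(medians))
```

Under this assumed DGP both estimators center on the true value of 0, while the spread of the medians across samples is larger than that of the means, which is the kind of pattern a Monte Carlo study is meant to reveal.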
What it is: Data Generating Process (DGP)
- A DGP describes how the values of a variable of interest are produced in the population.
- Most DGPs of interest include a systematic component and a stochastic component.
- We use statistical analysis to infer characteristics of the DGP by analyzing observable data sampled from the population.
- In applied statistical work, we never know the DGP; if we did, we wouldn’t need statistical estimates of it.
- In Monte Carlo simulations, we do know the DGP because we create it.
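To make the two components concrete, here is a minimal sketch of a linear DGP with a systematic part and a stochastic part. The coefficient values 0.2 and 0.5 mirror the ones used in the OLS script in these notes; the sample size and the variable names `systematic` and `stochastic` are assumptions made for illustration.

```r
set.seed(1)                       # arbitrary seed
n <- 200                          # sample size (assumed)
x <- runif(n, -1, 1)              # an observed covariate

systematic <- 0.2 + 0.5 * x       # systematic component: fixed function of x
stochastic <- rnorm(n, 0, 1)      # stochastic component: N(0, 1) noise
y <- systematic + stochastic      # one sample drawn from the DGP

fit <- lm(y ~ x)                  # infer the DGP's parameters from the sample
coef(fit)                         # estimates of the intercept and slope
```

Because the DGP was written down by hand, the estimates can be compared directly with the true values, which is not possible with real data.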
What it is: Resampling
- Like Monte Carlo simulations, resampling methods use a computer to generate a large number of simulated samples of data.
- Also like Monte Carlo simulations, patterns in these simulated samples are then summarized, and the results are used to evaluate substantive theory or statistical estimators.
- What is different is that the simulated samples are generated by drawing new samples (with replacement) from the sample of data you have.
- In resampling methods, the researcher DOES NOT know or control the DGP, but the goal of learning about the DGP remains the same.
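One common resampling method that fits this description is the nonparametric bootstrap. The sketch below resamples rows of a data set with replacement and re-estimates an OLS slope each time. The data frame `dat` is generated here only so the example runs on its own; in real use it would be the observed sample, whose DGP is unknown. The number of replications and all variable names are assumptions.

```r
set.seed(2024)                          # arbitrary seed
n <- 100
x <- runif(n, -1, 1)
y <- 0.2 + 0.5 * x + rnorm(n)           # stand-in for observed data (assumption)
dat <- data.frame(x = x, y = y)

reps <- 1000                            # number of bootstrap samples (assumed)
boot.slopes <- numeric(reps)            # storage for the re-estimated slopes

for (i in 1:reps) {
  idx <- sample(n, size = n, replace = TRUE)        # draw row indices with replacement
  boot.slopes[i] <- coef(lm(y ~ x, data = dat[idx, ]))[2]
}

sd(boot.slopes)                         # bootstrap standard error of the slope
quantile(boot.slopes, c(0.025, 0.975))  # percentile interval for the slope
```

The loop has the same shape as a Monte Carlo loop; the only change is that each simulated sample comes from the data in hand instead of from a known DGP.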
Monte Carlo Simulation of OLS
Know Your Assumptions:
set.seed(123456)            # Set the seed for reproducible results
sims <- 500                 # Set the number of simulations at the top of the script
alpha.1 <- numeric(sims)    # Empty vector for storing the simulated intercepts
B.1 <- numeric(sims)        # Empty vector for storing the simulated slopes
a <- .2                     # True value for the intercept
b <- .5                     # True value for the slope
n <- 1000                   # Sample size
X <- runif(n, -1, 1)        # Create a sample of n observations on the variable X.
                            # Note that this variable is outside the loop, because X
                            # should be fixed in repeated samples.
for (i in 1:sims) {         # Start the loop
  Y <- a + b * X + rnorm(n, 0, 1)  # The true DGP, with N(0, 1) error
  model <- lm(Y ~ X)               # Estimate OLS model
  alpha.1[i] <- coef(model)[1]     # Put the estimate for the intercept
                                   # in the vector alpha.1
  B.1[i] <- coef(model)[2]         # Put the estimate for X in the vector B.1
}                           # End loop
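The script stops once the estimates are stored. A possible next step, sketched here under the same settings, is to summarize them against the true values. This version reruns the simulation with `replicate()` so that it is self-contained; the matrix name `est` and the column labels are assumptions, not part of the original script.

```r
set.seed(123456)
sims <- 500
a <- 0.2; b <- 0.5; n <- 1000         # same true values and sample size as the script
X <- runif(n, -1, 1)                  # X fixed across repeated samples

# Each replication draws a new Y from the DGP and returns the two OLS coefficients
est <- t(replicate(sims, {
  Y <- a + b * X + rnorm(n, 0, 1)
  coef(lm(Y ~ X))
}))
colnames(est) <- c("intercept", "slope")

colMeans(est)                         # average estimates across simulations
colMeans(est) - c(a, b)               # simulated bias relative to the true values
apply(est, 2, sd)                     # simulated standard errors
```

If the OLS assumptions hold in the DGP, as they do here by construction, the average estimates should sit close to 0.2 and 0.5.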
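This article covers the basic principles and applications of Monte Carlo simulation and resampling methods. By simulating a data generating process (DGP), it describes how to use a computer to generate a large number of simulated samples and how to use those samples to evaluate substantive theory or the properties of statistical estimators. It also contrasts Monte Carlo simulation with resampling methods and their use cases.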