BIOS14: Simple linear regression using R

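This article introduces simple linear regression analysis in R, covering linear relationships, least squares, goodness of fit, the standard error of the estimate, and significance testing. It also discusses Pearson and Spearman correlation and when each applies, and uses examples to show how to build a linear regression model, test hypotheses, and analyze residuals.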


NOTES

1 The relationship between variables

Functional relationship: one variable is completely determined by the other and can be written as $y=f(x)$.
Correlation: the two variables are related, but the relationship between them is not exactly determined.

In most cases the relationship between two variables cannot be described by a function, but we can still analyze how strongly they are correlated.

  1. Scatter plot: visual inspection.

  2. Correlation coefficient: computed from the data, it measures the degree of linear correlation between the variables.
    $$r= \frac{n\sum{xy}-\sum{x}\sum{y}}{\sqrt{n\sum{x^2}-(\sum{x})^2}\cdot\sqrt{n\sum{y^2}-(\sum{y})^2}}$$
    Here, $r$ is defined as the linear correlation coefficient, or Pearson’s correlation coefficient.
    $\vert r \vert \geq 0.8$: high correlation
    $0.5\leq \vert r \vert \lt 0.8$: moderate correlation
    $0.3\leq \vert r \vert \lt 0.5$: low correlation
    $\vert r \vert \lt 0.3$: no correlation

  • Pearson’s correlation coefficient assumes that the data are approximately normally distributed. If they are not, use Spearman’s correlation coefficient instead.
  3. Significance test of $r$
    Population correlation coefficient: $\rho$
    Sample correlation coefficient: $r$
    Hypotheses:
    $H_0:\rho=0$; $H_1:\rho\ne0$
    Test statistic:
    $$t= \vert r \vert \sqrt{\frac{n-2}{1-r^2}}\sim t(n-2)$$
    Decision:
    If $\vert t \vert>t_{\alpha/2}(n-2)$, reject $H_0$: there is a significant linear relationship between the two variables in the population.
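
The correlation coefficient and its significance test described above can be sketched in R as follows. This is a minimal illustration, not the author's code: the data are simulated, and the seed, sample size, and coefficients are assumptions chosen only to produce a usable example. It computes $r$ from the sum formula, compares it with the built-in `cor()`, and runs the t-test both by hand and with `cor.test()`.

```r
# Simulated data for illustration only (not from the notes)
set.seed(1)
n <- 30
x <- rnorm(n, mean = 10, sd = 2)
y <- 3 + 0.5 * x + rnorm(n, sd = 1)

# Pearson's r from the sum formula
r_manual <- (n * sum(x * y) - sum(x) * sum(y)) /
  (sqrt(n * sum(x^2) - sum(x)^2) * sqrt(n * sum(y^2) - sum(y)^2))

# Built-in equivalents
r_pearson  <- cor(x, y)                       # Pearson is the default method
r_spearman <- cor(x, y, method = "spearman")  # rank-based alternative

# Significance test of r: t = |r| * sqrt((n - 2) / (1 - r^2)), df = n - 2
alpha  <- 0.05
t_stat <- abs(r_manual) * sqrt((n - 2) / (1 - r_manual^2))
t_crit <- qt(1 - alpha / 2, df = n - 2)
reject <- t_stat > t_crit   # TRUE means H0: rho = 0 is rejected

# cor.test() reports the signed t statistic and a two-sided p-value
ct <- cor.test(x, y)

print(c(manual = r_manual, cor = r_pearson, spearman = r_spearman))
print(c(t_manual = t_stat, t_cor_test = unname(ct$statistic), t_crit = t_crit))
print(reject)
```

Comparing `t_stat` with `t_crit` is equivalent to comparing the p-value from `cor.test()` with `alpha`, so either route leads to the same decision.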

2 Simple linear regression

2.1 Regression model

Population regression equation:
$$y_i=\beta_0+\beta_1x_i+\varepsilon_i$$
Estimated regression equation:
$$\hat y_i=\hat\beta_0+\hat\beta_1x_i$$
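
As a minimal sketch of how $\hat\beta_0$ and $\hat\beta_1$ might be obtained in R, the code below fits the model by ordinary least squares, which is the estimator behind `lm()`. The closed-form expressions are the usual textbook ones rather than something stated in these notes, and the data, including the "true" coefficients 2 and 1.5, are simulated assumptions.

```r
# Simulated data for illustration only; beta0 = 2 and beta1 = 1.5 are assumptions
set.seed(42)
x <- seq(1, 20, length.out = 40)
y <- 2 + 1.5 * x + rnorm(length(x), sd = 2)

# Least-squares estimates in closed form
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Same fit with lm(); coef() returns the intercept first, then the slope
fit <- lm(y ~ x)
print(coef(fit))
print(c(b0 = b0, b1 = b1))

# Fitted values from the estimated regression equation: y_hat_i = b0 + b1 * x_i
y_hat <- b0 + b1 * x
```

The two sets of estimates should agree up to floating-point error, and `y_hat` matches `fitted(fit)`.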
Assumptions:
A1: The relationship between the dependent variable and the independent variable is linear.
A2: The variable $x$ is not random and must take at least two different values.
A3: $E(\varepsilon)=0$
A4: The variance is constant around the regression line and does not depend on $x$: $\mathrm{var}(y)=\sigma^2$
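
These assumptions can be checked informally from the residuals of a fitted model. The sketch below uses base R only and simulated data (an assumption made for illustration). The comparison of residual variance below and above the median of `x` is a crude, hypothetical check, not a formal test of A4.

```r
# Simulated data for illustration only
set.seed(7)
x <- runif(50, min = 0, max = 10)   # many distinct x values, so A2 holds
y <- 1 + 2 * x + rnorm(50, sd = 1.5)

fit <- lm(y ~ x)
res <- residuals(fit)

# A3: with an intercept, least-squares residuals average to zero by construction,
# so this confirms the fit rather than E(epsilon) = 0 itself
mean_res <- mean(res)

# A1 and A4: residuals vs fitted values should show no curve and no funnel shape
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Crude check of A4: ratio of residual variances for high x vs low x
low <- x <= median(x)
var_ratio <- var(res[!low]) / var(res[low])
print(c(mean_res = mean_res, var_ratio = var_ratio))
```

A `var_ratio` far from 1, or a visible pattern in the plot, would suggest that A1 or A4 deserves a closer look.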
