BIOS14: Simple linear regression using R

Article link: https://blog.youkuaiyun.com/qq_43063824/article/details/103409067

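This article introduces simple linear regression analysis in R, covering linear relationships, least squares, goodness of fit, the standard error of the estimate, and significance tests. It also discusses Pearson and Spearman correlation and when each applies, and works through examples of building a linear regression model, hypothesis testing, and residual analysis.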

NOTES

1 The relationship between variables

Functional relationship: one variable is completely determined by the other, $y = f(x)$.
Correlation: the two variables are related, but neither exactly determines the other.

In most cases, two variables cannot be described by an exact function, but we can still analyze how strongly they are correlated.

scatter plot: Visual inspection.
correlation coefficient: computed from the data, it measures the degree of linear correlation between the variables.
$r=\frac{n\sum{xy}-\sum{x}\sum{y}}{\sqrt{n\sum{x^2}-(\sum{x})^2}\cdot\sqrt{n\sum{y^2}-(\sum{y})^2}}$
Here, $r$ is the linear correlation coefficient, also known as Pearson's correlation coefficient.
$\vert r \vert \geq 0.8$ : high correlation
$0.5\leq \vert r \vert \lt 0.8$ : moderate correlation
$0.3\leq \vert r \vert \lt 0.5$ : low correlation
$\vert r \vert \lt 0.3$ : little or no linear correlation
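
The formula and the rule-of-thumb thresholds above can be sketched in R as follows. The data are made up for illustration, and `strength()` is a hypothetical helper, not part of the original notes; `cor()` is base R and is used only to cross-check the manual calculation.

```r
# Illustrative data (made up for this sketch, not from the original notes)
x <- c(1.2, 2.3, 3.1, 4.8, 5.0, 6.7, 7.4, 8.9)
y <- c(2.1, 2.9, 3.8, 5.1, 5.6, 7.2, 7.1, 9.4)
n <- length(x)

# Pearson's r from the computational formula given above
r_manual <- (n * sum(x * y) - sum(x) * sum(y)) /
  (sqrt(n * sum(x^2) - sum(x)^2) * sqrt(n * sum(y^2) - sum(y)^2))

# Cross-check with the built-in function (method = "pearson" is the default)
r_builtin <- cor(x, y)

# Hypothetical helper: map |r| to the rule-of-thumb labels listed above
strength <- function(r) {
  a <- abs(r)
  if (a >= 0.8) {
    "high"
  } else if (a >= 0.5) {
    "moderate"
  } else if (a >= 0.3) {
    "low"
  } else {
    "none"
  }
}

print(c(manual = r_manual, builtin = r_builtin))
print(strength(r_manual))
```

The two values should agree up to floating-point error, which is a quick way to confirm that the formula has been typed correctly.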

Pearson's correlation coefficient assumes that the data are approximately normally distributed. If they are not, we can use Spearman's rank correlation coefficient instead.
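
A minimal sketch of switching between the two coefficients in R, using made-up data that are monotone but clearly not linear. The Shapiro-Wilk test (`shapiro.test()`) is shown as one possible normality check; the original notes do not say which check to use, so treat that choice as an assumption.

```r
# Illustrative data (made up): y increases with x, but exponentially
x <- 1:10
y <- exp(x / 2)

# Pearson measures linear association; Spearman measures monotone association
r_pearson  <- cor(x, y, method = "pearson")
r_spearman <- cor(x, y, method = "spearman")

# Spearman's coefficient is Pearson's coefficient computed on the ranks
r_rank <- cor(rank(x), rank(y))

# One possible normality check (an assumption of this sketch)
normality <- shapiro.test(y)

print(c(pearson = r_pearson, spearman = r_spearman, on_ranks = r_rank))
print(normality$p.value)
```

Because `y` is strictly increasing in `x`, the rank-based coefficient reaches 1 while Pearson's `r` stays below 1.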

significance testing of $r$
population correlation coefficient: $\rho$
sample correlation coefficient: $r$
hypotheses:
$H_0:\rho=0$ ; $H_1:\rho\ne0$
test statistic:
$t = r \sqrt{\frac{n-2}{1-r^2}}\sim t(n-2)$
decision:
If $\vert t \vert>t_{\alpha/2}(n-2)$ , reject $H_0$ : there is a significant linear relationship between the population variables.
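
The test above can be carried out by hand in R and compared with the built-in `cor.test()`. This is a sketch on made-up data with an assumed significance level of 0.05. The statistic is computed with the signed `r`; since the decision rule uses $\vert t \vert$, the sign does not change the conclusion.

```r
# Illustrative data (made up for this sketch)
x <- c(1.2, 2.3, 3.1, 4.8, 5.0, 6.7, 7.4, 8.9)
y <- c(2.1, 2.9, 3.8, 5.1, 5.6, 7.2, 7.1, 9.4)
n <- length(x)
alpha <- 0.05  # assumed significance level

# Test statistic: t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom
r <- cor(x, y)
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# Two-sided critical value and p-value from the t distribution
t_crit  <- qt(1 - alpha / 2, df = n - 2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Decision rule: reject H0 when |t| exceeds the critical value
reject_h0 <- abs(t_stat) > t_crit

# Built-in equivalent (Pearson, two-sided by default)
ct <- cor.test(x, y)

print(c(t = t_stat, t_crit = t_crit, p = p_value))
print(reject_h0)
```

Comparing `abs(t_stat) > t_crit` with `p_value < alpha` gives the same decision; `cor.test()` reports the same statistic and p-value in `ct$statistic` and `ct$p.value`.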

2 Simple linear regression

2.1 Regression model

Population regression model:
$y_i=\beta_0+\beta_1x_i+\varepsilon_i$
Estimated regression equation:
$\hat y_i=\hat\beta_0+\hat\beta_1x_i$
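
As a sketch of how $\hat\beta_0$ and $\hat\beta_1$ might be obtained in R, the block below assumes ordinary least squares estimation and uses made-up data. It computes the estimates from the closed-form least-squares expressions and cross-checks them against `lm()`.

```r
# Illustrative data (made up for this sketch)
x <- c(1.2, 2.3, 3.1, 4.8, 5.0, 6.7, 7.4, 8.9)
y <- c(2.1, 2.9, 3.8, 5.1, 5.6, 7.2, 7.1, 9.4)

# Closed-form least-squares estimates (assumes ordinary least squares)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept

# Fitted values from the estimated regression equation
y_hat <- b0 + b1 * x

# The same fit with the built-in linear model function
fit <- lm(y ~ x)

print(c(intercept = b0, slope = b1))
print(coef(fit))
```

`coef(fit)` returns the intercept first and the slope second, so it can be compared directly with `c(b0, b1)`; `fitted(fit)` matches `y_hat`.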
Assumption:
A1: The relationship between the dependent variable and the independent variable is linear.
A2: The variable x is not random and must take at least two different values.
A3: $E(\varepsilon)=0$
A4: The variance of the error term is constant around the regression line, independent of x: $Var(\varepsilon)=\sigma^2$