P-value
The p-value is a measure used in statistical hypothesis testing to help you determine the significance of your results. Here’s a step-by-step breakdown of what it represents:
- Null Hypothesis (H0): This is the default assumption that there is no effect or no difference. For example, if you’re testing whether a new drug is effective, the null hypothesis might state that the new drug has no effect compared to a placebo.
- Alternative Hypothesis (H1): This is the hypothesis that there is an effect or a difference. Continuing with the drug example, the alternative hypothesis would state that the new drug is effective.
- Calculation of the P-value: When you perform a statistical test, you calculate a p-value which quantifies the evidence against the null hypothesis. The p-value represents the probability of obtaining test results at least as extreme as the observed data, assuming that the null hypothesis is true.
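Note: Assume the null hypothesis is true. The test statistic computed from the observed data then follows a known distribution, and the p-value is the probability, under that distribution, of getting a statistic at least as extreme as the one actually observed. To reject the null hypothesis, alpha must be at least as large as p. Alpha is also the Type I error rate: the probability of rejecting the null hypothesis when it is in fact true (claiming an effect when there is none).
- Interpretation: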
- Low P-value (typically ≤ 0.05): This suggests that the observed data is unlikely under the null hypothesis, leading you to reject the null hypothesis in favor of the alternative hypothesis.
- High P-value (typically > 0.05): This indicates that the observed data is consistent with the null hypothesis, so there is not enough evidence to reject it.
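Note: If a test shows a 5% difference between groups A and B, we cannot conclude that the 5% figure itself is significant. The null hypothesis is that there is no difference, so a significant result only rejects "no difference"; it does not confirm that the true difference is 5%.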
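The calculation and decision rule above can be sketched in a few lines. This is a minimal illustration, not a general recipe: it assumes a two-sided one-sample z-test with a known population standard deviation, and the function name `two_sided_p_value` and all numbers are made up for the example.

```python
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sigma, n):
    """P-value of a two-sided one-sample z-test (sigma assumed known)."""
    standard_error = sigma / n ** 0.5
    z = (sample_mean - null_mean) / standard_error
    # Probability, under H0, of a statistic at least as extreme as |z|.
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # Type I error rate, chosen before looking at the data
p = two_sided_p_value(sample_mean=103.0, null_mean=100.0, sigma=15.0, n=100)
reject_h0 = p <= alpha  # reject only when alpha is at least as large as p
print(f"p = {p:.4f}, reject H0: {reject_h0}")
```

Here the z statistic is 2.0, so the p-value is about 0.0455 and H0 is rejected at alpha = 0.05. Rejecting H0 only says the mean differs from 100; it does not establish that the true shift is exactly 3.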
Confidence Level
The confidence level is associated with confidence intervals and reflects how confident you are that a parameter lies within a specified range. Here’s how it works:
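Note: A confidence interval is a range computed from the sample data; at a given confidence level, the procedure that produces it captures the population parameter.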
- Confidence Interval: This is a range of values, derived from the sample data, that is likely to contain the true population parameter (e.g., mean, proportion) with a certain level of confidence.
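- Confidence Level: This is the long-run proportion of confidence intervals, built by the same procedure over repeated samples, that contain the true population parameter. It describes the procedure rather than any single computed interval. Common confidence levels are 90%, 95%, and 99%.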
- 95% Confidence Level: If you were to take 100 different samples and compute a confidence interval from each sample, approximately 95 of those intervals would contain the true population parameter.
- Interpretation: A higher confidence level means that you can be more certain that the interval contains the parameter, but it also results in a wider interval. Conversely, a lower confidence level means a narrower interval but less certainty.
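The repeated-sampling reading of the confidence level can be checked with a small simulation. The sketch below assumes normally distributed data with a known standard deviation (a z-interval); the helper names `z_interval` and `coverage` and the parameter values are hypothetical, chosen only for illustration.

```python
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma, confidence):
    """Confidence interval for the mean, assuming a known sigma."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_crit * sigma / len(sample) ** 0.5
    centre = fmean(sample)
    return centre - margin, centre + margin

def coverage(true_mean, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, confidence)
        hits += low <= true_mean <= high
    return hits / trials

print(coverage(true_mean=50.0, sigma=10.0, n=30, confidence=0.95, trials=2000))
```

The printed fraction should land close to 0.95. Calling `z_interval` on the same sample with `confidence=0.99` gives a wider interval than with `confidence=0.95`, which is the trade-off described above.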
Relationship Between P-value and Confidence Level
Both concepts are related to statistical inference but serve different purposes:
- The p-value helps you decide whether to reject the null hypothesis.
- The confidence level helps you estimate a range within which the true parameter lies.
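For a matching test and interval, the two tools also agree with each other: a two-sided test at level alpha rejects H0 exactly when the (1 - alpha) confidence interval excludes the null value. The sketch below illustrates this for a z-test with a known standard deviation; the function `z_test_and_interval` and the numbers are assumptions made for the example, not part of the notes above.

```python
from statistics import NormalDist

def z_test_and_interval(sample_mean, null_mean, sigma, n, alpha=0.05):
    """Return (p_value, (low, high)) for a two-sided z-test, sigma known."""
    std_normal = NormalDist()
    standard_error = sigma / n ** 0.5
    z = (sample_mean - null_mean) / standard_error
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Interval at confidence level 1 - alpha, centred on the sample mean.
    margin = std_normal.inv_cdf(1 - alpha / 2) * standard_error
    return p_value, (sample_mean - margin, sample_mean + margin)

for observed in (103.0, 101.0):
    p, (low, high) = z_test_and_interval(observed, 100.0, 15.0, 100)
    excludes_null = not (low <= 100.0 <= high)
    print(f"mean={observed}: p={p:.4f}, CI=({low:.2f}, {high:.2f}), "
          f"rejects H0={p < 0.05}, CI excludes 100={excludes_null}")
```

With an observed mean of 103 the p-value falls below 0.05 and the 95% interval excludes 100; with an observed mean of 101 neither happens.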