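### Python and R Code Examples for Survival Analysis with the Cox Proportional Hazards Model
#### A survival analysis with the Cox proportional hazards model involves the following core steps:
1. **Fit the Cox proportional hazards model**
2. **Compute the concordance index (C-index)**
3. **Plot the receiver operating characteristic (ROC) curve**
4. **Plot the decision curve analysis (DCA) curve**
5. **Plot the calibration curve**
The implementation of each step is detailed below.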
---
### I. Implementation in R
#### 1. Fitting the Cox proportional hazards model
```r
library(survival)
# Assumes a data frame named data with columns time (follow-up time), status (event indicator), and the covariates covariate1 and covariate2
cox_model <- coxph(Surv(time, status) ~ covariate1 + covariate2, data = data)
summary(cox_model)
```
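For reference, the model fitted above has the standard Cox form shown below. The notation is introduced here for illustration and does not appear in the original: $h_0(t)$ is the unspecified baseline hazard, $x_1, x_2$ stand for `covariate1` and `covariate2`, and $\beta_j$ are the coefficients reported by `summary(cox_model)`.

$$
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \beta_2 x_2), \qquad \mathrm{HR}_j = \exp(\beta_j)
$$

The `exp(coef)` column of the summary is the hazard ratio $\mathrm{HR}_j$: the multiplicative change in hazard for a one-unit increase in covariate $j$, holding the other covariate fixed.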
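#### 2. Computing the C-index
The `concordance()` function in the `survival` package computes the C-index; it replaces the deprecated `survConcordance()`.
```r
# Applied to a fitted coxph object, concordance() uses the model's linear predictor
c_index <- concordance(cox_model)
print(c_index$concordance)
```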
#### 3. Plotting the ROC curve
The `survivalROC` package draws a time-dependent ROC curve.
```r
library(survivalROC)
roc_result <- survivalROC(Stime = data$time,
                          status = data$status,
                          marker = predict(cox_model),  # linear predictor as the risk score
                          predict.time = 12,            # prediction horizon: 12 months
                          method = "KM")
plot(roc_result$FP, roc_result$TP, type = "l", xlab = "False Positive Rate", ylab = "True Positive Rate", main = "ROC Curve")
abline(a = 0, b = 1, col = "red")  # the diagonal marks a random classifier
```
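#### 4. Plotting the DCA curve
Following the cited reference[^1], the `dcurves` package can be used to draw the DCA curve.
```r
library(survival)
library(dcurves)
library(ggplot2)
# Predicted probability of an event by 12 months (1 - predicted survival)
data$prob1 <- c(1 - summary(survfit(cox_model, newdata = data), times = 12)$surv)
# Compute net benefit across thresholds, then plot the smoothed decision curve
dca_curve <- dcurves::dca(Surv(time, status) ~ prob1, data = data, time = 12) %>%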
dcurves::as_tibble()
ggplot(dca_curve, aes(x=threshold, y=net_benefit, color=variable)) +
stat_smooth(method = "loess", se = FALSE, formula = "y ~ x", span = 0.2) +
coord_cartesian(ylim = c(0, 0.2)) +
scale_x_continuous(labels = scales::label_percent(accuracy = 1)) +
labs(x = "Risk Threshold", y = "Net Benefit", color = "") +
theme_bw()
```
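#### 5. Plotting the calibration curve
With reference to the cited source[^2], a calibration curve can be plotted as follows. Predicted 12-month risk is grouped into quintiles, and the observed risk in each group is estimated with Kaplan-Meier so that censored subjects are handled properly.
```r
library(survival)
library(ggplot2)
# Predicted probability of an event by 12 months (1 - predicted survival)
data$pred12 <- c(1 - summary(survfit(cox_model, newdata = data), times = 12)$surv)
# Quintile groups of predicted risk (cut() fails if the quantile breaks are tied)
data$risk_group <- cut(data$pred12, quantile(data$pred12, 0:5 / 5), include.lowest = TRUE)
# Observed 12-month risk per group from Kaplan-Meier
km <- summary(survfit(Surv(time, status) ~ risk_group, data = data), times = 12, extend = TRUE)
calibration_data <- data.frame(
  predicted = as.numeric(tapply(data$pred12, data$risk_group, mean)),
  observed = 1 - km$surv
)
ggplot(calibration_data, aes(x = predicted, y = observed)) +
  geom_point() +
  geom_line(color = "blue") +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed", color = "red") +
  labs(title = "Calibration Plot", x = "Predicted Probability", y = "Observed Probability") +
  theme_minimal()
```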
---
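### II. Implementation in Python
#### 1. Fitting the Cox proportional hazards model
The `lifelines` library fits the Cox model.
```python
from lifelines import CoxPHFitter
# Data preparation: df is assumed to hold time, status, and the covariate columns
df['event'] = df['status'].astype(bool)  # convert status to a boolean event indicator
cph = CoxPHFitter()
# Drop status before fitting; otherwise lifelines treats it as a covariate
cph.fit(df.drop(columns=['status']), duration_col='time', event_col='event')
# Print the fitted coefficients and hazard ratios
cph.print_summary()
```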
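The fitting code above assumes a DataFrame `df` that already exists. As a minimal sketch for trying the steps end to end, the standard-library snippet below simulates rows with the assumed column names (`time`, `status`, `covariate1`, `covariate2`) under an exponential baseline hazard. The function name, coefficients, and rates are hypothetical choices for illustration and are not taken from any real dataset.

```python
import math
import random


def simulate_survival_rows(n=200, beta1=0.8, beta2=-0.5,
                           baseline_rate=0.05, censor_rate=0.03, seed=0):
    """Simulate right-censored survival data under proportional hazards.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0, 1)
        x2 = rng.gauss(0, 1)
        # Exponential event time whose rate is scaled by exp(beta . x)
        rate = baseline_rate * math.exp(beta1 * x1 + beta2 * x2)
        event_time = rng.expovariate(rate)
        # Independent exponential censoring time
        censor_time = rng.expovariate(censor_rate)
        rows.append({
            "time": min(event_time, censor_time),
            "status": int(event_time <= censor_time),  # 1 = event, 0 = censored
            "covariate1": x1,
            "covariate2": x2,
        })
    return rows


rows = simulate_survival_rows()
```

Passing `rows` to `pandas.DataFrame(rows)` should give a frame in the shape the examples in this section expect.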
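#### 2. Computing the C-index
Call the `concordance_index` function from `lifelines.utils`. It expects scores that increase with survival time, so the partial hazard is negated.
```python
from lifelines.utils import concordance_index
# Higher partial hazard means shorter expected survival, hence the minus sign
predictions = -cph.predict_partial_hazard(df)
c_index = concordance_index(df['time'], predictions, df['event'])
print("C-index:", c_index)
```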
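To make clear what the C-index measures, here is a minimal pure-Python sketch of Harrell's C. It is an illustration, not the `lifelines` implementation: the function name and toy data are hypothetical, tied event times are simply skipped, and the scores here are risk scores (higher means earlier expected event), which is the opposite sign convention from the scores passed to `concordance_index`.

```python
def harrell_c_index(times, scores, events):
    """Fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's last follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # pairs are anchored on a subject with an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0  # earlier event has the higher risk score
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: the third subject is censored at time 6
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c_index(toy_times, toy_scores, toy_events)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.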
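#### 3. Plotting the ROC curve
Use `sklearn.metrics`. Because `roc_curve` needs a binary label, the outcome is defined as an event within 12 months, and subjects censored before that horizon are excluded (a simple approximation that ignores censoring bias).
```python
import numpy as np
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt
horizon = 12  # prediction horizon, in the same units as df['time']
# Outcome at the horizon is known: an event occurred, or follow-up exceeds the horizon
known = (df['time'] > horizon) | df['event']
y_true = ((df['time'] <= horizon) & df['event'])[known]
fpr, tpr, thresholds = roc_curve(y_true=y_true, y_score=cph.predict_partial_hazard(df[known]))
roc_auc = auc(fpr, tpr)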
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f'ROC curve (area = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], 'k--') # Random guess line
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC)')
plt.legend(loc='lower right')
plt.show()
```
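#### 4. Plotting the DCA curve
No dedicated package is required; net benefit can be computed directly with NumPy. At threshold `pt` it equals `TP/n - FP/n * pt / (1 - pt)`, where the binary outcome is an event within 12 months (subjects censored earlier are excluded, as an approximation).
```python
import numpy as np
import matplotlib.pyplot as plt
# Prepare the data: binary 12-month outcome and predicted 12-month risk
horizon = 12
known = (df['time'] > horizon) | df['event']
y = ((df['time'] <= horizon) & df['event'])[known].to_numpy()
risk = 1 - cph.predict_survival_function(df[known], times=[horizon]).iloc[0].to_numpy()

def compute_net_benefit(threshold):
    treated = risk >= threshold  # subjects the model would flag at this threshold
    true_pos = np.sum(treated & y) / len(y)
    false_pos = np.sum(treated & ~y) / len(y)
    return true_pos - false_pos * threshold / (1 - threshold)

risk_thresholds = np.linspace(0.01, 0.99, 99)  # avoid 1.0, where the odds are undefined
nb = [compute_net_benefit(threshold) for threshold in risk_thresholds]  # net benefit values
# Plot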
plt.plot(risk_thresholds, nb, label='Model')
plt.axhline(0, linestyle='--', color='gray')
plt.xlabel('Threshold Probability')
plt.ylabel('Net Benefit')
plt.title('Decision Curve Analysis (DCA)')
plt.legend()
plt.show()
```
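A decision curve such as the one plotted above is usually judged against two reference strategies: treating everyone and treating no one (net benefit fixed at 0). The sketch below, in plain Python with hypothetical toy data and function names, shows the net benefit formula `TP/n - FP/n * pt / (1 - pt)` next to the treat-all reference. It assumes a binary outcome and ignores censoring.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating subjects whose predicted risk >= threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y)
    false_pos = sum(1 for y, r in zip(y_true, risk) if r >= threshold and not y)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit when every subject is treated regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = event) and predicted risks for ten subjects
toy_outcome = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
toy_risk = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.2, 0.5, 0.8]

nb_model = net_benefit(toy_outcome, toy_risk, 0.5)
nb_all = net_benefit_treat_all(toy_outcome, 0.5)
```

A model is useful at a given threshold only when its net benefit exceeds both the treat-all value and 0, the treat-none value.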
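#### 5. Plotting the calibration curve
Use `calibration_curve` from `scikit-learn`, comparing the predicted 12-month event probability with the binary 12-month outcome (subjects censored before 12 months are excluded, as an approximation).
```python
from sklearn.calibration import calibration_curve
import matplotlib.pyplot as plt
horizon = 12
# Outcome at the horizon is known: an event occurred, or follow-up exceeds the horizon
known = (df['time'] > horizon) | df['event']
fraction_of_positives, mean_predicted_value = calibration_curve(
    y_true=((df['time'] <= horizon) & df['event'])[known],
    # Predicted probability of an event by the horizon: 1 - S(horizon)
    y_prob=1 - cph.predict_survival_function(df[known], times=[horizon]).iloc[0].to_numpy(),
    n_bins=10
)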
plt.figure(figsize=(8, 6))
plt.plot(mean_predicted_value, fraction_of_positives, marker='o', linewidth=2, label='Calibration')
plt.plot([0, 1], [0, 1], linestyle='--', color='black', label='Perfectly Calibrated')
plt.xlabel('Mean Predicted Value')
plt.ylabel('Fraction of Positives')
plt.title('Calibration Plot')
plt.legend()
plt.show()
```
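A calibration plot compares predicted with observed risk, and with censored data the observed risk at a horizon is commonly estimated with Kaplan-Meier rather than a raw event fraction. The following is a minimal standard-library sketch of that estimate for one group of subjects; the function name and toy data are hypothetical.

```python
def km_survival_at(times, events, horizon):
    """Kaplan-Meier estimate of the survival probability at `horizon`."""
    survival = 1.0
    # Distinct observed event times up to the horizon, in increasing order
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under follow-up at t
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1 - deaths / at_risk
    return survival


# Hypothetical follow-up times in months; 0 marks a censored subject
group_times = [3, 5, 8, 10, 14, 20]
group_events = [1, 0, 1, 1, 0, 1]

observed_risk_12 = 1 - km_survival_at(group_times, group_events, 12)
```

Applying this to each bin of predicted risk gives the observed values for the vertical axis, without discarding subjects censored before the horizon.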
---
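### Summary
The sections above show how to carry out a Cox proportional hazards analysis in both R and Python: model fitting, C-index computation, and plotting ROC, DCA, and calibration curves. The R route uses dedicated survival packages (`survival`, `survivalROC`, `dcurves`), while the Python route combines `lifelines` with general-purpose tools such as `scikit-learn`; choose whichever fits your workflow.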
---