Counterfactual Prediction with Recurrent Neural Networks: Sine-Wave Data
1. Treatment adoption schemes
1.1. Simultaneous adoption
#' This function models simultaneous adoption and produces the desired binary mask.
#' @param M Matrix of observed entries. The input should be N (number of units) by T (number of time periods).
#' @param N_t Number of treated units desired.
#' @param T0 The time just before treatment for all treated units. For instance, if T0 = 2, then first two entries of treated units are counted as control and the rest are treated.
#' @param treat_indices Optional indices for treated units. By default, N_t units are sampled at random from all N units. However, the user can manually set some units as treated.
#' @return A binary mask matrix that is one for control units and for treated units before treatment, and zero for treated units after treatment.
#' @examples
#' simul_adapt(M = replicate(5,rnorm(5)), N_t = 3, T0 = 3)
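simul_adapt <- function(M, N_t, T0, treat_indices = 0) {
  N <- nrow(M)
  n_periods <- ncol(M)  # avoid masking T, R's shorthand for TRUE
  treat_mat <- matrix(1L, N, n_periods)
  # Sample treated units unless the caller supplied them
  if (treat_indices[1] == 0) {
    treat_indices <- sample(seq_len(N), N_t)
  }
  # Zero out post-treatment periods; skip when T0 is the last period,
  # because (T0 + 1):n_periods would otherwise count backwards
  if (T0 < n_periods) {
    treat_mat[treat_indices[seq_len(N_t)], (T0 + 1):n_periods] <- 0L
  }
  return(treat_mat)
}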
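To make the mask layout concrete, the following is a minimal standalone sketch of the same idea on a tiny panel. The helper name `simul_mask` and its arguments are hypothetical and used only for illustration; the treated rows are fixed by hand so the result is deterministic.

```r
# Hypothetical helper (not part of the original code): builds a 0/1 mask
# for simultaneous adoption without sampling, so the output is deterministic.
simul_mask <- function(N, n_periods, treated, T0) {
  mask <- matrix(1L, N, n_periods)
  if (T0 < n_periods) {
    mask[treated, (T0 + 1):n_periods] <- 0L  # post-treatment cells of treated units
  }
  mask
}

# 4 units, 5 periods; units 2 and 4 are treated after period 3
mask <- simul_mask(N = 4, n_periods = 5, treated = c(2, 4), T0 = 3)
print(mask)
# Rows 1 and 3 stay all ones; rows 2 and 4 read 1 1 1 0 0
```

Every treated row switches to zero at the same column, which is what distinguishes simultaneous from staggered adoption.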
1.2. Staggered adoption
#' This function models staggered adoption and produces the desired binary mask.
#' @param M Matrix of observed entries. The input should be N (number of units) by T (number of time periods).
#' @param N_t Number of treated units desired.
#' @param T0 The first treatment time. The remaining treatment times are equally spaced between T0 and T.
#' @param treat_indices Optional indices for treated units. By default, N_t units are sampled at random from all N units. However, the user can manually set some units as treated. Note that the indices should be sorted in increasing order of their T0.
#' @return A binary mask matrix that is one for control units and for treated units before treatment, and zero for treated units after treatment.
#' @examples
#' stag_adapt(M = replicate(5,rnorm(5)), N_t = 3, T0 = 3)
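stag_adapt <- function(M, N_t, T0, treat_indices = 0) {
  N <- nrow(M)
  n_periods <- ncol(M)  # avoid masking T, R's shorthand for TRUE
  treat_mat <- matrix(1L, N, n_periods)
  # Sample treated units unless the caller supplied them
  if (treat_indices[1] == 0) {
    treat_indices <- sample(seq_len(N), N_t)
  }
  for (i in seq_len(N_t)) {
    # Last control period of the i-th treated unit, equally spaced from T0
    last_cont_time_pr <- floor(T0 + (n_periods - T0) * (i - 1) / N_t)
    # Skip when no post-treatment period remains (only possible if T0 >= T)
    if (last_cont_time_pr < n_periods) {
      treat_mat[treat_indices[i], (last_cont_time_pr + 1):n_periods] <- 0L
    }
  }
  return(treat_mat)
}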
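The spacing rule floor(T0 + (T - T0) * (i - 1) / N_t) is easier to follow with numbers. The sketch below works it through for an assumed toy setting (10 periods, T0 = 4, 3 treated units); the variable names are illustrative and not taken from the original code.

```r
# Assumed toy setting: 10 periods, first treatment after period 4, 3 treated units
n_periods <- 10
T0 <- 4
N_t <- 3

# Last control period for each treated unit, per the staggered rule
last_control <- floor(T0 + (n_periods - T0) * (seq_len(N_t) - 1) / N_t)
print(last_control)  # 4 6 8

# Build the mask rows for the treated units only
mask <- matrix(1L, N_t, n_periods)
for (i in seq_len(N_t)) {
  mask[i, (last_control[i] + 1):n_periods] <- 0L
}
print(rowSums(mask))  # 4 6 8: each unit keeps that many control periods
```

The first treated unit starts treatment right after T0, and each later unit starts (T - T0) / N_t periods after the previous one, here every 2 periods.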
2. Prediction on sine-wave data
2.1. Setup
2.1.1. Loading packages
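###################################################
# Load packages; library() attaches the named R package.
###################################################
library(MCPanel) # matrix-completion methods for causal panel data models
library(glmnet) # generalized linear models with lasso and elastic-net regularization
library(reshape2) # data reshaping, in particular converting between long and wide formats
library(parallel) # base R support for parallel computation, which can substantially speed up some tasks
library(doParallel) # parallel backend that lets foreach loops run in parallel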
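2.1.2. Parallel cluster
###################################################
# Parallel cluster
###################################################
cores <- parallel::detectCores() # detect the number of CPU cores on this machine and store it in cores
print(paste0('cores registered: ', cores)) # paste0() concatenates strings; print() shows the result
cl <- parallel::makePSOCKcluster(cores) # create a PSOCK-based parallel cluster
doParallel::registerDoParallel(cl) # register the cluster cl as the parallel backend (passing cores here would start a second, separate set of workers)
RNGkind("L'Ecuyer-CMRG") # switch the random number generator to "L'Ecuyer-CMRG", which supports independent parallel streams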
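As a hedged illustration of why the L'Ecuyer-CMRG generator matters in a cluster, the sketch below uses only the base `parallel` package. It is an assumption about how one might seed the workers, not something the original script does: `clusterSetRNGStream()` gives each worker its own stream, and `stopCluster()` releases the workers afterwards. The two-worker cluster, the seed 42, and the helper `draw` are illustrative choices.

```r
library(parallel)

# Assumption: two workers are enough for this illustration
cl_demo <- makePSOCKcluster(2)

# Hypothetical helper: reseed the workers, then draw one normal value per task
draw <- function(seed) {
  clusterSetRNGStream(cl_demo, iseed = seed)  # one L'Ecuyer-CMRG stream per worker
  unlist(parLapply(cl_demo, 1:4, function(i) rnorm(1)))
}

run1 <- draw(42)
run2 <- draw(42)

stopCluster(cl_demo)  # always release the workers when finished

# With the same seed and the same cluster size the two runs are expected to match
print(identical(run1, run2))
```

Note that calling `RNGkind()` on the master process alone does not seed the worker processes, which is why the sketch seeds them explicitly.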
2.2. The SineSim function
Sine-wave data
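###################################################
# Function SineSim(Y, N, T, sim, nruns, d = 'sine')
# Y: all time-series data [Nbig x T]; N: number of units used in the simulation; T: total length of the time series
# sim: logical flag; after the initial treatment time t0, several units adopt treatment, either simultaneously or in stages (staggered)
# nruns: number of simulation runs
# d: defaults to 'sine', meaning the sine-wave time-series data are used
###################################################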
SineSim <- function(Y,N,T,sim,nruns,d='sine'){
Nbig