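Notes from experience: using R's mice package to apply different imputation methods to different variables, and to skip selected variables, in multiple imputation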

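This post shows how to run multiple imputation on a dataset with R's mice package. It focuses on imputing only the variables with many missing values and on choosing a specific imputation method for each variable, such as the "cart" method for variables affected by collinearity. It also covers how the method argument avoids needless work on complete variables, with examples that depend on the order of the variable names.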

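While using R's mice package for multiple imputation, I ran into two specific needs: (1) impute only some of the variables, namely those with many missing values, rather than all of them, while still keeping every variable in the dataset; (2) use a different imputation method for different variables (for example, the "cart" method for variables affected by multicollinearity).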

After a long search online I finally found the solution in one paper (cited at the end). The relevant passage is recorded below.


Imputations can be created as


R> imp <- mice(nhanes2, me = c("polyreg", "pmm", "logreg", "norm"))

where function mice.impute.polyreg() is used to impute the first (categorical) variable age, mice.impute.pmm() for the second numeric variable bmi, function mice.impute.logreg() for the third binary variable hyp and function mice.impute.norm() for the numeric variable chl. The me parameter is a legal abbreviation of the method argument.
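The positional vector above can also be written by variable name, which may be easier to maintain. The following is a minimal sketch, assuming a mice version that provides make.method() (3.0 or later, as far as I know). The choice of "cart" for bmi and "norm" for chl, and the seed value, are illustrative assumptions rather than recommendations from the paper.

```r
library(mice)

# Start from the default method for each column of nhanes2;
# complete columns (here: age) get the empty method "".
meth <- make.method(nhanes2)

# Override methods by name instead of by position (illustrative choices).
meth["bmi"] <- "cart"   # classification and regression trees
meth["chl"] <- "norm"   # Bayesian linear regression

# m = 5 imputed datasets; seed fixed only for reproducibility.
imp <- mice(nhanes2, method = meth, m = 5, seed = 123, printFlag = FALSE)

# Methods actually used, one entry per column.
print(imp$method)

# First completed dataset.
completed <- mice::complete(imp, 1)
```

Because the vector is named, a method cannot end up attached to the wrong column when the column order of the data changes.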


The mice() function will automatically skip imputation of variables that are complete. One of the problems in previous versions of this function was that all incomplete data needed to be imputed. In mice 2.9 it is possible to skip imputation of selected incomplete variables by specifying the empty method "". This works as long as the incomplete variable that is skipped is not being used as a predictor for imputing other variables. The mice() function will detect this case, and automatically remove the variable from the predictor list. For example, suppose that we do not want to impute bmi, but still want to retain it in the imputed data. We can run the following

R> imp <- mice(nhanes2, meth = c("", "", "logreg", "norm"))

This statement runs because bmi is removed from the predictor list. When removal is not possible, the program aborts with an error message like Error in check.predictorMatrix(predictorMatrix, method, varnames, nmis, : Variable bmi is used, has missing values, but is not imputed.
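The quoted passage describes mice 2.9, and newer versions may treat a skipped predictor differently, so it can be safer to remove the skipped variable from the predictor matrix explicitly. The following is a minimal sketch, assuming make.method() and make.predictorMatrix() are available (mice 3.0 or later, as far as I know). The seed value is arbitrary.

```r
library(mice)

meth <- make.method(nhanes2)           # default method per column
pred <- make.predictorMatrix(nhanes2)  # rows = targets, columns = predictors

# Skip imputation of bmi but keep it in the data.
meth["bmi"] <- ""

# Do not use the still-incomplete bmi as a predictor for any other variable.
pred[, "bmi"] <- 0

imp <- mice(nhanes2, method = meth, predictorMatrix = pred,
            m = 5, seed = 123, printFlag = FALSE)

completed <- mice::complete(imp, 1)

# Missing values remaining per column after imputation.
print(colSums(is.na(completed)))
```

With this setup bmi should keep its missing values in the completed data, while hyp and chl are imputed from the remaining predictors.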

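*Note: the method vector must follow the column order of the variables in the dataset, with exactly one entry per variable and none left out. Otherwise mice() is likely to stop with an error such as the one above.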
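To check the alignment before calling mice(), it can help to list each column next to its missing-value share and its assigned method. The sketch below also picks the variables to skip by missing rate, which matches need (1) at the top of this post. The threshold value and the object names (miss_frac, skip, overview) are hypothetical and chosen only for illustration.

```r
library(mice)

# Share of missing values per column, in the column order of the data.
miss_frac <- colMeans(is.na(nhanes2))

# Hypothetical cut-off: impute only variables with at least this share missing.
threshold <- 0.35

meth <- make.method(nhanes2)
pred <- make.predictorMatrix(nhanes2)

# Incomplete variables below the cut-off are kept in the data but not imputed,
# and are dropped as predictors because they still contain missing values.
skip <- names(miss_frac)[miss_frac > 0 & miss_frac < threshold]
meth[skip] <- ""
pred[, skip] <- 0

# One row per column: name, missing share, and the method that will be used.
overview <- data.frame(variable  = names(nhanes2),
                       miss_frac = round(miss_frac, 2),
                       method    = unname(meth[names(nhanes2)]))
print(overview)
```

The resulting meth and pred can then be passed to mice() through the method and predictorMatrix arguments.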


Reference: Van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1-67.

https://www.jstatsoft.org/article/download/v045i03/550
