Reading notes on "Using recurrent neural network models for early detection of heart failure onset"

qlzyssm

Category: paper note. Tags: artificial intelligence, machine learning, deep learning

Article link: https://blog.youkuaiyun.com/qlzyssm/article/details/105990673

Contents

1. Introduction
2. Related Work
3. Method
4. Experiment

1. Introduction

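This paper was published in JAMIA (Journal of the American Medical Informatics Association) in 2017. The authors use a GRU to model the temporal relations in EHR data and obtain good results on the task of early detection of heart failure (HF).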

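An electronic health record (EHR) is an individual's health record. It contains a wealth of information that is useful for medical diagnosis, such as diagnosis codes, medication codes, and procedure codes. However, because the information captured in EHR data is structurally complex and broad in scope, it is hard to exploit effectively.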

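Most earlier models built on EHR data relied on aggregate features such as event counts and event averages. Aggregation discards the temporal relations among the underlying events, e.g. medication ordered at one time and procedure performed at another. The authors therefore borrow the RNN models widely used in NLP to model these temporal relations. On the HF early detection task the resulting model outperforms the earlier models, which indicates that temporal relations among events matter for early detection of HF.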
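To make the limitation of aggregate features concrete, here is a minimal sketch. The code names and the two patient histories are hypothetical and are not taken from the paper. It shows that two patients with the same events in a different order get identical count features.

```python
from collections import Counter

# Hypothetical code names, used only for illustration.
patient_a = ["DX_hypertension", "RX_diuretic", "PROC_echocardiogram"]
patient_b = ["PROC_echocardiogram", "RX_diuretic", "DX_hypertension"]


def aggregate_features(events):
    """Count how often each code occurs, discarding the order of events."""
    return Counter(events)


# The count features are identical although the event order differs.
same_counts = aggregate_features(patient_a) == aggregate_features(patient_b)
same_sequence = patient_a == patient_b

print(same_counts)    # True: counts cannot tell the two patients apart
print(same_sequence)  # False: the ordered sequences are different
```

A sequence model receives the ordered list rather than the counts, so it can in principle distinguish the two histories.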

2. Related Work

linear models for low-dimensional data
Cox proportional hazard model
hidden Markov model
2-dimensional continuous-time hidden Markov model
graphical models with the Gaussian process
Markov jump process
Hawkes process

3. Method

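For the model itself, the authors chose a GRU to model the temporal relations in EHR data. The architecture is fairly simple, so these notes focus on how the authors represent medical concepts in the model.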
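As a rough illustration of how a GRU consumes a sequence of events, the sketch below implements the standard GRU update in plain Python and applies a logistic output to the final hidden state. It is not the authors' implementation: the dimensions, the random untrained weights, and the choice to read out only the last hidden state are assumptions made for this example.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_params(input_dim, hidden_dim, seed=0):
    """Random, untrained weights (assumption: illustration only)."""
    rng = random.Random(seed)
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = rand_matrix(hidden_dim, input_dim, rng)
        p["U" + gate] = rand_matrix(hidden_dim, hidden_dim, rng)
        p["b" + gate] = [0.0] * hidden_dim
    p["w_out"] = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
    p["b_out"] = 0.0
    return p


def gru_step(p, x, h_prev):
    """One GRU update: gates z and r, candidate state, new hidden state."""
    def pre_activation(gate, h_in):
        a = matvec(p["W" + gate], x)
        b = matvec(p["U" + gate], h_in)
        return [ai + bi + ci for ai, bi, ci in zip(a, b, p["b" + gate])]

    z = [sigmoid(v) for v in pre_activation("z", h_prev)]  # update gate
    r = [sigmoid(v) for v in pre_activation("r", h_prev)]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(v) for v in pre_activation("h", gated)]
    # Convention used here: h_t = z * h_prev + (1 - z) * h_cand.
    return [zi * hp + (1.0 - zi) * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


def predict(p, sequence):
    """Run the GRU over the sequence, then apply a logistic output."""
    h = [0.0] * len(p["bz"])
    for x in sequence:
        h = gru_step(p, x, h)
    logit = sum(w * hi for w, hi in zip(p["w_out"], h)) + p["b_out"]
    return sigmoid(logit)


params = init_params(input_dim=4, hidden_dim=3)
visits = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0]]  # toy one-hot inputs
risk = predict(params, visits)
risk_reversed = predict(params, visits[::-1])
print(risk, risk_reversed)
```

Because the hidden state is updated step by step, feeding the same events in reverse order generally gives a different score, which is the property that count-based features lack. The printed values come from untrained weights and carry no clinical meaning.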

One-hot encoding

The simplest approach borrows from NLP: each medical concept is represented as a one-hot vector.
[Figure: one-hot encoding of medical concepts]
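A minimal sketch of one-hot encoding follows, using a hypothetical four-code vocabulary. The real vocabulary has 18 181 unique codes, so each vector would have 18 181 dimensions.

```python
# Hypothetical vocabulary of medical codes, for illustration only.
vocabulary = [
    "DX_hypertension",
    "DX_diabetes",
    "RX_diuretic",
    "PROC_echocardiogram",
]
code_to_index = {code: i for i, code in enumerate(vocabulary)}


def one_hot(code):
    """Return a vector with a single 1 at the position of the given code."""
    vector = [0] * len(vocabulary)
    vector[code_to_index[code]] = 1
    return vector


# A patient's history becomes a sequence of one-hot vectors.
events = ["DX_hypertension", "RX_diuretic", "PROC_echocardiogram"]
sequence = [one_hot(code) for code in events]
print(sequence)
# [[1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

Every pair of distinct one-hot vectors is equally far apart, so this representation carries no notion of similarity between related codes.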

Grouped code vectors
Medical concept vector

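To build the medical concept vectors, the authors borrow the embedding technique from NLP and learn the vectors from the training corpus with skip-gram.
[Figure: learning medical concept vectors with skip-gram]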
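Skip-gram learns vectors by training each code to predict the codes that occur near it. The sketch below only shows how (center, context) training pairs could be generated from one patient's code sequence. The code names and the window size are hypothetical, and the optimisation step that actually produces the vectors is omitted.

```python
def skipgram_pairs(sequence, window=2):
    """Pair every code with the codes at most `window` positions away."""
    pairs = []
    for i, center in enumerate(sequence):
        start = max(0, i - window)
        stop = min(len(sequence), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


# Hypothetical code sequence for one patient, in temporal order.
patient_codes = [
    "DX_hypertension",
    "RX_diuretic",
    "PROC_echocardiogram",
    "DX_heart_failure",
]

pairs = skipgram_pairs(patient_codes, window=1)
for center, context in pairs:
    print(center, "->", context)
```

Codes that tend to appear in similar contexts end up with similar vectors, which is what makes this representation more informative than one-hot vectors.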

4. Experiment

Dataset

From random samples of 265 336 Sutter-PAMF patients, 4178 incident HF cases and 29 139 control patients were identified. The average number of clinical codes assigned to each patient was approximately 72, and there were 18 181 unique clinical codes (6910 diagnosis codes, 6897 medication codes, and 4374 procedure codes) in total. The full sample of 265 336 was used for training the medical concept vectors, and the incident HF cases and controls were used for all other model training and evaluation tasks.
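A quick arithmetic check of the figures quoted above, as a small sketch: the three code counts should add up to the stated total, and the case and control counts give the approximate class balance. The variable names are introduced here for illustration.

```python
# Numbers copied from the dataset description above.
diagnosis_codes, medication_codes, procedure_codes = 6910, 6897, 4374
cases, controls = 4178, 29139

total_codes = diagnosis_codes + medication_codes + procedure_codes
controls_per_case = controls / cases
case_fraction = cases / (cases + controls)

print(total_codes)  # 18181, matching the stated number of unique codes
print(round(controls_per_case, 2))
print(round(case_fraction, 3))
```

There are roughly seven controls per HF case, so the evaluation set is imbalanced, which is worth keeping in mind when reading the reported metrics.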