参考
首先设定:
set search_path to mimiciii;
概述
- 从各种数据库对象(表,视图等)获取元数据;
- 在单个表上执行基本查询;
- 执行基本的“联接”以合并表并提取有用的信息;
- 使用数据库“视图”提取高级信息。
数据库元数据
可以在Postgres中使用获取特定表的元数据\d+ <schema>.<tablename>。例如,以下命令输出准入表的元数据:
\d+ MIMICIII.ADMISSIONS;
患者编号
SELECT *
FROM patients;
键入q,推出查询的结果。
通过以下查询获取患者人数:
SELECT *
FROM patients;
“性别”列标识患者的性别。我们可以使用以下查询获取用于表示性别的不同值:
SELECT DISTINCT(gender)
FROM patients;
我们可以看到“ M”和“ F”是用来表示患者性别的两个字符。通过添加条件以选择性别为“ F”的行,我们可以使用此信息来获得女性患者的数量:
SELECT COUNT(*)
FROM patients
WHERE gender = 'F';
可以使用以下查询获得男性和女性患者的数量:
SELECT gender, COUNT(*)
FROM patients
GROUP BY gender;
死亡率和住院
在患者表中存储有用于记录患者是否在医院死亡的标志。使用以下查询统计死亡的患者人数:
SELECT expire_flag, COUNT(*)
FROM patients
GROUP BY expire_flag;
该数据库还在“ dod_hosp”列中包含在医院内死亡的患者的死亡日期,以及在“ dod_ssn”中的社会保障死亡记录中找到的死亡日期。来自两列的此信息都合并到“ dod”列中,优先级赋予“ dod_hosp”。请注意,该数据库包含将影响死亡率统计的成年和新生儿患者。下一部分将对患者进行分类。
患者年龄和死亡率
要确定成人死亡率,我们必须首先选择成人患者。我们将成年人定义为首次入院时年龄在15岁以上的患者。要执行此查询,我们必须结合patients表 和 admissions表来查找患者的入院日期和出生日期。我们用别名“ a”表示“admissions”,用别名“ p”表示“patients”:
SELECT p.subject_id, p.dob, a.hadm_id,
a.admittime, p.expire_flag
FROM admissions a
INNER JOIN patients p
ON p.subject_id = a.subject_id;
接下来,我们找到每个患者的最早入院日期。这需要使用两个函数:“ MIN”函数,该函数获取最小值;“ PARTITION BY”函数,其确定获取该最小值的组。确定每个患者的最早入院时间:
SELECT p.subject_id, p.dob, a.hadm_id,
a.admittime, p.expire_flag,
MIN (a.admittime) OVER (PARTITION BY p.subject_id) AS first_admittime
FROM admissions a
INNER JOIN patients p
ON p.subject_id = a.subject_id
ORDER BY a.hadm_id, p.subject_id;
病人的年龄由出生日期和首次入院日期之间的差值确定。我们可以通过将上面的查询与另一个查询结合以提供年龄来获得此信息。此外,我们将类别分配给不同的年龄:> = 15岁的成年人,其余的则分配其他类别。注意该WITH子句的使用,它使我们可以创建一个临时视图,可以在随后的行中查询该视图。
WITH first_admission_time AS
(
SELECT
p.subject_id, p.dob, p.gender
, MIN (a.admittime) AS first_admittime
, MIN( ROUND( (cast(admittime as date) - cast(dob as date)) / 365.242,2) )
AS first_admit_age
FROM patients p
INNER JOIN admissions a
ON p.subject_id = a.subject_id
GROUP BY p.subject_id, p.dob, p.gender
ORDER BY p.subject_id
)
SELECT
subject_id, dob, gender
, first_admittime, first_admit_age
, CASE
-- all ages > 89 in the database were replaced with 300
WHEN first_admit_age > 89
then '>89'
WHEN first_admit_age >= 14
THEN 'adult'
WHEN first_admit_age <= 1
THEN 'neonate'
ELSE 'middle'
END AS age_group
FROM first_admission_time
ORDER BY subject_id
我在使用该语句的时候,SQL Shell中一直没有输出,所以在pgAdmin 4 中运行。
输入密码(前面安装的时候定义的密码,我的是推荐的):postgres
按顺序点击,得到输入SQL语句的输入框:
首先输入:
set search_path to mimiciii;
然后输入查询的语句,点击运行,在下方的输出框(Data Output)就有结果。
现在可以将上面的查询与前面所述的WHERE和COUNT函数结合起来,以确定成人患者的数量,他们是否死亡以及因此确定他们的死亡率。
WITH first_admission_time AS
(
SELECT
p.subject_id, p.dob, p.gender
, MIN (a.admittime) AS first_admittime
, MIN( ROUND( (cast(admittime as date) - cast(dob as date)) / 365.242,2) )
AS first_admit_age
FROM patients p
INNER JOIN admissions a
ON p.subject_id = a.subject_id
GROUP BY p.subject_id, p.dob, p.gender
ORDER BY p.subject_id
)
, age as
(
SELECT
subject_id, dob, gender
, first_admittime, first_admit_age
, CASE
-- all ages > 89 in the database were replaced with 300
-- we check using > 100 as a conservative threshold to ensure we capture all these patients
WHEN first_admit_age > 100
then '>89'
WHEN first_admit_age >= 14
THEN 'adult'
WHEN first_admit_age <= 1
THEN 'neonate'
ELSE 'middle'
END AS age_group
FROM first_admission_time
)
select age_group, gender
, count(subject_id) as NumberOfPatients
from age
group by age_group, gender
请注意,没有“middle”患者出现-这反映出MIMIC-III不包含儿科患者数据的事实。
重症监护病房
在MIMIC-III数据库中,如果将患者转移到病房后24小时内返回ICU病房,我们将ICU住院时间定义为连续。患者ICU的转移,记录在transfers表中:
SELECT *
FROM transfers;
“ prev_careunit”和“ curr_careunit”分别包含先前和当前护理单元的名称。转移表还包括“ prev_wardid”和“ curr_wardid”列,它们分别包含先前护理单位和当前护理单位的ID。指定病房内病房的病房ID没有相应的密钥,以保护患者的健康信息。
转移表可以为每个患者提供多个条目,以提供医院各个护理单位之间所有移动的详细信息。第一次进入ICU的患者在转移表中的“ prev_careunit”列中将没有任何内容。同样,患者的最后一次转移在“ curr_careunit”中将没有任何内容。在先前和当前careunit列中均没有任何条目的表示该患者已在非重症监护室之间转移。下面显示了对一名患者的查询示例以及转移表的结果。请注意,“ intime”,“ outtime”,“ los”等列已被截断,只贴出前几列。
SELECT *
FROM transfers
WHERE HADM_ID = 112213;
服务
Services是MIMIC-III中新添加的表格,其中包含有关在患者住院期间从一项服务转移到另一项服务的信息。服务表包含的列包括“ prev_service”和“ curr_service”,分别包含先前和当前服务的名称。“转移时间”是患者从“ prev_service”移到“ curr_service”的时间。
教程问题
- 检索“ icustays”表中的“ subject_id”,“ hadm_id”,“ icustay_id”,“ intime”和“ outtime”
SELECT ie.subject_id, ie.hadm_id, ie.icustay_id,
ie.intime, ie.outtime
FROM icustays ie;
2. 使用患者表检索计算出的患者年龄。
SELECT ie.subject_id, ie.hadm_id, ie.icustay_id,
ie.intime, ie.outtime,
ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) AS age
FROM icustays ie
INNER JOIN patients pat
ON ie.subject_id = pat.subject_id;
3.将新生儿从成年患者中分开。
SELECT ie.subject_id, ie.hadm_id, ie.icustay_id,
ie.intime, ie.outtime,
ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) AS age,
CASE
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 1
THEN 'neonate'
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 14
THEN 'middle'
-- all ages > 89 in the database were replaced with 300
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) > 100
then '>89'
ELSE 'adult'
END AS ICUSTAY_AGE_GROUP
FROM icustays ie
INNER JOIN patients pat
ON ie.subject_id = pat.subject_id;
4. 通过合并入院表,找出患者入住ICU 之前每次住院的时间
SELECT ie.subject_id, ie.hadm_id, ie.icustay_id,
ie.intime, ie.outtime,
ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) as age,
ROUND((cast(ie.intime as date) - cast(adm.admittime as date))/365.242, 2) as preiculos,
CASE
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 1
THEN 'neonate'
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 14
THEN 'middle'
-- all ages > 89 in the database were replaced with 300
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) > 100
THEN '>89'
ELSE 'adult'
END AS ICUSTAY_AGE_GROUP
FROM icustays ie
INNER JOIN patients pat
ON ie.subject_id = pat.subject_id
INNER JOIN admissions adm
ON ie.hadm_id = adm.hadm_id;
5. 接下来,找到患者的死亡日期(如果适用)
SELECT ie.subject_id, ie.hadm_id, ie.icustay_id,
ie.intime, ie.outtime, adm.deathtime,
ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) as age,
ROUND((cast(ie.intime as date) - cast(adm.admittime as date))/365.242, 2) AS preiculos,
CASE
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 1
THEN 'neonate'
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 14
THEN 'middle'
-- all ages > 89 in the database were replaced with 300
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) > 100
THEN '>89'
ELSE 'adult'
END AS ICUSTAY_AGE_GROUP
FROM icustays ie
INNER JOIN patients pat
ON ie.subject_id = pat.subject_id
INNER JOIN admissions adm
ON ie.hadm_id = adm.hadm_id;
6. 找到那些在医院住院期间发生的死亡
SELECT ie.subject_id, ie.hadm_id, ie.icustay_id,
ie.intime, ie.outtime, adm.deathtime,
ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) AS age,
ROUND((cast(ie.intime as date) - cast(adm.admittime as date))/365.242, 2) AS preiculos,
CASE
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 1
THEN 'neonate'
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 14
THEN 'middle'
-- all ages > 89 in the database were replaced with 300
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) > 100
THEN '>89'
ELSE 'adult'
END AS ICUSTAY_AGE_GROUP,
-- note that there is already a "hospital_expire_flag" field in the admissions table which you could use
CASE
WHEN adm.hospital_expire_flag = 1 then 'Y'
ELSE 'N'
END AS hospital_expire_flag
FROM icustays ie
INNER JOIN patients pat
ON ie.subject_id = pat.subject_id
INNER JOIN admissions adm
ON ie.hadm_id = adm.hadm_id;
7. 查找这些死亡中有多少发生在ICU中.
SELECT ie.subject_id, ie.hadm_id, ie.icustay_id,
ie.intime, ie.outtime, adm.deathtime,
ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) AS age,
ROUND((cast(ie.intime as date) - cast(adm.admittime as date))/365.242, 2) AS preiculos,
CASE
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 1
THEN 'neonate'
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) <= 14
THEN 'middle'
-- all ages > 89 in the database were replaced with 300
WHEN ROUND((cast(ie.intime as date) - cast(pat.dob as date))/365.242, 2) > 100
THEN '>89'
ELSE 'adult'
END AS ICUSTAY_AGE_GROUP,
-- note that there is already a "hospital_expire_flag" field in the admissions table which you could use
CASE
WHEN adm.hospital_expire_flag = 1 then 'Y'
ELSE 'N'
END AS hospital_expire_flag,
-- note also that hospital_expire_flag is equivalent to "Is adm.deathtime not null?"
CASE
WHEN adm.deathtime BETWEEN ie.intime and ie.outtime
THEN 'Y'
-- sometimes there are typographical errors in the death date, so check before intime
WHEN adm.deathtime <= ie.intime
THEN 'Y'
WHEN adm.dischtime <= ie.outtime
AND adm.discharge_location = 'DEAD/EXPIRED'
THEN 'Y'
ELSE 'N'
END AS ICUSTAY_EXPIRE_FLAG
FROM icustays ie
INNER JOIN patients pat
ON ie.subject_id = pat.subject_id
INNER JOIN admissions adm
ON ie.hadm_id = adm.hadm_id;