MICMIC-IV 查询策略(实际查询SQL例子注释)
说明通过注释学习MIMIC官方的查询语句,总结MIMIC数据库的结构和查询特点。
内容:SQL语言和mimic-iv数据库的结构
目的:通过学习mimic数据库的数据格式, 方便查询mimic数据库; 学习mimic数据库的组织形式,也能够建立并查询自己的数据库
概念和特点:
本文的许多概念和对mimic数据库的认识来源于:[1] Wang S , Mcdermott M , Chauhan G , et al. MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III[J]. 2019. 并参考了通用的CDISC的相关知识。
- 静态数据, 没有时间因素的影响,多是分类变量,比如,年龄,性别等。
- 静态表格,整个表格的数据是一个模式,,为静态,通过表格连接汇总成横向的单个表格。
- 动态数据, 多条记录, 通常是随时间的不同而变化。
- 动态数据表格,每个动态表格中保存的数据都是一类, 都有多条记录,规律:其中有2-3列是“核心列”,一般是测试的名称(itemid)和测试的值(valuenum和value)是查询的主体;其余的列是辅助列,包括时间,flag等 辅助因素。
**特殊的提取策略:**比如年龄的提取,有专门的公式来进行换算,查询策略也特殊。
例:
-- The columns of the table patients: anchor_age, anchor_year, anchor_year_group
-- provide information regarding the actual patient year for the patient admission,
-- and the patient's age at that time.
-- anchor_year is a shifted year for the patient.
-- anchor_year_group is a range of years - the patient's anchor_year occurred during this range.
-- anchor_age is the patient's age in the anchor_year.
-- Example: a patient has an anchor_year of 2153,
-- anchor_year_group of 2008 - 2010, and an anchor_age of 60.
-- The year 2153 for the patient corresponds to 2008, 2009, or 2010.
-- The patient was 60 in the shifted year of 2153, i.e. they were 60 in 2008, 2009, or 2010.
-- A patient admission in 2154 will occur in 2009-2011,
-- an admission in 2155 will occur in 2010-2012, and so on.
-- Therefore, the age of a patient = hospital admission time - anchor_year + anchor_age
SELECT
ad.subject_id
, ad.hadm_id
, ad.admittime
, pa.anchor_age
, pa.anchor_year
, DATETIME_DIFF(ad.admittime, DATETIME(pa.anchor_year, 1, 1, 0, 0,0),'YEAR') + pa.anchor_age AS age
FROM mimic_core.admissions ad
INNER JOIN mimic_core.patients pa
ON ad.subject_id = pa.subject_id
;
一般的提取策略:
静态数据的提取:
1.一般策略:按照文献的策略提取并存入到表格; 一般查询,使用加减函数(比如出院时间减去入院时间,获得住院时长等)
2. 静态数据表格有(方便查询): mimic_core.admissions(缩写ad);mimic_core.patients(pa);mimic_core.transfers(); mimic_hosp.diagnoses_icd (diag)
附例:静态查询语句实例+标注
SELECT
ad.subject_id
, ad.hadm_id
, ad.admittime
, pa.anchor_age
, pa.anchor_year
, pa.gender
, ad.ethnicity--种族
FROM mimic_core.admissions ad
INNER JOIN mimic_core.patients pa--内部表连接
ON ad.subject_id = pa.subject_id--表链接条件
limit 10;
评论:1.查询命令相对简单, 不涉及复杂的换算;
动态表格数据的提取:
- 动态表格有哪些?mimic_icu.chartevents(缩写ce); mimic_hosp.labevents(缩写le);
基本策略:
step 1.多条数据变为单条数据,通过求平均值,首次值或末次值,中位数,频数等(使用聚合函数);
step 2.纵向数据变横向数据。
step 3.连接表格。
多次测量的数据有分为多个具体情形:
- 心率等生命体征和实验室检查, 求平均值比较合适;
- 频次
附:随时间变化的变量的SQL查询典型语句学习(原始语句来源于micmic官方公布的github中的语句, 略微改动):
-- create a table which has fuzzy boundaries on hospital admission
-- involves first creating a lag/lead version of disch/admit time
-- get first/last heart rate measurement during hospitalization for each stay_id(查询目的)
WITH t1 AS --with的功能是建立临时表格,可以多次引用
(
select ce.stay_id
, min(charttime) as intime_hr--获得首次和末次的测量的时间
, max(charttime) as outtime_hr
, avg(valuenum) as heart_rate
FROM mimic_icu.chartevents ce
-- only look at heart rate
where ce.itemid = 220045 --代码代表的是心率的测量(代码的含义在mimic_icu.d_items表中查询)
group by ce.stay_id--没有分号
)
-- add in subject_id/hadm_id, 以下是查询的主体,主要目的是连接表格,所以可以建立多个临时表格然后进行连接
select
ie.subject_id, ie.hadm_id, ie.stay_id
, t1.intime_hr
, t1.outtime_hr
,t1.heart_rate
FROM mimic_icu.icustays ie
left join t1 --引用了上面建立的临时表的内容
on ie.stay_id = t1.stay_id;
例2 提取CRRT治疗的相关参数,代表了治疗相关的参数查询
with crrt_settings as--临时表格,CRRT-连续肾脏替代治疗
(
select ce.stay_id, ce.charttime
, CASE WHEN ce.itemid = 227290 THEN ce.value END AS CRRT_mode
, CASE WHEN ce.itemid = 224149 THEN ce.valuenum ELSE NULL END AS AccessPressure--纵向表格变横向表格的策略, 或许可以先查询到纵向的表格,然后到Python中再进行处理(不便于SQL中结合其它的表格?)
, CASE WHEN ce.itemid = 224144 THEN ce.valuenum ELSE NULL END AS BloodFlow -- (ml/min)
, CASE WHEN ce.itemid = 228004 THEN ce.valuenum ELSE NULL END AS Citrate -- (ACD-A)
, CASE WHEN ce.itemid = 225183 THEN ce.valuenum ELSE NULL END AS CurrentGoal
, CASE WHEN ce.itemid = 225977 THEN ce.value ELSE NULL END AS DialysateFluid
, CASE WHEN ce.itemid = 224154 THEN ce.valuenum ELSE NULL END AS DialysateRate
, CASE WHEN ce.itemid = 224151 THEN ce.valuenum ELSE NULL END AS EffluentPressure
, CASE WHEN ce.itemid = 224150 THEN ce.valuenum ELSE NULL END AS FilterPressure
, CASE WHEN ce.itemid = 225958 THEN ce.value ELSE NULL END AS HeparinConcentration -- (units/mL)
, CASE WHEN ce.itemid = 224145 THEN ce.valuenum ELSE NULL END AS HeparinDose -- (per hour)
-- below may not account for drug infusion/hyperalimentation/anticoagulants infused
, CASE WHEN ce.itemid = 224191 THEN ce.valuenum ELSE NULL END AS HourlyPatientFluidRemoval
, CASE WHEN ce.itemid = 228005 THEN ce.valuenum ELSE NULL END AS PrefilterReplacementRate
, CASE WHEN ce.itemid = 228006 THEN ce.valuenum ELSE NULL END AS PostFilterReplacementRate
, CASE WHEN ce.itemid = 225976 THEN ce.value ELSE NULL END AS ReplacementFluid
, CASE WHEN ce.itemid = 224153 THEN ce.valuenum ELSE NULL END AS ReplacementRate
, CASE WHEN ce.itemid = 224152 THEN ce.valuenum ELSE NULL END AS ReturnPressure
, CASE WHEN ce.itemid = 226457 THEN ce.valuenum END AS UltrafiltrateOutput
-- separate system integrity into sub components
-- need to do this as 224146 has multiple unique values for a single charttime
-- e.g. "Clots Present" and "Active" at same time
, CASE
WHEN ce.itemid = 224146
AND ce.value IN ('Active', 'Initiated', 'Reinitiated', 'New Filter')
THEN 1
WHEN ce.itemid = 224146
AND ce.value IN ('Recirculating', 'Discontinued')
THEN 0
ELSE NULL END as system_active
, CASE
WHEN ce.itemid = 224146
AND ce.value IN ('Clots Present', 'Clots Present')
THEN 1
WHEN ce.itemid = 224146
AND ce.value IN ('No Clot Present', 'No Clot Present')
THEN 0
ELSE NULL END as clots
, CASE
WHEN ce.itemid = 224146
AND ce.value IN ('Clots Increasing', 'Clot Increasing')
THEN 1
ELSE NULL END as clots_increasing
, CASE
WHEN ce.itemid = 224146
AND ce.value IN ('Clotted')
THEN 1
ELSE NULL END as clotted
from mimic_icu.chartevents ce
where ce.itemid in
(
-- MetaVision ITEMIDs
227290, -- CRRT Mode
224146, -- System Integrity
-- 225956, -- Reason for CRRT Filter Change
-- above itemid is one of: Clotted, Line Changed, Procedure
-- only ~200 rows, not super useful
224149, -- Access Pressure
224144, -- Blood Flow (ml/min)
228004, -- Citrate (ACD-A)
225183, -- Current Goal
225977, -- Dialysate Fluid
224154, -- Dialysate Rate
224151, -- Effluent Pressure
224150, -- Filter Pressure
225958, -- Heparin Concentration (units/mL)
224145, -- Heparin Dose (per hour)
224191, -- Hourly Patient Fluid Removal
228005, -- PBP (Prefilter) Replacement Rate
228006, -- Post Filter Replacement Rate
225976, -- Replacement Fluid
224153, -- Replacement Rate
224152, -- Return Pressure
226457 -- Ultrafiltrate Output
)
and ce.value is not null
)
-- use MAX() to collapse to a single row 多条变一条策略之一
-- there is only ever 1 row for unique combinations of stay_id/charttime/itemid
--临时表结束,主查询开始
select stay_id
, charttime
, MAX(crrt_mode) AS crrt_mode
, MAX(AccessPressure) AS access_pressure
, MAX(BloodFlow) AS blood_flow
, MAX(Citrate) AS citrate--柠檬酸盐
, MAX(CurrentGoal) AS current_goal
, MAX(DialysateFluid) AS dialysate_fluid--透析
, MAX(DialysateRate) AS dialysate_rate
, MAX(EffluentPressure) AS effluent_pressure
, MAX(FilterPressure) AS filter_pressure
, MAX(HeparinConcentration) AS heparin_concentration
, MAX(HeparinDose) AS heparin_dose
, MAX(HourlyPatientFluidRemoval) AS hourly_patient_fluid_removal
, MAX(PrefilterReplacementRate) AS prefilter_replacement_rate
, MAX(PostFilterReplacementRate) AS postfilter_replacement_rate
, MAX(ReplacementFluid) AS replacement_fluid
, MAX(ReplacementRate) AS replacement_rate
, MAX(ReturnPressure) AS return_pressure
, MAX(UltrafiltrateOutput) AS ultrafiltrate_output
, MAX(system_active) AS system_active
, MAX(clots) AS clots
, MAX(clots_increasing) AS clots_increasing
, MAX(clotted) AS clotted
from crrt_settings--引用临时表格
group by stay_id, charttime
例3 炎症指标的查询,代表了实验室指标的查询过程
-- THIS SCRIPT IS AUTOMATICALLY GENERATED. DO NOT EDIT IT DIRECTLY.
--DROP TABLE IF EXISTS inflammation; CREATE TABLE inflammation AS
SELECT
MAX(subject_id) AS subject_id
, MAX(hadm_id) AS hadm_id
, MAX(charttime) AS charttime
, le.specimen_id
-- convert from itemid into a meaningful column
, MAX(CASE WHEN itemid = 50889 THEN valuenum ELSE NULL END) AS crp--C反应蛋白
-- , CAST(NULL AS NUMERIC) AS il6
-- , CAST(NULL AS NUMERIC) AS procalcitonin
FROM mimic_hosp.labevents le--实验室检查指标存储的表格之一,还有另外的存储地方(注意!)
WHERE le.itemid IN
(
50889 -- crp
-- 51652 -- high sensitivity CRP
)
AND valuenum IS NOT NULL--where语句的联合条件
-- lab values cannot be 0 and cannot be negative
AND valuenum > 0
GROUP BY le.specimen_id;
例:这是一个比较综合的例子, 查询charlson 合并症的相关资料, 通过ICD-9和10限定相关的疾病。
-- THIS SCRIPT IS AUTOMATICALLY GENERATED. DO NOT EDIT IT DIRECTLY.
--DROP TABLE IF EXISTS charlson; CREATE TABLE charlson AS
-- ------------------------------------------------------------------
-- This query extracts Charlson Comorbidity Index (CCI) based on the recorded ICD-9 and ICD-10 codes.
--
-- Reference for CCI:
-- (1) Charlson ME, Pompei P, Ales KL, MacKenzie CR. (1987) A new method of classifying prognostic
-- comorbidity in longitudinal studies: development and validation.J Chronic Dis; 40(5):373-83.
--
-- (2) Charlson M, Szatrowski TP, Peterson J, Gold J. (1994) Validation of a combined comorbidity
-- index. J Clin Epidemiol; 47(11):1245-51.
--
-- Reference for ICD-9-CM and ICD-10 Coding Algorithms for Charlson Comorbidities:
-- (3) Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining Comorbidities in ICD-9-CM
-- and ICD-10 administrative data. Med Care. 2005 Nov; 43(11): 1130-9.
-- ------------------------------------------------------------------
WITH diag AS--建立临时表格, 诊断信息
(
SELECT
hadm_id--医院的ID
, CASE WHEN icd_version = 9 THEN icd_code ELSE NULL END AS icd9_code
, CASE WHEN icd_version = 10 THEN icd_code ELSE NULL END AS icd10_code
FROM mimic_hosp.diagnoses_icd diag
)
, com AS--另外一个临时表格,基于上个临时表;
(
SELECT
ad.hadm_id
-- Myocardial infarction
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('410','412')--substr(string,start,length)函数格式
OR
SUBSTR(icd10_code, 1, 3) IN ('I21','I22')--从第一个字符开始,匹配前3个,code的解释见d_icd_dianosis表格
OR
SUBSTR(icd10_code, 1, 4) = 'I252'
THEN 1
ELSE 0 END) AS myocardial_infarct--心肌梗死
-- Congestive heart failure
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) = '428'
OR
SUBSTR(icd9_code, 1, 5) IN ('39891','40201','40211','40291','40401','40403',
'40411','40413','40491','40493')
OR
SUBSTR(icd9_code, 1, 4) BETWEEN '4254' AND '4259'
OR
SUBSTR(icd10_code, 1, 3) IN ('I43','I50')
OR
SUBSTR(icd10_code, 1, 4) IN ('I099','I110','I130','I132','I255','I420',
'I425','I426','I427','I428','I429','P290')
THEN 1
ELSE 0 END) AS congestive_heart_failure--充血性心力衰竭
-- Peripheral vascular disease,
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('440','441')
OR
SUBSTR(icd9_code, 1, 4) IN ('0930','4373','4471','5571','5579','V434')
OR
SUBSTR(icd9_code, 1, 4) BETWEEN '4431' AND '4439'
OR
SUBSTR(icd10_code, 1, 3) IN ('I70','I71')
OR
SUBSTR(icd10_code, 1, 4) IN ('I731','I738','I739','I771','I790',
'I792','K551','K558','K559','Z958','Z959')
THEN 1
ELSE 0 END) AS peripheral_vascular_disease--周围性血管疾病
-- Cerebrovascular disease
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) BETWEEN '430' AND '438'
OR
SUBSTR(icd9_code, 1, 5) = '36234'
OR
SUBSTR(icd10_code, 1, 3) IN ('G45','G46')
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'I60' AND 'I69'
OR
SUBSTR(icd10_code, 1, 4) = 'H340'
THEN 1
ELSE 0 END) AS cerebrovascular_disease--脑血管疾病
-- Dementia
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) = '290'
OR
SUBSTR(icd9_code, 1, 4) IN ('2941','3312')
OR
SUBSTR(icd10_code, 1, 3) IN ('F00','F01','F02','F03','G30')
OR
SUBSTR(icd10_code, 1, 4) IN ('F051','G311')
THEN 1
ELSE 0 END) AS dementia
-- Chronic pulmonary disease,慢性肺疾病
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) BETWEEN '490' AND '505'
OR
SUBSTR(icd9_code, 1, 4) IN ('4168','4169','5064','5081','5088')
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'J40' AND 'J47'
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'J60' AND 'J67'
OR
SUBSTR(icd10_code, 1, 4) IN ('I278','I279','J684','J701','J703')
THEN 1
ELSE 0 END) AS chronic_pulmonary_disease
-- Rheumatic disease
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) = '725'
OR
SUBSTR(icd9_code, 1, 4) IN ('4465','7100','7101','7102','7103',
'7104','7140','7141','7142','7148')
OR
SUBSTR(icd10_code, 1, 3) IN ('M05','M06','M32','M33','M34')
OR
SUBSTR(icd10_code, 1, 4) IN ('M315','M351','M353','M360')
THEN 1
ELSE 0 END) AS rheumatic_disease--风湿性疾病
-- Peptic ulcer disease
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('531','532','533','534')
OR
SUBSTR(icd10_code, 1, 3) IN ('K25','K26','K27','K28')
THEN 1
ELSE 0 END) AS peptic_ulcer_disease--消化性溃疡
-- Mild liver disease
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('570','571')
OR
SUBSTR(icd9_code, 1, 4) IN ('0706','0709','5733','5734','5738','5739','V427')
OR
SUBSTR(icd9_code, 1, 5) IN ('07022','07023','07032','07033','07044','07054')
OR
SUBSTR(icd10_code, 1, 3) IN ('B18','K73','K74')
OR
SUBSTR(icd10_code, 1, 4) IN ('K700','K701','K702','K703','K709','K713',
'K714','K715','K717','K760','K762',
'K763','K764','K768','K769','Z944')
THEN 1
ELSE 0 END) AS mild_liver_disease--轻微的肝脏疾病
-- Diabetes without chronic complication
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 4) IN ('2500','2501','2502','2503','2508','2509')
OR
SUBSTR(icd10_code, 1, 4) IN ('E100','E10l','E106','E108','E109','E110','E111',
'E116','E118','E119','E120','E121','E126','E128',
'E129','E130','E131','E136','E138','E139','E140',
'E141','E146','E148','E149')
THEN 1
ELSE 0 END) AS diabetes_without_cc--无慢性并发症的糖尿病
-- Diabetes with chronic complication
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 4) IN ('2504','2505','2506','2507')
OR
SUBSTR(icd10_code, 1, 4) IN ('E102','E103','E104','E105','E107','E112','E113',
'E114','E115','E117','E122','E123','E124','E125',
'E127','E132','E133','E134','E135','E137','E142',
'E143','E144','E145','E147')
THEN 1
ELSE 0 END) AS diabetes_with_cc--有慢性并发症的糖尿病
-- Hemiplegia or paraplegia
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('342','343')
OR
SUBSTR(icd9_code, 1, 4) IN ('3341','3440','3441','3442',
'3443','3444','3445','3446','3449')
OR
SUBSTR(icd10_code, 1, 3) IN ('G81','G82')
OR
SUBSTR(icd10_code, 1, 4) IN ('G041','G114','G801','G802','G830',
'G831','G832','G833','G834','G839')
THEN 1
ELSE 0 END) AS paraplegia--截瘫
-- Renal disease
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('582','585','586','V56')
OR
SUBSTR(icd9_code, 1, 4) IN ('5880','V420','V451')
OR
SUBSTR(icd9_code, 1, 4) BETWEEN '5830' AND '5837'
OR
SUBSTR(icd9_code, 1, 5) IN ('40301','40311','40391','40402','40403','40412','40413','40492','40493')
OR
SUBSTR(icd10_code, 1, 3) IN ('N18','N19')
OR
SUBSTR(icd10_code, 1, 4) IN ('I120','I131','N032','N033','N034',
'N035','N036','N037','N052','N053',
'N054','N055','N056','N057','N250',
'Z490','Z491','Z492','Z940','Z992')
THEN 1
ELSE 0 END) AS renal_disease--肾脏疾病
-- Any malignancy, including lymphoma and leukemia, except malignant neoplasm of skin
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) BETWEEN '140' AND '172'
OR
SUBSTR(icd9_code, 1, 4) BETWEEN '1740' AND '1958'
OR
SUBSTR(icd9_code, 1, 3) BETWEEN '200' AND '208'
OR
SUBSTR(icd9_code, 1, 4) = '2386'
OR
SUBSTR(icd10_code, 1, 3) IN ('C43','C88')
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'C00' AND 'C26'
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'C30' AND 'C34'
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'C37' AND 'C41'
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'C45' AND 'C58'
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'C60' AND 'C76'
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'C81' AND 'C85'
OR
SUBSTR(icd10_code, 1, 3) BETWEEN 'C90' AND 'C97'
THEN 1
ELSE 0 END) AS malignant_cancer--恶性肿瘤,包括淋巴瘤和白血病,不包括恶性皮肤肿瘤
-- Moderate or severe liver disease
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 4) IN ('4560','4561','4562')
OR
SUBSTR(icd9_code, 1, 4) BETWEEN '5722' AND '5728'
OR
SUBSTR(icd10_code, 1, 4) IN ('I850','I859','I864','I982','K704','K711',
'K721','K729','K765','K766','K767')
THEN 1
ELSE 0 END) AS severe_liver_disease--中重度肝疾病
-- Metastatic solid tumor
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('196','197','198','199')
OR
SUBSTR(icd10_code, 1, 3) IN ('C77','C78','C79','C80')
THEN 1
ELSE 0 END) AS metastatic_solid_tumor--转移性实体瘤
-- AIDS/HIV
, MAX(CASE WHEN
SUBSTR(icd9_code, 1, 3) IN ('042','043','044')
OR
SUBSTR(icd10_code, 1, 3) IN ('B20','B21','B22','B24')
THEN 1
ELSE 0 END) AS aids
FROM mimic_core.admissions ad
LEFT JOIN diag
ON ad.hadm_id = diag.hadm_id
GROUP BY ad.hadm_id
)
, ag AS
(
SELECT
hadm_id
, age
, CASE WHEN age <= 40 THEN 0
WHEN age <= 50 THEN 1
WHEN age <= 60 THEN 2
WHEN age <= 70 THEN 3
ELSE 4 END AS age_score
FROM mimic_derived.age
)
--以下是主查询
SELECT
ad.subject_id
, ad.hadm_id
, ag.age_score
, myocardial_infarct
, congestive_heart_failure
, peripheral_vascular_disease
, cerebrovascular_disease
, dementia
, chronic_pulmonary_disease
, rheumatic_disease
, peptic_ulcer_disease
, mild_liver_disease
, diabetes_without_cc
, diabetes_with_cc
, paraplegia
, renal_disease
, malignant_cancer
, severe_liver_disease
, metastatic_solid_tumor
, aids
-- Calculate the Charlson Comorbidity Score using the original
-- weights from Charlson, 1987.
, age_score
+ myocardial_infarct + congestive_heart_failure + peripheral_vascular_disease
+ cerebrovascular_disease + dementia + chronic_pulmonary_disease
+ rheumatic_disease + peptic_ulcer_disease
+ GREATEST(mild_liver_disease, 3*severe_liver_disease)
+ GREATEST(2*diabetes_with_cc, diabetes_without_cc)
+ GREATEST(2*malignant_cancer, 6*metastatic_solid_tumor)
+ 2*paraplegia + 2*renal_disease
+ 6*aids
AS charlson_comorbidity_index
FROM mimic_core.admissions ad
LEFT JOIN com--引用临时表
ON ad.hadm_id = com.hadm_id
LEFT JOIN ag--引用临时表
ON com.hadm_id = ag.hadm_id
;
例: ICU患者中多种方案结合时间查询实验室检查(前面有个例子,是从医院中查询实验室检查)
-- THIS SCRIPT IS AUTOMATICALLY GENERATED. DO NOT EDIT IT DIRECTLY.
-- DROP TABLE IF EXISTS kdigo_creatinine; CREATE TABLE kdigo_creatinine AS
-- Extract all creatinine values from labevents around patient's ICU stay(查询目的)
WITH cr AS--建立临时表,肌酐的值
(
SELECT
ie.hadm_id
, ie.stay_id
, le.charttime
, AVG(le.valuenum) AS creat--求平均值,
FROM mimic_icu.icustays ie
LEFT JOIN mimic_hosp.labevents le
ON ie.subject_id = le.subject_id
AND le.ITEMID = 50912--多个条件,可以在ON 语句中进行添加
AND le.VALUENUM IS NOT NULL
AND le.VALUENUM <= 150
AND le.CHARTTIME BETWEEN DATETIME_SUB(ie.intime, INTERVAL '7' DAY) AND ie.outtime
GROUP BY ie.hadm_id, ie.stay_id, le.charttime--按照时间分组
)
, cr48 AS--在上个临时表的基础上, 建立临时表,操作集中在不同时间
(
-- add in the lowest value in the previous 48 hours(目的)
SELECT
cr.stay_id
, cr.charttime
, MIN(cr48.creat) AS creat_low_past_48hr
FROM cr--引用上个临时表
-- add in all creatinine values in the last 48 hours
LEFT JOIN cr cr48
ON cr.stay_id = cr48.stay_id
AND cr48.charttime < cr.charttime
AND cr48.charttime >= DATETIME_SUB(cr.charttime, INTERVAL '48' HOUR)
GROUP BY cr.stay_id, cr.charttime
)
, cr7 AS
(
-- add in the lowest value in the previous 7 days
SELECT
cr.stay_id
, cr.charttime
, MIN(cr7.creat) AS creat_low_past_7day
FROM cr
-- add in all creatinine values in the last 7 days
LEFT JOIN cr cr7
ON cr.stay_id = cr7.stay_id
AND cr7.charttime < cr.charttime
AND cr7.charttime >= DATETIME_SUB(cr.charttime, INTERVAL '7' DAY)
GROUP BY cr.stay_id, cr.charttime
)
--以下是主查询
SELECT
cr.hadm_id
, cr.stay_id
, cr.charttime
, cr.creat
, cr48.creat_low_past_48hr
, cr7.creat_low_past_7day
FROM cr
LEFT JOIN cr48
ON cr.stay_id = cr48.stay_id
AND cr.charttime = cr48.charttime
LEFT JOIN cr7
ON cr.stay_id = cr7.stay_id
AND cr.charttime = cr7.charttime
;