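Mining Multi-Level Diagnostic Rules: From Databases to the Simulation of Expert Decision-Making
1. Introduction
In data mining, extracted rules are often difficult for domain experts to interpret. Rules induced by conventional rule induction methods have short description lengths and do not plausibly represent an expert's decision process. For example, on a database for the differential diagnosis of headache, a conventional method induces the following rule for muscle contraction headache: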
[
[\text{location} = \text{whole}] \land [\text{Jolt Headache} = \text{no}] \land [\text{Tenderness of M1} = \text{yes}] \to \text{muscle contraction headache}
]
The rule given by medical experts, by contrast, is longer:
[
\begin{align }
&[\text{Jolt Headache} = \text{no}] \\
&\land ([\text{Tenderness of M0} = \text{yes}] \lor [\text{Tenderness of M1} = \text{yes}] \lor [\text{Tenderness of M2} = \text{yes}]) \\
&\land [\text{Tenderness of B1} = \text{no}] \land [\text{Tenderness of B2} = \text{no}] \land [\text{Tenderness of B3} = \text{no}] \\
&\land [\text{Tenderness of C1} = \text{no}] \land [\text{Tenderness of C