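This paper was posted on arXiv in 2017. It briefly surveys the application of NLP techniques to medical information extraction along three dimensions: named entity recognition and fact extraction, semantic relation extraction, and event extraction. It also summarizes the difficulties that synonyms, polysemy, and noisy text in medical data pose for NLP. For each dimension, it describes the methods applied to biomedical information extraction, from traditional machine learning to current deep learning techniques. In my opinion the paper is only average and the survey is not very comprehensive, but some of the references it cites are worth a look: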
(1) Lishuang Li, Liuke Jin, and Degen Huang. 2015. Exploring recurrent neural networks to detect named entities from biomedical text. In Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data.
(2) Lishuang Li, Jieqiong Zheng, Jia Wan, Degen Huang, and Xiaohui Lin. 2016. Biomedical event extraction via long short term memory networks along dynamic extended tree. In Bioinformatics and Biomedicine (BIBM), 2016 IEEE International Conference on. IEEE.
(3) Zhenchao Jiang, Lishuang Li, and Degen Huang. 2016. A general protein-protein interaction extraction architecture based on word representation and feature selection. International Journal of Data Mining and Bioinformatics 14(3):276–291.
(4) Daojian Zeng, Kang Liu, Yubo Chen, and Jun Zhao. 2015. Distant supervision for relation extraction via piecewise convolutional neural networks. In EMNLP, pages 1753–1762.