CLASS1 — NLP INTRODUCTION SUMMARY
Applications of NLP
- Information Extraction
- Information Extraction & Sentiment Analysis
- Machine Translation
Three kinds of language technology
A. mostly solved
- spam detection
- part-of-speech (POS) tagging
- named entity recognition (NER)
B. making good progress
- sentiment analysis
- coreference resolution
- word sense disambiguation (WSD)
C. still really hard
- question answering (QA)
- paraphrase
- summarization
- dialog
What makes NLP hard?
Ambiguity, e.g. crash blossoms (newspaper headlines with an unintended second reading)
Why else is NL understanding difficult?
- non-standard English
- segmentation issue
- idioms
- neologisms
- world knowledge
- tricky entity names
What tools do we need?
- knowledge about language
- knowledge about the world
- a way to combine knowledge sources
How do we generally do this?
- probabilistic models built from language data
- rough text features can often do half the job.
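
As a rough illustration of the two points above, here is a minimal sketch of a probabilistic model built from language data using very rough text features (lowercased, whitespace-split words). It is a tiny naive Bayes spam detector with add-one smoothing. The function names (tokenize, train, predict) and the four training messages are invented for this example and do not come from the class; a real system would use far more data and better features.

```python
import math
from collections import Counter


def tokenize(text):
    # Rough text features: lowercase the text and split on whitespace.
    return text.lower().split()


def train(examples):
    # examples is a list of (text, label) pairs.
    # Collect per-label message counts, per-label word counts, and the vocabulary.
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    # Return the label with the highest log P(label) + sum of log P(word | label).
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in label_counts.items():
        score = math.log(n / total)
        # Add-one (Laplace) smoothing so unseen word/label pairs get nonzero probability.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented purely for illustration.
TOY_DATA = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting notes for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train(TOY_DATA)
print(predict(model, "claim your free money"))
print(predict(model, "team meeting on monday"))
```

The model knows nothing about grammar or meaning. It only counts which words co-occur with which label, which is the sense in which rough features can get part of the way on a task like spam detection.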
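This article introduces the application scenarios and technology categories of natural language processing (NLP), covers applications such as information extraction, sentiment analysis, and machine translation, and analyzes the difficulties NLP faces, such as ambiguity and non-standard English. It also discusses the language knowledge and world knowledge needed to address these problems.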