nlp

josedoug

License: CC 4.0 BY-SA

Categories: Statistics, NLP

Article link: https://blog.youkuaiyun.com/josedoug/article/details/99244343

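This post discusses the difference between natural language processing (NLP) and speech recognition (SR), noting that NLP is more complex because it involves understanding. It reviews the historical development of NLP, including applications such as machine translation, question answering systems, and grammar checking, and discusses the challenges posed by non-parametric text data. Modern NLP research tends to avoid fully interpreting every word and focuses on syntax, semantics, context modelling, and response generation.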

(From Diary March 11, 2017)

Reading a paper while at a banquet tonight (Bates 1993).

The paper starts by drawing a distinction between speech recognition (SR) and NLP. SR is a well-defined problem, and the effectiveness of a method can be clearly assessed. NLP is hard. It is about understanding. It is not a well-defined problem, and it is hard to evaluate the effectiveness of any method.

It surveys the historical development of the field. Now I understand why AI articles tend to be very “non-parametric”. It is because textual data itself is non-parametric: it has too many degrees of freedom. Any model of it will internally carry a lot of degrees of freedom and thus can appear ad hoc and heuristic, but that really reflects the difficulty of finding a suitable parameterization of the text data itself.

Historically, NLP researchers have attempted machine translation, question answering (which uses a database to store all the answers, so the problem becomes one of interface design, i.e. querying), applications such as grammar and style checkers, and generative models.
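To make the "question answering becomes querying" point concrete for myself, here is a minimal sketch. It is not taken from the paper: the `capitals` table, the single question pattern, and the function names `build_db` and `answer` are all made up for illustration. The answers sit in a database, and the only "language" work is mapping the question onto a query.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory answer store (hypothetical example data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE capitals (country TEXT PRIMARY KEY, capital TEXT)")
    conn.executemany(
        "INSERT INTO capitals VALUES (?, ?)",
        [("france", "Paris"), ("japan", "Tokyo")],
    )
    return conn


def answer(conn, question):
    """Map one fixed question pattern onto a database lookup.

    Only 'What is the capital of X?' is recognised; anything else
    returns None. The pattern itself is an illustrative assumption.
    """
    prefix = "what is the capital of "
    q = question.strip().rstrip("?").lower()
    if not q.startswith(prefix):
        return None
    country = q[len(prefix):].strip()
    row = conn.execute(
        "SELECT capital FROM capitals WHERE country = ?", (country,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_db()
    print(answer(db, "What is the capital of France?"))
    print(answer(db, "Who wrote Hamlet?"))
```

The brittle part is the pattern matching, not the storage, which is roughly what "the problem becomes interface design" means here.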

The original approach was to always interpret every word deeply, which conceptually overfits. Newer methodologies let go of the temptation to interpret every word completely.

Broadly, the problem categories in NLP are syntactic processing, semantic processing, context modelling, and response generation.

The problems can be solved sequentially or independently using a joint probabilistic approach.
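One way I read the contrast between a sequential pipeline and a joint probabilistic treatment is sketched below. This is a toy of my own, not the paper's method: the probability tables, the labels, and the function names are invented for illustration. The sequential version commits to the best syntactic parse first and then picks a meaning given that parse. The joint version scores every (parse, meaning) pair by P(parse) * P(meaning | parse) and takes the best pair overall.

```python
# Toy scores; every number and label here is made up for illustration.
# P(parse | sentence)
P_PARSE = {"attach_to_verb": 0.6, "attach_to_noun": 0.4}

# P(meaning | parse)
P_MEANING = {
    "attach_to_verb": {"saw_using_telescope": 0.55, "saw_while_holding": 0.45},
    "attach_to_noun": {"man_has_telescope": 0.9, "man_near_telescope": 0.1},
}


def sequential_decode(p_parse, p_meaning):
    """Commit to the best parse first, then pick the best meaning given it."""
    parse = max(p_parse, key=p_parse.get)
    meaning = max(p_meaning[parse], key=p_meaning[parse].get)
    return parse, meaning, p_parse[parse] * p_meaning[parse][meaning]


def joint_decode(p_parse, p_meaning):
    """Score every (parse, meaning) pair jointly and return the best one."""
    candidates = [
        (parse, meaning, p_parse[parse] * p_m)
        for parse, meanings in p_meaning.items()
        for meaning, p_m in meanings.items()
    ]
    return max(candidates, key=lambda c: c[2])


if __name__ == "__main__":
    print("sequential:", sequential_decode(P_PARSE, P_MEANING))
    print("joint:     ", joint_decode(P_PARSE, P_MEANING))
```

With these made-up numbers the pipeline locks in the verb attachment because it looks best in isolation, while the joint score prefers the noun attachment because one of its meanings is much more probable. An early stage that commits too soon can rule out the best overall reading.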