【WSD】Week 1: WSD's Motivation

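Word sense disambiguation (WSD) is a technique for determining which meaning a word carries in a particular context. As information technology has developed, unstructured information has grown exponentially, and the need to process large-scale data effectively keeps increasing. Traditional text mining and information retrieval techniques, however, show their limitations when faced with massive amounts of data. WSD can use automatic methods to identify relevant information in large volumes of text and to exclude irrelevant documents.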


  1. WSD's definition
    1. WSD: word sense disambiguation.
    2. Word sense disambiguation is the ability to computationally determine which sense of a word is activated by its use in a particular context.
  2. WSD's motivation
    1. The fast development of IT has led to the exponential growth of unstructured information. As a result, there is an increasing urge to treat this mass of information by means of automatic methods.
    2. However, traditional techniques for text mining and information retrieval show their limits when they are applied to such huge collections of data. In fact, these approaches, mostly based on lexicosyntactic analysis of text, do not go beyond the surface appearance of words and, consequently, fail in identifying relevant information formulated with different wordings and in discarding documents which are not pertinent to the user's needs.
    3. WSD can potentially provide a major breakthrough in the treatment of large amounts of data, precisely where those traditional techniques reach their limits.
  3. Why is WSD difficult?
    WSD has been described as an AI-complete problem, that is, a task whose solution is at least as hard as the most difficult problems in artificial intelligence. Its acknowledged difficulty does not originate from a single cause, but rather from a variety of factors.
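    1. First, the task lends itself to different formalizations because of fundamental open questions, such as how word senses should be represented and how fine-grained the sense inventory should be.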
    2. Second, WSD heavily relies on knowledge.
      • In fact, the skeletal procedure of any WSD system can be summarized as follows: given a set of words (e.g., a sentence or a bag of words), a technique is applied which makes use of one or more sources of knowledge to associate the most appropriate senses with words in context.
  4. WSD's task
    1. Mapping: identify a mapping from words to senses.
      • If we disregard the punctuation, we can view a text T as a sequence of words (w_1, w_2, ..., w_n), and we can formally describe WSD as the task of assigning the appropriate sense(s) to all or some of the words in T, that is, to identify a mapping A from words to senses, such that A(i) ⊆ Senses_D(w_i), where Senses_D(w_i) is the set of senses encoded in a dictionary D for word w_i, and A(i) is that subset of the senses of w_i which are appropriate in the context T. The mapping A can assign more than one sense to each word w_i ∈ T, although typically only the most appropriate sense is selected, that is, |A(i)| = 1.
    2. Classification
      • Word senses are the classes, and an automatic classification method is used to assign each occurrence of a word to one or more classes based on the evidence from the context and from external knowledge sources.
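
To make the two views above concrete, here is a minimal Python sketch of the mapping A. It is not a method described in these notes: it swaps in a simplified Lesk-style gloss overlap as the "technique", and the sense inventory `TOY_DICTIONARY`, the sense ids, the stopword list, and the function names are all hypothetical, chosen only for illustration. Indices are 0-based here, whereas the text counts words from 1.

```python
from typing import Dict, List, Set

# Hypothetical toy dictionary D: word -> {sense id: gloss}.
# A real system would use a resource such as a machine-readable dictionary.
TOY_DICTIONARY: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank#finance": "an institution that accepts deposits and lends money",
        "bank#river": "sloping land beside a river or other body of water",
    },
}

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "that", "beside", "other"}


def content_words(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and drop stopwords."""
    return {t for t in text.lower().split() if t not in STOPWORDS}


def senses(word: str) -> Set[str]:
    """Senses_D(w): the set of sense ids the dictionary encodes for a word."""
    return set(TOY_DICTIONARY.get(word.lower(), {}))


def disambiguate(words: List[str]) -> Dict[int, Set[str]]:
    """Return A: position i -> A(i), a subset of Senses_D(w_i) with |A(i)| = 1.

    Each sense is treated as a class; the class whose gloss shares the most
    content words with the context wins. Words missing from the dictionary
    are left unassigned.
    """
    context = content_words(" ".join(words))
    mapping: Dict[int, Set[str]] = {}
    for i, word in enumerate(words):
        glosses = TOY_DICTIONARY.get(word.lower())
        if not glosses:
            continue
        others = context - {word.lower()}  # evidence from the rest of the text
        # sorted() makes tie-breaking deterministic (first sense id wins).
        best = max(sorted(glosses), key=lambda s: len(content_words(glosses[s]) & others))
        mapping[i] = {best}
    return mapping


print(disambiguate("she sat on the bank of the river".split()))
# {4: {'bank#river'}}
print(disambiguate("he deposits money in the bank".split()))
# {5: {'bank#finance'}}
```

The dictionary plays the role of the knowledge source, and picking the best-scoring sense per occurrence is the classification step; returning a set for each position keeps the door open for assigning more than one sense.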

 
