Understanding "Enriching Knowledge Bases with Counting Quantifiers"


License: CC 4.0 BY-SA

Tags: #knowledge graph

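This post introduces CINEX, the first full-fledged system for extracting counting information from text. CINEX addresses three main challenges: incomplete knowledge base seeds, sparse and skewed observations in text sources, and the high diversity of linguistic patterns. It works in two stages, CQ recognition and CQ consolidation: it trains two models on seeds from WIKIDATA to generate counting quantifier candidates, then consolidates the tokens that express counting or compositionality information.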


Paramita Mirza, et al. ISWC 2018.
I am not sure how some of the terms should be translated, so I keep them in English for now.

Counting quantifiers play an important role in question answering and knowledge base curation, but prior work has neglected them. This paper develops CINEX, the first full-fledged system for extracting counting information from text.

CINEX successfully deals with three challenges:

non-maximal training seeds due to the incompleteness of knowledge bases;
sparse and skewed observations in text sources;
high diversity of linguistic patterns.

The CINEX architecture is shown in Figure 1. CINEX consists of two stages: CQ Recognition and CQ Consolidation. First, CINEX uses seeds from WIKIDATA to train two different models that generate CQ candidates: CRF++ with n-gram features and a bidirectional LSTM-CRF. Then CINEX consolidates the tokens expressing counting or compositionality information into a single prediction, using mention consolidation with confidence scores and count-zero handling.
Figure 1. Overview of CINEX system.
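
To make the consolidation stage more concrete, here is a minimal Python sketch of how several CQ candidates for one entity and relation could be merged into a single count. It is only an illustration under my own assumptions, not the paper's actual procedure: the `Mention` class, the `group` field, the rule of taking the minimum confidence within a composition, and the `threshold` value are all hypothetical. Compositional parts (for example "two sons and a daughter") are summed, a count-zero mention is simply a candidate with value 0, and the candidate with the highest confidence wins if it clears the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Mention:
    value: int         # integer count expressed by the mention (0 for "no children")
    confidence: float  # tagger confidence in [0, 1]
    group: int         # mentions sharing a group are parts of one composition


def consolidate(mentions: List[Mention], threshold: float = 0.5) -> Optional[int]:
    """Merge candidate mentions into one predicted count (hypothetical rule)."""
    groups: Dict[int, Tuple[int, float]] = {}
    for m in mentions:
        total, conf = groups.get(m.group, (0, 1.0))
        # Sum the parts of a composition; assume the composition is only as
        # reliable as its least confident part.
        groups[m.group] = (total + m.value, min(conf, m.confidence))
    if not groups:
        return None  # no candidate at all
    total, conf = max(groups.values(), key=lambda g: g[1])
    # Abstain when even the best candidate is below the threshold.
    return total if conf >= threshold else None


# "two sons and a daughter" (group 0) versus a weaker mention "five" (group 1)
candidates = [Mention(2, 0.9, 0), Mention(1, 0.8, 0), Mention(5, 0.6, 1)]
print(consolidate(candidates))  # → 3
```

Returning `None` instead of a number lets the caller distinguish "no reliable count found" from a predicted count of zero.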
