
Text Representation
淘淘图兔兔呀
Supervised Term Weighting for Automated Text Categorization: 5. Conclusion
“We have proposed supervised term weighting (STW), a term weighting methodology specifically designed for IR applications involving supervised learning, such as text categorization and text filtering. Supervised term indexing leverages on the training data…
Original · 2020-12-31 10:40:44 · 168 views · 2 comments
Supervised Term Weighting for Automated Text Categorization: 4.3 Experimental setting
“In all the experiments discussed in this section, stop words have been removed using the stop list provided in [5, pages 117-118]. Punctuation has been removed, all letters have been converted to lowercase, numbers have been removed, and stemming has been…
Original · 2020-12-30 23:51:16 · 196 views · 0 comments
Supervised Term Weighting for Automated Text Categorization: 4.2 Learning method
“The learning method used for our experiments is a support vector machine (SVM) learner as implemented in the SVM-LIGHT package (version 3.5) [4]. SVMs attempt to learn a hyperplane in |T|-dimensional space that separates the positive training examples fro…
Original · 2020-12-30 21:52:35 · 144 views · 2 comments
Supervised Term Weighting for Automated Text Categorization: 2.2 Term selection
“Many classifier induction methods are computationally hard, and their computational cost is a function of the length of the vectors that represent the documents. It is thus of key importance to be able to work with vectors shorter than |T|, which is usual…
Original · 2020-12-30 20:08:46 · 109 views · 0 comments
Supervised Term Weighting for Automated Text Categorization: 2.1 Term weighting
“In text categorization and other applications at the crossroads of IR and ML, term weighting is usually tackled by means of methods borrowed from text search, i.e. methods that do not involve a learning phase. Many weighting methods have been developed wi…
Original · 2020-12-30 19:41:36 · 185 views · 0 comments
Supervised Term Weighting for Automated Text Categorization: 1. Introduction
“Text categorization (TC) is the activity of automatically building, by means of machine learning (ML) techniques, automatic text classifiers, i.e. programs capable of labelling natural language texts from a domain D with thematic categories from a predefi…
Original · 2020-12-30 11:38:16 · 166 views · 0 comments
Supervised Term Weighting for Automated Text Categorization: Abstract
“The construction of a text classifier usually involves (i) a phase of term selection, in which the most relevant terms for the classification task are identified, (ii) a phase of term weighting, in which document weights for the selected terms are compute…
Original · 2020-12-25 21:39:36 · 149 views · 0 comments
Supervised and Traditional Term Weighting Methods for Automatic Text Categorization: 1. Introduction
“Text categorization (TC) is the task of automatically classifying unlabelled natural language documents into a predefined set of semantic categories.”
Original · 2020-12-25 20:54:52 · 148 views · 0 comments
Supervised and Traditional Term Weighting Methods for Automatic Text Categorization: Abstract
“In vector space model (VSM), text representation is the task of transforming the content of a textual document into a vector in the term space so that the document could be recognized and classified by a computer or a classifier.”
Original · 2020-12-25 11:49:36 · 108 views · 0 comments
Several alternative term weighting methods for text representation and classification: 4. Experimental settings
“In this study, we use two public text classification datasets to validate the performance of our schemes, namely Reuters-21578 and 20 Newsgroups datasets [37]. Reuters-21578 dataset has 8 different categories including 5485 training texts and 2189 test te…
Original · 2020-12-23 23:50:10 · 176 views · 0 comments
Several alternative term weighting methods for text representation and classification: 3. Proposed unsupervised term weighting schemes
“It should be claimed that choose an appropriate metric function used for weighting terms is the key to obtain high-quality performance of TC [7]. Although the TF–IDF weighting scheme borrows from IR field, and it ignores the available category information…
Original · 2020-12-22 11:56:10 · 166 views · 0 comments
Several alternative term weighting methods for text representation and classification: 2. Analysis of current term weighting schemes
“There is no doubt that term weighting is essential for TC task [32], which measures the importance of a term (feature) in representing the content of a text [14,15].”
Original · 2020-12-21 17:48:30 · 168 views · 0 comments
Several alternative term weighting methods for text representation and classification: 1. Introduction
“Automatic text classification (TC) technology can efficiently organize and categorize text that increasing dramatically [1], thus it eliminates a large amount of human effort [2] and attracted a wide attention in recent years [3,4].”
Original · 2020-12-16 22:59:00 · 304 views · 0 comments
Several alternative term weighting methods for text representation and classification: Abstract
“Text representation is one kind of hot topics which support text classification (TC) tasks. It has a substantial impact on the performance of TC. Although the most famous TF-IDF is specially designed for information retrieval rather than TC tasks, it is h…
Original · 2020-12-16 22:31:04 · 177 views · 2 comments