Thinking about the paper "A Refinement Approach to Handling Model Misfit in Text Categorization"

    To me, the most interesting question raised by this paper is: why is there no overfitting? Overfitting seems inevitable when you train too hard on the training examples, because the decision line (surface) ends up fitting properties owned exclusively by those examples, and generality is lost. In this paper, the authors improve classifier performance by applying a refinement to the classifier. (Section 7.5 of Simon Haykin's book "Neural Networks: A Comprehensive Foundation" shows a similar scheme, boosting by filtering, but note the difference: the method in this paper does not vote, so it is not boosting.) Viewed in terms of the decision line, whatever the refinement is, its effect must be to adjust the decision line to fit the training examples better. Surprisingly, the experimental results show no overfitting; the authors just report this fact without explanation.
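    To make the refinement idea concrete, here is a minimal sketch of how I understand the scheme, in Python with scikit-learn. It is not the paper's exact algorithm: the class name RefinementNode and the parameters max_depth and min_samples are my own, and the paper may partition the data differently.

```python
# Hypothetical sketch of the refinement tree as I understand it; NOT the
# paper's exact algorithm. Each node fits a base classifier, then splits
# the training data by that classifier's own predictions and fits refined
# child classifiers on each still-impure region. No voting anywhere.
import numpy as np
from sklearn.base import clone
from sklearn.naive_bayes import MultinomialNB

class RefinementNode:
    def __init__(self, base, depth=0, max_depth=3, min_samples=50):
        self.clf = clone(base)          # unfitted copy of the base learner
        self.depth = depth
        self.max_depth = max_depth
        self.min_samples = min_samples
        self.children = {}              # predicted label -> child node

    def fit(self, X, y):
        self.clf.fit(X, y)
        if self.depth < self.max_depth:
            pred = self.clf.predict(X)
            for label in np.unique(pred):
                mask = pred == label
                # Refine a region only if it is big enough and still mixed;
                # pure or tiny regions stay as leaves.
                if mask.sum() >= self.min_samples and len(np.unique(y[mask])) > 1:
                    child = RefinementNode(self.clf, self.depth + 1,
                                           self.max_depth, self.min_samples)
                    self.children[label] = child.fit(X[mask], y[mask])
        return self

    def _route(self, x_row):
        label = self.clf.predict(x_row)[0]
        child = self.children.get(label)
        # Intermediate nodes only dispatch; the reached leaf alone decides.
        return child._route(x_row) if child is not None else label

    def predict(self, X):
        return np.array([self._route(X[i:i+1]) for i in range(X.shape[0])])

# Toy usage with fake term counts (the data here is made up):
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(300, 20))
y = (X[:, 0] + X[:, 1] > X[:, 2] + 3).astype(int)
tree = RefinementNode(MultinomialNB()).fit(X, y)
print((tree.predict(X) == y).mean())
```

    The design choice that matters for point 3 below: predict never aggregates the classifiers along the routing path; the leaf's answer simply overrides its ancestors'.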
    How could this be? I noticed that the decision line this method generates is different from a traditional one. You can picture this "line" as follows: the root classifier corresponds to the main line, its child classifiers correspond to two smaller lines near the main line, and so on... it is similar to a fractal! I guess these decision lines correctly express the STRUCTURE of the target (real) decision, because I believe fractals are the nature of nature. But the problem remains: OK, structure, so what? My guesses:
  1. A structure is more general than a line: even though this structure is obtained from the training examples, it still has more generality than a single line, so overfitting is alleviated.
  2. Seeing the global from the local is the defining property of a fractal: self-similarity. The decision structure on the training examples is similar to the decision structure on all examples, even though individual regions contain arbitrary mixtures of + and -.
  3. The classifiers in this method do not vote; the classification is determined only by the leaf nodes, while intermediate nodes only do the dispatch job: routing an input to a leaf node. This behavior contradicts intuition and is hard to explain. A leaf node should be a specialized version of its intermediate ancestors, yet it replaces its parents' judgments without introducing much error (see the toy contrast below).
    Maybe every input is itself a specific, specialized case, so a correct classification at an intermediate node can't be viewed as a "perfect classification"; it is just an "estimate under imperfection".
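    To pin down the difference point 3 is getting at, here is a toy, hand-set contrast between boosting-style path voting and refinement-style leaf dispatch; the decisions below are invented for illustration, not taken from the paper.

```python
# Decisions produced by the classifiers along one root-to-leaf routing path
# (values invented for illustration).
path = ["+", "+", "-"]   # root says +, child says +, leaf says -

vote = max(set(path), key=path.count)   # boosting-by-filtering style: majority
leaf = path[-1]                         # this paper's style: leaf alone decides

print(vote, leaf)   # prints: + -   (the two rules can disagree)
```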
    Further work: how to take advantage of this character of the "decision line", if my guess is right.