Kernel Methods
Kernelization
a linear classifier in a higher-dimensional space corresponds to a non-linear classifier in the original space.
this is akin to making regression more flexible by using polynomials.
A potential drawback:we might increase the computational burden since the expandation of dimension.
Chap5.2 Cover’s Theorem on the separability of patterns
Cover Theorem: A complex pattern-classification problem, cast in a high-dimensional space nonlinearly, is more likely to be linearly saparable than in a low-dimensional space, provided that the space is not densely populated.