Bias / Variance
I've noticed that almost all the really good machine learning practitioners tend to have a very sophisticated understanding of bias and variance. Bias and variance is one of those concepts that's easy to learn but difficult to master. Even if you think you've seen the basic concepts of bias and variance, there's often more nuance to it than you'd expect. In the deep learning era, another trend is that there's been less discussion of what's called the bias-variance trade-off. You might have heard of the bias-variance trade-off, but in the deep learning era, there's less of a trade-off. So we still talk about bias, and we still talk about variance, but we talk less about the bias-variance trade-off. Let's see what this means. Let's say you have a data set that looks like this. If you fit a straight line to the data, maybe you get a logistic regression fit like that. This is not a very good fit to the data, so this is a case of high bias. Or we say that this is underfitting the data.
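As a small sketch of this idea (my own NumPy example, not from the lecture): fitting a straight line to data with quadratic structure underfits, and its training error stays far above that of a model with the right level of complexity.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = x**2 + 0.05 * rng.standard_normal(50)   # quadratic ground truth plus noise

# Degree-1 (straight line) fit: too simple, so it underfits -> high bias
line = np.polynomial.Polynomial.fit(x, y, deg=1)
mse_line = np.mean((line(x) - y) ** 2)

# Degree-2 fit matches the true structure -> much lower training error
quad = np.polynomial.Polynomial.fit(x, y, deg=2)
mse_quad = np.mean((quad(x) - y) ** 2)
```

On this data the straight line's training error is roughly an order of magnitude larger than the quadratic fit's, which is the numerical signature of underfitting.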
On the opposite end, if you fit an incredibly complex classifier, maybe a deep neural network, or a neural network with a lot of hidden units, maybe you can fit the data perfectly. But that doesn't look like a great fit either. So this is a classifier with high variance, and this is overfitting the data. And there might be some classifier in between, with a medium level of complexity, that fits a curve like that. That looks like a much more reasonable fit to the data, so we'll call that "just right"; it's somewhere in between. So in a 2D example like this, with just two features, x1 and x2, you can plot the data and visualize bias and variance. In high-dimensional problems, you can't plot the data and visualize the decision boundary. Instead, there are a couple of different metrics that we'll look at to try to understand bias and variance. So, continuing our example of cat picture classification, where that's a positive example and that's a negative example, the two key numbers to look at to understand bias and variance will be the training set error and the dev set, or development set, error. So, for the sake of argument, let
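The diagnostic logic behind those two numbers can be sketched as a small helper. To be clear, the `diagnose` function and the 0.02 gap threshold are my own illustrative choices, not from the lecture; the underlying rules are the ones just described: a training error far above the best achievable (Bayes) error signals high bias, and a dev error far above the training error signals high variance.

```python
def diagnose(train_err, dev_err, bayes_err=0.0, gap=0.02):
    """Rough bias/variance diagnosis from training and dev set errors.

    high bias: training error sits well above the Bayes (best achievable) error
    high variance: dev error sits well above the training error
    The gap threshold of 0.02 (2%) is illustrative, not a fixed rule.
    """
    high_bias = (train_err - bayes_err) > gap
    high_variance = (dev_err - train_err) > gap
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias (underfitting)"
    if high_variance:
        return "high variance (overfitting)"
    return "low bias and low variance"

# 1% train / 11% dev error: fits the training set but fails to generalize
print(diagnose(0.01, 0.11))   # high variance (overfitting)
```

With the same function, 15% train / 16% dev error would read as high bias, and 15% train / 30% dev error as both problems at once.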