
Deep Learning (花书) Notes
Excerpts of the key content, with formula derivations worked out step by step.
PeterBishop0
Let's make progress together!
Linear Factor Models
Many of the research frontiers in deep learning involve building a probabilistic model of the input, $p_{\text{model}}(\boldsymbol{x})$. Such a model can, in principle, use probabilistic inference to predict any of… (2021-09-02)
Applications(4)
Other Applications: In this section we cover a few other types of applications of deep learning that are different from the standard object recognition, speech recognition and natural language processing tasks discussed above. Part III of this book… (2021-08-31)
Applications(3)
Natural Language Processing: Natural language processing (NLP) is the use of human languages, such as English or French, by a computer. Computer programs typically read and emit specialized languages designed to allow efficient and unambiguous pa… (2021-08-29)
Applications(2)
Computer Vision: Computer vision has traditionally been one of the most active research areas for deep learning applications, because vision is a task that is effortless for humans and many animals but challenging for computers (Ballard et al., … (2021-08-26)
Applications(1)
Large-Scale Deep Learning: Deep learning is based on the philosophy of connectionism: while an individual biological neuron or an individual feature in a machine learning model is not intelligent, a large population of these neurons or features ac… (2021-08-21)
Practical Methodology(3)
Debugging Strategies: When a machine learning system performs poorly, it is usually difficult to tell whether the poor performance is intrinsic to the algorithm itself or whether there is a bug in the implementation of the algorithm. Machine lear… (2021-08-20)
Practical Methodology(2)
Selecting Hyperparameters: Most deep learning algorithms come with many hyperparameters that control many aspects of the algorithm's behavior. Some of these hyperparameters affect the time and memory cost of running the algorithm. Some of these h… (2021-08-18)
Practical Methodology(1)
Successfully applying deep learning techniques requires more than just a good knowledge of what algorithms exist and the principles that explain how they work. A good machine learning practitioner also needs to know how to choose an algorithm for… (2021-08-17)
Sequence Modeling: Recurrent and Recursive Nets(3)
Leaky Units and Other Strategies for Multiple Time Scales: One way to deal with long-term dependencies is to design a model that operates at multiple time scales, so that some parts of the model operate at fine-grained time scales and can handle s… (2021-08-15)
Sequence Modeling: Recurrent and Recursive Nets(1)
Recurrent neural networks or RNNs (Rumelhart et al., 1986a) are a family of neural networks for processing sequential data. Much as a convolutional network is a neural network that is specialized for processing a grid of values $\boldsymbol{X}$ such as an image… (2021-08-12)
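The core mechanism this post covers, reusing the same parameters at every time step, can be sketched as the classic recurrent state update $\boldsymbol{h}_t = \tanh(\boldsymbol{W}\boldsymbol{h}_{t-1} + \boldsymbol{U}\boldsymbol{x}_t + \boldsymbol{b})$. The sizes and random inputs below are illustrative assumptions, not anything from the post:

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_input, T = 4, 3, 5           # illustrative sizes
W = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
U = rng.normal(scale=0.1, size=(n_hidden, n_input))
b = np.zeros(n_hidden)

xs = rng.normal(size=(T, n_input))       # a length-T input sequence
h = np.zeros(n_hidden)                   # initial hidden state
for x_t in xs:                           # the SAME W, U, b at every step
    h = np.tanh(W @ h + U @ x_t + b)

print(h.shape)  # one state vector summarizing the whole sequence
```

Parameter sharing across time is what lets one set of weights process sequences of any length.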
Convolutional Networks(3)
Random or Unsupervised Features: Typically, the most expensive part of convolutional network training is learning the features. The output layer is usually relatively inexpensive due to the small number of features provided as input to this layer… (2021-08-09)
Convolutional Networks(2)
Variants of the Basic Convolution Function: When discussing convolution in the context of neural networks, we usually do not refer exactly to the standard discrete convolution operation as it is usually understood in the mathematical literature. (2021-08-08)
Convolutional Networks(1)
Convolutional networks (LeCun, 1989), also known as convolutional neural networks or CNNs, are a specialized kind of neural network for processing data that has a known, grid-like topology. Examples include time-series data, which can be thought… (2021-08-07)
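The grid-structured operation these posts build on can be sketched as a plain valid-mode 2-D cross-correlation (the variant most deep learning libraries actually implement under the name "convolution"). The image and kernel here are illustrative assumptions:

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """Valid-mode 2-D cross-correlation: at each output position,
    sum the elementwise product of the kernel and the image patch."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2)) / 4.0           # a 2x2 averaging filter
result = cross_correlate2d(image, kernel)
print(result.shape)  # (3, 3)
```

A true mathematical convolution would flip the kernel first; for learned kernels the flip is immaterial, which is why libraries skip it.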
Optimization for Training Deep Models(3)
Approximate Second-Order Methods: For simplicity of exposition, the only objective function we examine is the empirical risk:
$$J(\boldsymbol{\theta})=\mathbb{E}_{\mathbf{x},\mathrm{y}\sim\hat{p}_{\text{data}}(\boldsymbol{x},y)}\left[L(f(\boldsymbol{x};\boldsymbol{\theta}),y)\right]=\frac{1}{m}\sum_{i=1}^{m}L\left(f\left(\boldsymbol{x}^{(i)};\boldsymbol{\theta}\right),y^{(i)}\right)$$
(2021-08-06)
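The empirical risk above is just the average per-example loss over the training set. A minimal numerical check, using a linear model and squared error as illustrative stand-ins for $f$ and $L$:

```python
import numpy as np

def empirical_risk(f, theta, X, y, loss):
    """J(theta) = (1/m) * sum_i L(f(x_i; theta), y_i): the average
    per-example loss over the m training examples."""
    m = len(y)
    return sum(loss(f(x, theta), yi) for x, yi in zip(X, y)) / m

# illustrative stand-ins: a linear model f and squared-error loss L
f = lambda x, theta: float(np.dot(x, theta))
loss = lambda yhat, yi: (yhat - yi) ** 2

X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 2.0])
theta = np.array([1.0, 1.0])
risk = empirical_risk(f, theta, X, y, loss)
print(risk)  # (0 + 1) / 2 = 0.5
```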
Optimization for Training Deep Models(2)
Basic Algorithms: We have previously introduced the gradient descent (section 4.3) algorithm that follows the gradient of an entire training set downhill. This may be accelerated considerably by using stochastic gradient descent to follow the gradi… (2021-08-05)
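The contrast the post draws, updating from one randomly selected example instead of the whole training set, can be sketched as SGD on a tiny synthetic least-squares problem (the data, learning rate, and epoch count are illustrative assumptions):

```python
import numpy as np

# synthetic noiseless regression data (illustrative assumption)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w

w = np.zeros(3)
lr = 0.01
for epoch in range(50):
    for i in rng.permutation(len(y)):          # one random example per update
        grad = 2.0 * (X[i] @ w - y[i]) * X[i]  # gradient of (x.w - y)^2
        w -= lr * grad                          # SGD step

print(np.round(w, 3))  # converges toward true_w
```

Each update costs O(1) in the dataset size, which is the whole point: the gradient of one example is a noisy but unbiased estimate of the full-batch gradient.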
Optimization for Training Deep Models(1)
How Learning Differs from Pure Optimization: Optimization algorithms used for training of deep models differ from traditional optimization algorithms in several ways. Machine learning usually acts indirectly. In most machine learning scenarios, … (2021-08-03)
Regularization for Deep Learning(3)
Parameter Tying and Parameter Sharing: Thus far, in this chapter, when we have discussed adding constraints or penalties to the parameters, we have always done so with respect to a fixed region or point. For example, $L^2$ regularization (or… (2021-08-02)
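The $L^2$ penalty mentioned above adds $\frac{\alpha}{2}\|\boldsymbol{w}\|^2$ to the objective, so its gradient contributes an extra $\alpha\boldsymbol{w}$ term that shrinks the weights each step (hence "weight decay"). A minimal sketch under the simplifying assumption that the data term's gradient is zero:

```python
import numpy as np

def l2_penalized_grad(grad_loss, w, alpha):
    """Gradient of J(w) + (alpha/2)*||w||^2 is grad_loss + alpha*w."""
    return grad_loss + alpha * w

w = np.array([1.0, -2.0])
grad_loss = np.zeros(2)        # assume the data term is at a minimum
lr, alpha = 0.1, 0.5
for _ in range(100):
    w -= lr * l2_penalized_grad(grad_loss, w, alpha)

print(np.round(w, 4))  # each step multiplies w by (1 - lr*alpha): decay
```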
Regularization for Deep Learning(2)
Noise Robustness: Section 7.4 has motivated the use of noise applied to the inputs as a dataset augmentation strategy. For some models, the addition of noise with infinitesimal variance at the input of the model is equivalent to imposing a penalty… (2021-07-28)
Regularization for Deep Learning(1)
A central problem in machine learning is how to make an algorithm that will perform well not just on the training data, but also on new inputs. Many strategies used in machine learning are explicitly designed to reduce the test error, possibly at… (2021-07-26)
Deep Feedforward Networks(3)
Back-Propagation and Other Differentiation Algorithms: When we use a feedforward neural network to accept an input $\boldsymbol{x}$ and produce an output $\hat{\boldsymbol{y}}$, information flows forward through the network. The inputs $\boldsymbol{x}$ provide… (2021-07-24)
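Back-propagation computes derivatives analytically via the chain rule; a standard way to sanity-check such gradients is against central finite differences. A tiny sketch, where the scalar function is an illustrative stand-in for a network's loss:

```python
import numpy as np

def loss(w):
    # illustrative stand-in for a network's scalar loss
    return np.sum(np.tanh(w) ** 2)

def analytic_grad(w):
    # chain rule: d/dw tanh(w)^2 = 2*tanh(w)*(1 - tanh(w)^2)
    t = np.tanh(w)
    return 2 * t * (1 - t ** 2)

def numeric_grad(f, w, eps=1e-5):
    # central finite differences, one coordinate at a time
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

w = np.array([0.3, -1.2, 0.7])
diff = np.max(np.abs(analytic_grad(w) - numeric_grad(loss, w)))
print(diff)  # tiny: the two gradients agree
```

Finite differences cost one function evaluation pair per parameter, which is why back-propagation, not numerical differentiation, is used for training.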
Deep Feedforward Networks(2)
Hidden Units: The design of hidden units is an extremely active area of research and does not yet have many definitive guiding theoretical principles. Rectified linear units are an excellent default choice of hidden unit. Many other types of hidden… (2021-07-22)
Deep Feedforward Networks(1)
Deep feedforward networks, also often called feedforward neural networks, or multilayer perceptrons (MLPs), are the quintessential deep learning models. The goal of a feedforward network is to approximate some function $f^{*}$. For example, for… (2021-07-20)
Machine Learning Basics(3)
Supervised Learning Algorithms: Roughly speaking, these are learning algorithms that learn to associate some input with some output, given a training set of examples of inputs $\boldsymbol{x}$ and outputs $\boldsymbol{y}$. In many cases the outputs $\boldsymbol{y}$ may be difficult to collect au… (2021-07-17)
Machine Learning Basics(2)
Capacity, Overfitting and Underfitting: The ability to perform well on previously unobserved inputs is called generalization. The field of statistical learning theory provides some answers. If the training and the test set are collected arbitra… (2021-07-08)
Machine Learning Basics(1)
Learning Algorithms: A machine learning algorithm is an algorithm that is able to learn from data. But what do we mean by learning? Mitchell (1997) provides the definition "A computer program is said to learn from experience… (2021-07-02)
Numerical Computation
Overflow and Underflow: The fundamental difficulty in performing continuous math on a digital computer is that we need to represent infinitely many real numbers with a finite number of bit patterns. Rounding error is problematic, especially when i… (2021-07-02)
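A standard illustration of the overflow problem this post discusses is the softmax: exponentiating large scores overflows float64, while subtracting the maximum first gives the same mathematical result safely. A minimal sketch:

```python
import numpy as np

def softmax_stable(x):
    """Subtracting max(x) before exponentiating prevents overflow;
    the constant shift cancels when we normalize."""
    z = np.exp(x - np.max(x))
    return z / np.sum(z)

x = np.array([1000.0, 1001.0, 1002.0])
# naive np.exp(x) overflows to inf, and inf/inf normalization gives nan
p = softmax_stable(x)
print(p)  # finite probabilities that sum to 1
```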
Probability and Information Theory
There are three possible sources of uncertainty: (1) inherent stochasticity in the system being modeled; (2) incomplete observability (we cannot observe all of the variables that drive the behavior of the system); (3) incomplete modeling (when we use a model that must d… (2021-06-30)
Linear Algebra
Goals: diagonalization of matrices; the SVD of a general matrix and its use in deriving the PCA algorithm; inverse and pseudoinverse matrices; least-squares estimation; minimum-norm estimation.
Scalars: A scalar is just a single number. We write scalars in italics. We usually give scalars lower-case variable names. Vectors: A vector is an array of… (2021-06-27)
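The goals listed for this post connect directly: the Moore–Penrose pseudoinverse built from the SVD yields the least-squares solution of an overdetermined system. A minimal numpy sketch with an illustrative 3x2 system:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])   # overdetermined: 3 equations, 2 unknowns
b = np.array([1.0, 2.0, 2.9])

# pseudoinverse from the (thin) SVD: A+ = V diag(1/s) U^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

x = A_pinv @ b   # least-squares solution minimizing ||Ax - b||
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```

For rank-deficient systems the same construction (zeroing reciprocals of zero singular values) gives the minimum-norm least-squares solution.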