Machine Learning 01 - Basic Concepts

These are study notes for Professor Andrew Ng's machine learning course at Stanford, covering the definition of machine learning, common types such as supervised and unsupervised learning, core concepts such as model representation and the cost function, and a detailed look at how the gradient descent algorithm works.

I recently started Stanford's machine learning course taught by Andrew Ng, and I am keeping notes as I go to review and consolidate what I learn.
My knowledge is limited, so if you find errors or omissions, or have suggestions, please bear with me and point them out.

Week 01

Introduction

  • Applications of machine learning
    • Database mining
    • Applications that can’t be programmed by hand
    • Self-customizing programs
  • Definition
    • Arthur Samuel. Field of study that gives computers the ability to learn without being explicitly programmed.
    • Tom Mitchell. A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T as measured by P improves with experience E.
  • Common Types
    • Supervised Learning
      • Given the “right answer” for each example in the data.
      • Regression Problem: Predict real-valued output.
      • Classification: Predict discrete output.
    • Unsupervised Learning
      • Unsupervised learning allows us to approach problems with little or no idea what the effect of the variables is.
      • Clustering, Non-clustering

Model and Cost Function

  • Basic Model Representation
    • Number of training examples - m
    • Input - x
    • Output - y
    • Input space - X
    • Output space - Y
    • Hypothesis - h : X → Y
  • Cost Function
    • The cost function measures the accuracy of our hypothesis function, for example (a Python sketch follows this list):
      $J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2 = \frac{1}{2m}\sum_{i=1}^{m}(h_\theta(x_i) - y_i)^2$
  • Contour plot
    • A contour plot is a graph that contains many contour lines. A contour line of a two-variable function takes a constant value at every point along the same line; the sketch below draws the contours of $J(\theta_0, \theta_1)$.
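
A quick way to make the cost function and the contour plot concrete is to compute $J(\theta_0, \theta_1)$ over a grid and draw its contour lines. The sketch below assumes the course's linear hypothesis $h_\theta(x) = \theta_0 + \theta_1 x$ and uses a made-up toy training set; only the cost formula itself comes from the notes above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Linear hypothesis h_theta(x) = theta0 + theta1 * x
# (assumed here for illustration; the notes only state h : X -> Y)
def hypothesis(theta0, theta1, x):
    return theta0 + theta1 * x

# Cost function J(theta0, theta1) = (1 / 2m) * sum_i (h_theta(x_i) - y_i)^2
def compute_cost(theta0, theta1, x, y):
    m = len(x)
    errors = hypothesis(theta0, theta1, x) - y
    return np.sum(errors ** 2) / (2 * m)

# Toy training set (made up for illustration), roughly y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 5.9, 8.2])

# Evaluate J on a grid of (theta0, theta1) values
t0 = np.linspace(-4.0, 4.0, 100)
t1 = np.linspace(-1.0, 5.0, 100)
T0, T1 = np.meshgrid(t0, t1)
J = np.array([[compute_cost(a, b, x, y) for a in t0] for b in t1])

# Each contour line connects points where J has the same constant value
plt.contour(T0, T1, J, levels=30)
plt.xlabel("theta0")
plt.ylabel("theta1")
plt.title("Contour plot of J(theta0, theta1)")
plt.show()
```

The bowl-shaped contours reflect that $J$ is convex for linear regression, which is why gradient descent (next section) can reach the global minimum.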

Parameter Learning

  • Outline

    • Start with some $\theta_0, \theta_1$
    • Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum.
  • Algorithm

    repeat until convergence {

    $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1, \dots, \theta_n)$   (for j = 0, 1, ..., n)

    }
    simultaneous update {
    $\text{temp}_0 := \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1, \dots, \theta_n)$
    ...
    $\text{temp}_n := \theta_n - \alpha \frac{\partial}{\partial \theta_n} J(\theta_0, \theta_1, \dots, \theta_n)$
    $\theta_0 := \text{temp}_0$
    ...
    $\theta_n := \text{temp}_n$
    }
    • Comprehension
      • It’s like going down a hill along the fastest path: the derivative gives us a direction to move in, and $\alpha$ (called the learning rate) determines the size of each step.
      • As we approach the bottom of our convex function, the derivative tends to 0, and at the bottom we have $\theta_1 := \theta_1 - \alpha \times 0$.
  • Gradient descent for linear regression - Algorithm

    repeat until convergence (simultaneously update) {
    $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right)$
    $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) \cdot x^{(i)}$
    }

    • This method looks at every example in the entire training set on every step, and is called batch gradient descent (a Python sketch follows below).
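
Below is a minimal sketch of batch gradient descent for univariate linear regression, again assuming $h_\theta(x) = \theta_0 + \theta_1 x$; the toy data, learning rate, and iteration count are illustrative choices rather than values from the course. Note how the simultaneous update is done through temporary variables, exactly as in the pseudocode above.

```python
import numpy as np

def batch_gradient_descent(x, y, alpha=0.05, num_iters=1000):
    """Batch gradient descent for univariate linear regression.

    "Batch" means every iteration looks at the whole training set;
    theta0 and theta1 are updated simultaneously via temporaries.
    """
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(num_iters):
        errors = (theta0 + theta1 * x) - y  # h_theta(x^(i)) - y^(i) for all i
        # Simultaneous update: compute both new values before assigning either
        temp0 = theta0 - alpha * np.sum(errors) / m
        temp1 = theta1 - alpha * np.sum(errors * x) / m
        theta0, theta1 = temp0, temp1
    return theta0, theta1

# Toy training set (made up for illustration), roughly y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

theta0, theta1 = batch_gradient_descent(x, y)
print(f"theta0 = {theta0:.3f}, theta1 = {theta1:.3f}")  # expect roughly 1 and 2
```

If $\alpha$ is too large the updates can overshoot the minimum and diverge; if it is too small, convergence is slow. Trying a few values on this toy data is a good way to see the effect.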