Today let's look at a very simple algorithm: SGD, stochastic gradient descent. Honestly, it looks unremarkable, yet it shows up in almost every corner of modern AI and is probably the most widely used optimization method in machine learning. Nearly every state-of-the-art machine learning or deep learning library ships several variants of gradient descent.
Diagram
[Figure: schematic of SGD (stochastic gradient descent) and its variants]
Algorithm details
What is it for?
In both GD and SGD, the model parameters are updated at every iteration so that the cost function decreases. That is, we minimize a loss L(w) with respect to a parameter w by stepping against its derivative:

w ← w − η · ∂L/∂w

where η is the learning rate. The difference between the two is how the derivative is obtained: GD computes it over the entire dataset, while SGD estimates it from a single sample (or a small mini-batch) at each step.
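To make that difference concrete, here is a minimal sketch of one full-batch GD step versus one SGD step for a linear least-squares model. This is illustrative code, not part of this post's program: the squared-error loss, the data matrix X, the targets y, and the names gd_step/sgd_step are all assumptions.

#include <Eigen/Dense>
#include <cstdlib>
using namespace Eigen;

// Full-batch GD: gradient of the mean squared error over all n samples,
// grad = (2/n) * X^T (X w - y), then w <- w - eta * grad.
VectorXd gd_step(const MatrixXd& X, const VectorXd& y, VectorXd w, double eta)
{
    VectorXd grad = (2.0 / X.rows()) * X.transpose() * (X * w - y);
    return w - eta * grad;
}

// SGD: the same update, but the gradient is estimated from one randomly
// chosen sample (row i of X) -- the sampling is what makes it "stochastic".
VectorXd sgd_step(const MatrixXd& X, const VectorXd& y, VectorXd w, double eta)
{
    int i = std::rand() % X.rows();
    VectorXd xi = X.row(i).transpose();
    VectorXd grad = 2.0 * xi * (xi.dot(w) - y(i));
    return w - eta * grad;
}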
Code
The derivative is computed numerically, with the loss passed in through a function pointer:
double derivation(double (*loss_function)(VectorXd*), VectorXd* X)
{
    // Shift every component of X by `step` (a global defined below),
    // then take a forward finite difference of the loss.
    VectorXd X_new = (X->array() + step).matrix();
    cout << "Now the loss is:" << loss_function(X) << endl;
    return (loss_function(&X_new) - loss_function(X)) / step;
}
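A note on accuracy: the forward difference above has error on the order of step. A central difference costs one extra loss evaluation but reduces the error to O(step^2). A possible variant (a sketch reusing this post's globals and signature; derivation_central is my name, not from the original code):

double derivation_central(double (*loss_function)(VectorXd*), VectorXd* X)
{
    VectorXd X_fwd = (X->array() + step).matrix(); // X shifted up by step
    VectorXd X_bwd = (X->array() - step).matrix(); // X shifted down by step
    // Symmetric difference quotient: O(step^2) error instead of O(step).
    return (loss_function(&X_fwd) - loss_function(&X_bwd)) / (2.0 * step);
}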
Full code:
#include <Eigen/Dense>
#include <iostream>
// g++ sgd.cpp -o sgd -I/download/eigen
#define MAX_STEPS 20
using namespace std;
using namespace Eigen;

static double W = 3.0;     // the parameter we optimize
static double step = 0.02; // finite-difference step size
static double nita = 0.3;  // learning rate (eta)

// Loss: L(W) = ||W * X||, minimized at W = 0.
double loss_function(VectorXd* X)
{
    return (W * (*X)).norm();
}

// Forward finite-difference estimate of the loss derivative,
// obtained by shifting every component of X by `step`.
double derivation(double (*loss_function)(VectorXd*), VectorXd* X)
{
    VectorXd X_new = (X->array() + step).matrix();
    cout << "Now the loss is:" << loss_function(X) << endl;
    return (loss_function(&X_new) - loss_function(X)) / step;
}

// Descent loop: W = W - nita * derivative, repeated MAX_STEPS times.
void sgd(double (*loss_function)(VectorXd*))
{
    VectorXd X(5);
    X.setConstant(1.1);
    for (int i = 0; i < MAX_STEPS; ++i)
    {
        W = W - nita * derivation(loss_function, &X);
    }
    cout << "After " << MAX_STEPS << " steps iteration the W is:" << endl;
    cout << W << endl;
}

int main(int argc, char const* argv[])
{
    sgd(loss_function);
    return 0;
}
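For this particular loss the derivative with respect to W can also be written in closed form, which makes a handy sanity check on the numerical routine: L(W) = ||W·X|| = |W|·||X||, so dL/dW = sign(W)·||X||. A minimal sketch (analytic_derivation is my name; it is not in the original program):

double analytic_derivation(const VectorXd& X)
{
    // d/dW of |W| * ||X|| is sign(W) * ||X|| (taking 0 at W = 0).
    double s = (W > 0) - (W < 0);
    return s * X.norm();
}

Note that the numerical derivation above perturbs X rather than W, so what it actually returns is |W| times the directional derivative of ||X|| along the all-ones vector. With W starting positive, that still shrinks W geometrically toward 0, which is why the printed loss decays by a constant factor at every step.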
Compile and run
root@master:/App/CSDN_blog/SGD# g++ SGD.cpp -o sgd -I/download/eigen
root@master:/App/CSDN_blog/SGD# ./sgd
Now the loss is:7.37902
Now the loss is:2.42902
Now the loss is:0.799585
Now the loss is:0.263207
Now the loss is:0.0866424
Now the loss is:0.0285209
Now the loss is:0.00938851
Now the loss is:0.0030905
Now the loss is:0.00101733
Now the loss is:0.000334885
Now the loss is:0.000110237
Now the loss is:3.62878e-05
Now the loss is:1.19452e-05
Now the loss is:3.93212e-06
Now the loss is:1.29437e-06
Now the loss is:4.26082e-07
Now the loss is:1.40257e-07
Now the loss is:4.61699e-08
Now the loss is:1.51982e-08
Now the loss is:5.00293e-09
After 20 steps iteration the W is:
6.69545e-10