ncnn Source Code Analysis (4): Loading Model Weights

MirrorYuChen

Article link: https://blog.youkuaiyun.com/sinat_31425585/article/details/100633323

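The previous posts covered, in broad strokes, how ncnn loads the param file: the network structure is built from the param file, and then the weights of each layer are loaded from the bin file. This post summarizes how the weights of each layer are loaded.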

The interface we usually call to load the weights is:

    // Load the model from a binary file
    int load_model(const char* modelpath);

The corresponding implementation in net.cpp is:

// Load the model from a binary file
int Net::load_model(const char* modelpath)
{
    FILE* fp = fopen(modelpath, "rb");
    if (!fp)
    {
        fprintf(stderr, "fopen %s failed\n", modelpath);
        return -1;
    }

    int ret = load_model(fp);

    fclose(fp);

    return ret;
}

As with loading the param file, this overload simply delegates to another interface, which loads the weights from a file pointer:

// Load the model from a file pointer
int Net::load_model(FILE* fp)
{
    // Check whether the layer list is empty
    if (layers.empty())
    {
        fprintf(stderr, "network graph not ready\n");
        return -1;
    }

    // load file
    int ret = 0;

    // Read from the binary file
    ModelBinFromStdio mb(fp);
    // Iterate over all layers
    for (size_t i=0; i<layers.size(); i++)
    {
        // Get the i-th layer
        Layer* layer = layers[i];

        //Here we found inconsistent content in the parameter file.
        // If the i-th layer does not exist
        if (!layer){
            fprintf(stderr, "load_model error at layer %d, parameter file has inconsistent content.\n", (int)i);
            ret = -1;
            break;
        }

        // Load this layer's weights
        int lret = layer->load_model(mb);
        if (lret != 0)
        {
            fprintf(stderr, "layer load_model %d failed\n", (int)i);
            ret = -1;
            break;
        }

        // Create this layer's pipeline from opt
        int cret = layer->create_pipeline(opt);
        // If creating the pipeline for the i-th layer failed
        if (cret != 0)
        {
            fprintf(stderr, "layer create_pipeline %d failed\n", (int)i);
            ret = -1;
            break;
        }
    }

    // Network fusion pass
    fuse_network();

    return ret;
}
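
To make the control flow concrete, here is a minimal, self-contained sketch of the same loop. ToyBin, ToyLayer, and load_all are hypothetical stand-ins, not ncnn types. The point is only that all layers share one reader, each layer consumes its own weights in file order, and the first failure aborts the load.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for a ModelBin: hands out floats in stream order.
struct ToyBin
{
    const std::vector<float>& data;
    mutable std::size_t pos;

    explicit ToyBin(const std::vector<float>& d) : data(d), pos(0) {}

    // Returns the next w values, or an empty vector if the stream runs out.
    std::vector<float> load(std::size_t w) const
    {
        if (pos + w > data.size())
            return std::vector<float>();
        std::vector<float> out(data.begin() + pos, data.begin() + pos + w);
        pos += w;
        return out;
    }
};

// Hypothetical stand-in for a layer that owns `count` weights.
struct ToyLayer
{
    std::size_t count;
    std::vector<float> weights;

    explicit ToyLayer(std::size_t n) : count(n) {}

    int load_model(const ToyBin& mb)
    {
        weights = mb.load(count);
        return weights.empty() ? -100 : 0;
    }
};

// Same shape as the loop in Net::load_model: stop at the first failing layer.
int load_all(std::vector<ToyLayer>& layers, const ToyBin& mb)
{
    if (layers.empty())
        return -1; // network graph not ready
    for (std::size_t i = 0; i < layers.size(); i++)
    {
        if (layers[i].load_model(mb) != 0)
        {
            std::fprintf(stderr, "layer load_model %d failed\n", (int)i);
            return -1;
        }
    }
    return 0;
}
```

Because the reader is shared, the order of layers in the param file has to match the order of weights in the bin file; a layer that reads too much or too little shifts every layer after it.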

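The comments in Net::load_model make it fairly easy to follow. Two parts deserve a closer look: ModelBinFromStdio, which parses the binary model file, and layer->load_model(mb), which loads the weights of one specific layer.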

(1) Parsing the binary model file

This part corresponds to modelbin.h and modelbin.cpp. First, modelbin.h:

// Tencent is pleased to support the open source community by making ncnn available.
//
// Copyright (C) 2017 THL A29 Limited, a Tencent company. All rights reserved.
//
// Licensed under the BSD 3-Clause License (the "License"); you may not use this file except
// in compliance with the License. You may obtain a copy of the License at
//
// https://opensource.org/licenses/BSD-3-Clause
//
// Unless required by applicable law or agreed to in writing, software distributed
// under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
// CONDITIONS OF ANY KIND, either express or implied. See the License for the
// specific language governing permissions and limitations under the License.

#ifndef NCNN_MODELBIN_H
#define NCNN_MODELBIN_H

#include <stdio.h>
#include "mat.h"
#include "platform.h"

namespace ncnn {

class Net;
// Loads model weights
class ModelBin
{
public:
    virtual ~ModelBin();
    // element type
    // 0 = auto
    // 1 = float32
    // 2 = float16
    // 3 = int8
    // load vec
    virtual Mat load(int w, int type) const = 0;
    // load image
    virtual Mat load(int w, int h, int type) const;
    // load dim
    virtual Mat load(int w, int h, int c, int type) const;
};

#if NCNN_STDIO
// Loads model weights into a Mat (from a FILE*)
class ModelBinFromStdio : public ModelBin
{
public:
    // construct from file
    ModelBinFromStdio(FILE* binfp);

    virtual Mat load(int w, int type) const;

protected:
    FILE* binfp;
};
#endif // NCNN_STDIO

// Loads model weights into a Mat (from external memory)
class ModelBinFromMemory : public ModelBin
{
public:
    // construct from external memory
    ModelBinFromMemory(const unsigned char*& mem);

    virtual Mat load(int w, int type) const;

protected:
    const unsigned char*& mem;
};

class ModelBinFromMatArray : public ModelBin
{
public:
    // construct from weight blob array
    ModelBinFromMatArray(const Mat* weights);

    virtual Mat load(int w, int type) const;

protected:
    mutable const Mat* weights;
};

} // namespace ncnn

#endif // NCNN_MODELBIN_H
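
One detail in this header is easy to miss: ModelBinFromMemory stores a reference to a pointer (const unsigned char*&), so a load() call declared const can still move the caller's pointer forward. The sketch below illustrates that idea only. ToyBinFromMemory is a hypothetical simplification that handles float32 alone and copies into a std::vector instead of returning a Mat.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical simplification of ModelBinFromMemory: float32 only, no Mat.
class ToyBinFromMemory
{
public:
    explicit ToyBinFromMemory(const unsigned char*& m) : mem(m) {}

    // Copies w floats out of the buffer, then advances the shared pointer.
    std::vector<float> load(std::size_t w) const
    {
        std::vector<float> out(w);
        if (w > 0)
            std::memcpy(out.data(), mem, w * sizeof(float));
        mem += w * sizeof(float); // this also moves the caller's pointer
        return out;
    }

private:
    const unsigned char*& mem; // reference to the caller's read cursor
};
```

After every layer has loaded, the caller's pointer sits just past the last byte consumed, which gives the caller a way to tell how much of the buffer was read.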

The implementation lives in modelbin.cpp. As it shows, ModelBinFromStdio mb(fp); simply stores the file pointer in the binfp member:

ModelBinFromStdio::ModelBinFromStdio(FILE* _binfp) : binfp(_binfp)
{
}
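
Since the reader keeps nothing but the FILE*, the read position lives in the file stream itself, and each load() picks up where the previous one stopped. A minimal sketch of that behaviour follows, under the assumption that only raw float32 data is present. ToyBinFromStdio is a hypothetical class and it returns a std::vector rather than a Mat.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical simplification of ModelBinFromStdio: float32 only, no Mat.
class ToyBinFromStdio
{
public:
    explicit ToyBinFromStdio(std::FILE* fp) : binfp(fp) {}

    // Reads w floats from the current file position; empty vector on failure.
    std::vector<float> load(std::size_t w) const
    {
        std::vector<float> out(w);
        if (w == 0 || std::fread(out.data(), sizeof(float), w, binfp) != w)
            return std::vector<float>();
        return out;
    }

private:
    std::FILE* binfp; // not owned; the caller opens and closes the file
};
```

As in Net::load_model(const char*), ownership of the file stays with the caller: the reader never closes the pointer it was given.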

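Next, consider how a layer loads its weights. As described in the previous post, the actual work is done by the concrete layer type. Taking batchnorm as an example: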

// Load the weights
int BatchNorm::load_model(const ModelBin& mb)
{
    // slope data
    slope_data = mb.load(channels, 1);
    // On failure, return -100
    if (slope_data.empty())
        return -100;

    // mean data
    mean_data = mb.load(channels, 1);
    // On failure, return -100
    if (mean_data.empty())
        return -100;

    // variance data
    var_data = mb.load(channels, 1);
    // On failure, return -100
    if (var_data.empty())
        return -100;

    // bias data
    bias_data = mb.load(channels, 1);
    // On failure, return -100
    if (bias_data.empty())
        return -100;

    // Allocate a_data
    a_data.create(channels);
    if (a_data.empty())
        return -100;
    // Allocate b_data
    b_data.create(channels);
    if (b_data.empty())
        return -100;

    for (int i=0; i<channels; i++)
    {
        // sqrt variance
        float sqrt_var = sqrt(var_data[i] + eps);
        a_data[i] = bias_data[i] - slope_data[i] * mean_data[i] / sqrt_var;
        b_data[i] = slope_data[i] / sqrt_var;
    }

    return 0;
}
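
The loop at the end of this function folds the four loaded arrays into two. Batch normalization computes slope * (x - mean) / sqrt(var + eps) + bias, and expanding that expression gives b * x + a with exactly the a and b computed above. The sketch below checks the identity for a single channel; FoldedBN, fold_batchnorm, and batchnorm_direct are hypothetical names used only for illustration.

```cpp
#include <cmath>

// Per-channel coefficients, named after a_data and b_data in the layer.
struct FoldedBN
{
    float a;
    float b;
};

// Same arithmetic as the loop in BatchNorm::load_model, for one channel.
FoldedBN fold_batchnorm(float slope, float mean, float var, float bias, float eps)
{
    const float sqrt_var = std::sqrt(var + eps);
    FoldedBN f;
    f.a = bias - slope * mean / sqrt_var;
    f.b = slope / sqrt_var;
    return f;
}

// Textbook batch normalization, written out without folding.
float batchnorm_direct(float x, float slope, float mean, float var, float bias, float eps)
{
    return slope * (x - mean) / std::sqrt(var + eps) + bias;
}
```

For slope = 2, mean = 1, var = 4, bias = 0.5, and eps = 0, this gives sqrt_var = 2, a = -0.5, and b = 1, so an input of 3 maps to 2.5 either way. Doing the square root and division once at load time leaves one multiply and one add per element afterwards.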

What is actually called here is the load interface of ModelBinFromStdio:

Mat ModelBinFromStdio::load(int w, int type) const

The type argument takes one of four values: auto, float32, float16, and int8

    // 0 = auto
    // 1 = float32
    // 2 = float16
    // 3 = int8

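The weights are then read according to these four types. There is not much to say about that part, except for the alignSize function used inside it, which is worth a note: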

static inline size_t alignSize(size_t sz, int n)
{
    return (sz + n-1) & -n;
}

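I did not understand this line at first, but fortunately there are experts in the chat group, and Xiaobai (小白) gave a fairly detailed explanation: alignSize rounds a requested size sz up, so the size actually used is y = (sz+n-1) & -n, where y >= sz and y is a multiple of n (this relies on n being a power of two). The expression (sz+n-1) & -n works as follows: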

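Suppose n is 16, so -n is 0xfffffff0 as a 32-bit int. Adding n-1 to sz serves two purposes: when sz is already a multiple of 16, the result is not overcounted, and when sz is not a multiple of 16, it is not undercounted. For example, with sz=3, sz+n-1 is 18, and the mask discards, in binary terms, the part of 18 that is below 16, leaving 16.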
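
A few concrete values make the rounding easier to see. The function below is the same expression as above, with one addition that is not in the original: an assert on the power-of-two precondition.

```cpp
#include <cassert>
#include <cstddef>

// Rounds sz up to the next multiple of n; n must be a power of two.
static inline std::size_t alignSize(std::size_t sz, int n)
{
    assert(n > 0 && (n & (n - 1)) == 0); // added check, not in the original
    return (sz + n - 1) & -n;
}

// Worked values for n = 16 (the mask -16 clears the low four bits):
//   sz = 3  : 3 + 15 = 18 = 0b10010  -> 0b10000  = 16
//   sz = 16 : 16 + 15 = 31 = 0b11111 -> 0b10000  = 16
//   sz = 17 : 17 + 15 = 32           -> 0b100000 = 32
```

With n = 4 the same function maps 5 to 8, so a block of 5 one-byte values would be padded to 8 bytes if it had to stay 4-byte aligned.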

That is all for this summary; more to follow.