【caffe】Loading a Caffe Model with OpenCV

This post shows how to perform image classification with OpenCV and a Caffe model, walking through the full pipeline from loading the pre-trained network to image preprocessing and prediction, and resolving a compatibility problem encountered along the way.


In the previous post we covered building the opencv_contrib modules on Windows and mentioned that the dnn module can read Caffe-trained models for object detection. Here we look concretely at how to use dnn to read a Caffe model and perform image classification.


The complete code is shown below (adapted mainly from references [2] and [3]):

#include <opencv2/dnn.hpp>  
#include <opencv2/imgproc.hpp>  
#include <opencv2/highgui.hpp>  

#include <fstream>  
#include <iostream>  
#include <cstdlib>  

/* Find the best class for the blob (i.e. the class with maximal probability) */
void getMaxClass(cv::dnn::Blob &probBlob, int *classId, double *classProb)
{
	cv::Mat probMat = probBlob.matRefConst().reshape(1, 1); //reshape the blob to 1x1000 matrix  
	cv::Point classNumber;

	cv::minMaxLoc(probMat, NULL, classProb, NULL, &classNumber);
	*classId = classNumber.x;
}

std::vector<cv::String> readClassNames(const char *filename = "synset_words.txt")
{
	std::vector<cv::String> classNames;

	std::ifstream fp(filename);
	if (!fp.is_open())
	{
		std::cerr << "File with classes labels not found: " << filename << std::endl;
		exit(-1);
	}

	std::string name;
	while (std::getline(fp, name))   //each line looks like: "n04266014 space shuttle"
	{
		if (name.length())
			classNames.push_back(name.substr(name.find(' ') + 1));   //keep only the human-readable label
	}

	fp.close();
	return classNames;
}

int main(int argc, char **argv)
{
	cv::dnn::initModule();   //initialize the dnn module (required by the old opencv_contrib dnn API)

	cv::String modelTxt = "bvlc_googlenet.prototxt";
	cv::String modelBin = "bvlc_googlenet.caffemodel";
	cv::String imageFile = "space_shuttle.jpg";

	cv::dnn::Net net = cv::dnn::readNetFromCaffe(modelTxt, modelBin);

	if (net.empty())
	{
		std::cerr << "Can't load network by using the following files: " << std::endl;
		std::cerr << "prototxt:   " << modelTxt << std::endl;
		std::cerr << "caffemodel: " << modelBin << std::endl;
		std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
		std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
		exit(-1);
	}

	//! [Prepare blob]  
	cv::Mat img = cv::imread(imageFile, cv::IMREAD_COLOR);
	if (img.empty())
	{
		std::cerr << "Can't read image from the file: " << imageFile << std::endl;
		exit(-1);
	}

	cv::resize(img, img, cv::Size(224, 224));        
	cv::dnn::Blob inputBlob = cv::dnn::Blob(img);   //Convert Mat to dnn::Blob image batch (old API; see the compatibility fix below)
	//! [Prepare blob]  

	//! [Set input blob]  
	net.setBlob(".data", inputBlob);        //set the network input  
	//! [Set input blob]  

	//! [Make forward pass]  
	net.forward();                          //compute output  
	//! [Make forward pass]  

	//! [Gather output]  
	cv::dnn::Blob prob = net.getBlob("prob");   //gather output of "prob" layer  

	int classId;
	double classProb;
	getMaxClass(prob, &classId, &classProb);//find the best class  
	//! [Gather output]  

	//! [Print results]  
	std::vector<cv::String> classNames = readClassNames();
	std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
	std::cout << "Probability: " << classProb * 100 << "%" << std::endl;

	//! [Print results]  

	return 0;
} //main


Code walkthrough:
1. First, download the GoogLeNet model and the classification files from the official sources (or copy-paste them): bvlc_googlenet.prototxt, bvlc_googlenet.caffemodel, and synset_words.txt. Alternatively, you can download the packaged resources I uploaded (which also include the image from step 2).

2. Download the image to classify (space_shuttle.jpg). The sample image shows the Buran space shuttle.

3. Read the .prototxt and .caffemodel files:

cv::dnn::Net net = cv::dnn::readNetFromCaffe(modelTxt, modelBin);

4. Check whether the network was loaded successfully:

	if (net.empty())
	{
		std::cerr << "Can't load network by using the following files: " << std::endl;
		std::cerr << "prototxt:   " << modelTxt << std::endl;
		std::cerr << "caffemodel: " << modelBin << std::endl;
		std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
		std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
		exit(-1);
	}

5. Read the image and convert it into a blob that GoogLeNet can consume:

	cv::Mat img = cv::imread(imageFile, cv::IMREAD_COLOR);
	if (img.empty())
	{
		std::cerr << "Can't read image from the file: " << imageFile << std::endl;
		exit(-1);
	}

	cv::resize(img, img, cv::Size(224, 224));        
	cv::dnn::Blob inputBlob = cv::dnn::Blob(img);   //Convert Mat to dnn::Blob image batch  


6. Pass the blob to the network:

	net.setBlob(".data", inputBlob);        //set the network input  

7. Run a forward pass to compute the output:

	net.forward();                          //compute output  

8. Classification: gather the output of the "prob" layer and find the class with the highest probability:

	cv::dnn::Blob prob = net.getBlob("prob");   //gather output of "prob" layer
	getMaxClass(prob, &classId, &classProb);    //find the best class

9. Print the classification result:

	std::vector<cv::String> classNames = readClassNames();
	std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
	std::cout << "Probability: " << classProb * 100 << "%" << std::endl;

Running the program at this point throws an error.

After a lot of searching, a solution turned up in reference [3]. The cause: the call used here to convert the image data into a blob belongs to an older version of the dnn API and is incompatible with newer opencv_contrib builds. The fix is to replace cv::dnn::Blob(img) with cv::dnn::Blob::fromImages(img).
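
Applied to the snippet from step 5, the corrected conversion looks like this (only the blob construction changes; the rest of the pipeline stays the same):

	cv::resize(img, img, cv::Size(224, 224));                     //GoogLeNet expects a 224x224 input
	cv::dnn::Blob inputBlob = cv::dnn::Blob::fromImages(img);     //new-style conversion: Mat -> dnn::Blob image batch
	net.setBlob(".data", inputBlob);                              //set the network input as before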


After making this change and running again, the program prints the classification result.
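
With the bundled space_shuttle.jpg and the GoogLeNet model, the console output should look roughly as follows (the class index and label below are the ones reported in the tutorial in reference [1]; the exact probability may differ slightly depending on the OpenCV version):

	Best class: #812 'space shuttle'
	Probability: 99.6378%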



References:

[1] http://docs.opencv.org/trunk/d5/de7/tutorial_dnn_googlenet.html
[2] http://blog.youkuaiyun.com/langb2014/article/details/50555910
[3] https://github.com/opencv/opencv_contrib/issues/749


-----------------------------------------

2017.07.24

