TensorFlow C++ API: batch inference on multiple images

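This post compares single-image and batch prediction performance in TensorFlow C++, finds that batch prediction is markedly faster on GPU, and contrasts how batch prediction behaves in image classification versus image segmentation.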

The complete code is available at: batch-inference.cpp (CSDN download, miscellaneous documents)

After the trial and error and the research in the previous few posts, here is the single-image prediction code:

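#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/protobuf/meta_graph.pb.h"
using namespace std;
using namespace tensorflow;

//one image prediction ---single image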
int mainsingle()
{
	  Session* session;
	  Status status = NewSession(SessionOptions(), &session);

	  const std::string graph_fn = "/media/root/Ubuntu311/projects/Ecology_projects/JPMVCNN_AlgaeAnalysisMathTestDemo/model-0723/model.meta";
	  MetaGraphDef graphdef;
	  Status status_load = ReadBinaryProto(Env::Default(), graph_fn, &graphdef); // read the graph definition from the .meta file
	  if (!status_load.ok()) {
	        std::cout << "ERROR: Loading model failed..." << graph_fn << std::endl;
	        std::cout << status_load.ToString() << "\n";
	        return -1;
	  }

	  Status status_create = session->Create(graphdef.graph_def()); // import the graph into the Session
	  if (!status_create.ok()) {
	        std::cout << "ERROR: Creating graph in session failed..." << status_create.ToString() << std::endl;
	        return -1;
	  }
	  cout << "Session successfully created.Load model successfully!"<< endl;

	  // Restore the pre-trained model weights from the checkpoint
	  const std::string checkpointPath = "/media/root/Ubuntu311/projects/Ecology_projects/JPMVCNN_AlgaeAnalysisMathTestDemo/model-0723/model";
	  Tensor checkpointPathTensor(DT_STRING, TensorShape());
	  checkpointPathTensor.scalar<std::string>()() = checkpointPath;
	  status = session->Run(
			  {{ graphdef.saver_def().filename_tensor_name(), checkpointPathTensor },},
			  {},{graphdef.saver_def().restore_op_name()},nullptr);
	  if (!status.ok())
	  {
		  throw runtime_error("Error loading checkpoint from " + checkpointPath + ": " + status.ToString());
	  }
	  cout << "Load weights successfully!"<< endl;


	  //read image for prediction...
	  char srcfile[200];
	  double alltime=0.0;
	  for(int numingroup=0;numingroup<1326;numingroup++)
	  {
		  sprintf(srcfile, "/media/root/Ubuntu311/projects/Ecology_projects/copy/cnn-imgs96224/%d.JPG",numingroup);
		  cv::Mat srcimg=cv::imread(srcfile,0);
		  if(!srcimg.data)
		  {
			  continue;
		  }

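		  Tensor resized_tensor(DT_FLOAT, TensorShape({1,96,224,1}));
		  float *imgdata = resized_tensor.flat<float>().data();
		  // cameraImg wraps the tensor's buffer, so converting into it fills the tensor directly.
		  // This only holds when srcimg is already 96x224; otherwise convertTo reallocates cameraImg.
		  cv::Mat cameraImg(96, 224, CV_32FC1, imgdata);
		  // Preprocess: convert to float and scale pixel values to [0,1]
		  srcimg.convertTo(cameraImg, CV_32FC1, 1.0/255);
		  std::cout <<"Read image successfully: "<< resized_tensor.DebugString()<<endl;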

		   vector<std::pair<string, Tensor> > inputs;
		   std::string Input1Name = "input";
		   inputs.push_back(std::make_pair(Input1Name, resized_tensor));
		   Tensor is_training_val(DT_BOOL,TensorShape());
		   is_training_val.scalar<bool>()()=false;
		   std::string Input2Name = "is_training";
		   inputs.push_back(std::make_pair(Input2Name, is_training_val));

		   vector<tensorflow::Tensor> outputs;
		   string output="output";

		   cv::TickMeter timer;
		   timer.start();
		   Status status_run = session->Run(inputs, {output}, {}, &outputs);
		   if (!status_run.ok()) {
			   std::cout << "ERROR: RUN failed..."  << std::endl;
			   std::cout << status_run.ToString() << "\n";
			   return -1;
		   }

		   timer.stop();
		   cout<<"single image inference time is: "<<timer.getTimeSec()<<" s."<<endl;
		   alltime+=(timer.getTimeSec());
	       timer.reset();

		  Tensor t = outputs[0];
		  int ndim2 = t.shape().dims();
		  auto tmap = t.tensor<float, 2>();  // Tensor Shape: [batch_size, target_class_num]
		  int output_dim = t.shape().dim_size(1);
		  std::vector<double> tout;

		  // Argmax: Get Final Prediction Label and Probability
		  int output_class_id = -1;
		  double output_prob = 0.0;
		  for (int j = 0; j < output_dim; j++)
		  {
				std::cout << "Class " << j << " prob:" << tmap(0, j) << "," << std::endl;
				if (tmap(0, j) >= output_prob) {
						output_class_id = j;
						output_prob = tmap(0, j);
				}
		  }
		  std::cout << "Final class id: " << output_class_id << std::endl;
		  std::cout << "Final class prob: " << output_prob << std::endl;
	  }

	  cout<<"all image have been predicted and time is: "<<alltime<<endl;

	  // Release the session created by NewSession
	  session->Close();
	  delete session;
	return 0;
}
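
The argmax step at the end of the loop above can be isolated into a small helper. This is a minimal sketch using only the standard library: `ArgMax` is a hypothetical name, and the `scores` vector stands in for one row of the `[batch_size, target_class_num]` output tensor. Unlike the loop above, it does not start from a threshold of 0.0, so it still returns a valid class when every score is negative, and on ties it returns the first maximum rather than the last.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns {class index, score} of the largest entry in one row of scores.
// `scores` is a stand-in for one row of the model's output tensor (assumption).
std::pair<std::size_t, float> ArgMax(const std::vector<float>& scores) {
    assert(!scores.empty());
    // max_element returns an iterator to the first largest element.
    auto best = std::max_element(scores.begin(), scores.end());
    return {static_cast<std::size_t>(best - scores.begin()), *best};
}
```

Calling `ArgMax` on a copy of row 0 of the output would give the same class id as the loop whenever the scores are positive and distinct.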

I measured the prediction time at roughly 0.02 seconds per image.

The following version splits the images into batches for prediction:

//batch inference...
int mainbatchinference()
{
	  Session* session;
	  Status status = NewSession(SessionOptions(), &session);

	  const std::string graph_fn = "/media/root/Ubuntu311/projects/Ecology_projects/JPMVCNN_AlgaeAnalysisMathTestDemo/model-0723/model.meta";
	  MetaGraphDef graphdef;
	  Status status_load = ReadBinaryProto(Env::Default(), graph_fn, &graphdef); // read the graph definition from the .meta file
	  if (!status_load.ok()) {
	        std::cout << "ERROR: Loading model failed..." << graph_fn << std::endl;
	        std::cout << status_load.ToString() << "\n";
	        return -1;
	  }

	  Status status_create = session->Create(graphdef.graph_def()); // import the graph into the Session
	  if (!status_create.ok()) {
	        std::cout << "ERROR: Creating graph in session failed..." << status_create.ToString() << std::endl;
	        return -1;
	  }
	  cout << "Session successfully created.Load model successfully!"<< endl;

	  // Restore the pre-trained model weights from the checkpoint
	  const std::string checkpointPath = "/media/root/Ubuntu311/projects/Ecology_projects/JPMVCNN_AlgaeAnalysisMathTestDemo/model-0723/model";
	  Tensor checkpointPathTensor(DT_STRING, TensorShape());
	  checkpointPathTensor.scalar<std::string>()() = checkpointPath;
	  status = session->Run(
			  {{ graphdef.saver_def().filename_tensor_name(), checkpointPathTensor },},
			  {},{graphdef.saver_def().restore_op_name()},nullptr);
	  if (!status.ok())
	  {
		  throw runtime_error("Error loading checkpoint from " + checkpointPath + ": " + status.ToString());
	  }
	  cout << "Load weights successfully!"<< endl;


	  int cnnrows=96;
	  int cnncols=224;
	  //read image for prediction...
	  char srcfile[200];
	  const int imgnum=1326;

	  const int batch=32;
	  double alltime=0.0;
	  //all image inference...
	  // Note: integer division drops the last imgnum % batch images (14 of 1326 here).
	  for(int imgind=0;imgind<imgnum/batch;imgind++)
	  {
		  //a batch inference...
		  tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT, tensorflow::TensorShape({ batch, cnnrows, cnncols, 1 }));
		  auto input_tensor_mapped = input_tensor.tensor<float, 4>();

		  for(int batchind=0;batchind<batch;batchind++)
		  {
			  // Derive the image index from the batch slot, so a missing file cannot shift or stall later images.
			  int imgrealind=imgind*batch+batchind;
			  sprintf(srcfile, "/media/root/Ubuntu311/projects/Ecology_projects/copy/cnn-imgs96224/%d.JPG",imgrealind);
			  cv::Mat srcimg=cv::imread(srcfile,0);
			  if(srcimg.empty() || srcimg.rows!=cnnrows || srcimg.cols!=cnncols)
			  {
				  // Unreadable or wrongly sized image: zero-fill its slot instead of leaving it uninitialized.
				  for (int y = 0; y < cnnrows; ++y)
					  for (int x = 0; x < cnncols; ++x)
						  input_tensor_mapped(batchind, y, x, 0) = 0.0f;
				  continue;
			  }
			  cv::Mat cameraImg;
			  // Convert to float and scale pixel values to [0,1]
			  srcimg.convertTo(cameraImg, CV_32FC1, 1.0/255);

			  //convert batch cv image to tensor
		      for (int y = 0; y < cnnrows; ++y)
		      {
				  const float* source_row = cameraImg.ptr<float>(y);
				  for (int x = 0; x < cnncols; ++x)
				  {
						input_tensor_mapped(batchind, y, x, 0) = source_row[x];
				  }
		      }
		  //a batch image transfer done...
		  }

		  vector<std::pair<string, Tensor> > inputs;
		  std::string Input1Name = "input";
		  inputs.push_back(std::make_pair(Input1Name, input_tensor));
		  Tensor is_training_val(DT_BOOL,TensorShape());
		  is_training_val.scalar<bool>()()=false;
		  std::string Input2Name = "is_training";
		  inputs.push_back(std::make_pair(Input2Name, is_training_val));

		  vector<tensorflow::Tensor> outputs;
		  string output="output";
		  cv::TickMeter timer;
	      timer.start();
	      Status status_run = session->Run(inputs, {output}, {}, &outputs);
	      if (!status_run.ok()) {
		   std::cout << "ERROR: RUN failed..."  << std::endl;
		   std::cout << status_run.ToString() << "\n";
		   return -1;
	      }

	      timer.stop();
	      cout<<"time of this batch inference is: "<<timer.getTimeSec()<<" s."<<endl;
	      alltime+=(timer.getTimeSec());
	      timer.reset();

	      auto finalOutputTensor  = outputs[0].tensor<float, 2>();
	      int output_dim = outputs[0].shape().dim_size(1);
	      for(int b=0; b<batch;b++)
	      {
			  for(int i=0; i<output_dim; i++)
			  {
				  cout << b << "the probability for class "<<i<<" is "<< finalOutputTensor(b, i) <<endl;
			  }
	      }
	  //all images inference done...
	  }

	  cout<<"all image have been predicted and time is: "<<alltime<<endl;

	  // Release the session created by NewSession
	  session->Close();
	  delete session;
	return 0;
}
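
The per-pixel copy into the 4-D tensor above relies on the tensor being stored row-major in `[batch, row, column, channel]` order. The following is a minimal standard-library sketch of that indexing; `NhwcOffset` and `PackImage` are hypothetical helper names, and plain `std::vector<float>` buffers stand in for the tensor and the OpenCV image.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Flat offset of element (b, y, x, c) in a row-major [N, H, W, C] buffer.
std::size_t NhwcOffset(std::size_t b, std::size_t y, std::size_t x, std::size_t c,
                       std::size_t H, std::size_t W, std::size_t C) {
    return ((b * H + y) * W + x) * C + c;
}

// Copies one H x W single-channel image into slot b of the batch buffer.
// `image` and `batch` are stand-ins for the cv::Mat and the input tensor (assumption).
void PackImage(const std::vector<float>& image, std::size_t b,
               std::size_t H, std::size_t W, std::vector<float>& batch) {
    assert(image.size() == H * W);
    const std::size_t begin = NhwcOffset(b, 0, 0, 0, H, W, 1);
    assert(begin + H * W <= batch.size());
    // With one channel, each image occupies a contiguous block of H * W floats.
    std::copy(image.begin(), image.end(), batch.begin() + begin);
}
```

Because each single-channel image is contiguous in the batch buffer, one block copy per image could in principle replace the two nested pixel loops, provided the source image is itself stored contiguously.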

Batch inference time: (timing screenshot not reproduced here)

I compared the two: batch inference and single-image inference give identical predictions, which confirms that the code is correct.

I tried batch inference in the first place because someone on Stack Overflow said it is faster than single-image inference, but in my measurements it is not faster at all!

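He reported 0.02 s for a single inference and only 0.03 s for an inference with batch=1560, a speedup of 1560 × 0.02 / 0.03, which is roughly 1000×! I see no such speedup here. (In my case the batch size cannot be chosen freely; it has to match the batch size used during training.)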
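
If the batch size really is fixed, the image count will usually not divide evenly: 1326 images with a batch of 32 leave 14 images over. One common way to handle this, shown here only as a sketch, is to run one extra batch whose unused slots are padding and to ignore the padded rows of the output. `BatchPlan` and `PlanBatches` are hypothetical names, not part of TensorFlow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct BatchPlan {
    std::size_t first_image;  // index of the first real image in this batch
    std::size_t valid_count;  // number of real images; remaining slots are padding
};

// Splits image_count images into fixed-size batches, keeping the partial tail batch.
std::vector<BatchPlan> PlanBatches(std::size_t image_count, std::size_t batch_size) {
    assert(batch_size > 0);
    std::vector<BatchPlan> plans;
    for (std::size_t first = 0; first < image_count; first += batch_size) {
        const std::size_t remaining = image_count - first;
        plans.push_back({first, remaining < batch_size ? remaining : batch_size});
    }
    return plans;
}
```

For each plan, only the first `valid_count` output rows would be read; the padded slots can hold zeros or a repeated image.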

I have asked about the prediction time on Stack Overflow ("performance - batch inference is as slow as single image inference in tensorflow c++") and at https://github.com/tensorflow/tensorflow/issues/31572, but there has been no useful answer so far.

I have also read every post Google turns up about TensorFlow C++ prediction time, and I still have not found a way to speed it up.

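The author of "Time Cost of TF in C++ — Fast Depth Coding Using Deep Learning 0.1.0 documentation" gets very good results. I am not sure whether his "Running all samples in one session" means what I am doing here. I also use only one session, yet I do not see it running fast. He also got a large speedup after adding optimizations, which I still do not see here.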

Many people online also report that TensorFlow C++ prediction is slow: https://github.com/tensorflow/tensorflow/issues/10669

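Update: the problem above is solved. Batch inference and single inference take about the same time on CPU, but on GPU batching gives a large speedup. After building TF 2.5 GPU C++ following https://blog.youkuaiyun.com/wd1603926823/article/details/102869208, I measured about 4 ms for a single inference (left in my screenshot) and about 47 ms for a batch of 24 (right). Running 24 images one at a time would take about 96 ms, so batch inference cuts the time by roughly 50%!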
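
The arithmetic behind that comparison can be written out as a small helper. This is only a sketch of the calculation, fed with the rounded timings quoted above; `BatchGain` and `CompareLatency` are hypothetical names.

```cpp
#include <cassert>

struct BatchGain {
    double per_image_ms;  // batch latency divided by batch size
    double speedup;       // single-image latency / per-image batch latency
    double time_saved;    // fraction of total time saved by batching
};

// Compares one-at-a-time inference with batched inference for the same images.
BatchGain CompareLatency(double single_ms, double batch_ms, int batch_size) {
    assert(single_ms > 0.0 && batch_ms > 0.0 && batch_size > 0);
    const double per_image_ms = batch_ms / batch_size;
    const double speedup = single_ms / per_image_ms;
    return {per_image_ms, speedup, 1.0 - 1.0 / speedup};
}
```

With `CompareLatency(4.0, 47.0, 24)` the per-image cost comes out just under 2 ms, the speedup just over 2×, and the time saved at about 51%, which matches the "roughly 50%" figure.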

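References. These all cover single inference only; material on TF C++ batch inference seems to be scarce:

A complete example of running TensorFlow C++ on Ubuntu - 熊叫大雄 - cnblogs (博客园)

Using a pre-trained TensorFlow model in C/C++: calling the C++ API directly - seniusen - cnblogs (博客园)

Tensorflow② Loading a model through the C++ API to run inference on a single image - sooner高's blog - CSDN

tesorflow c++ gpu calling maskrcnn - 穿云去's blog - CSDN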

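What I would like to understand is this: batch inference works fine for image classification and object detection. You pack a batch of images into the tensor and, as in my example above, get one output per image in the batch. Why does the same approach not work for image segmentation? What I have so far is segmentation with single inference only: "tensorflow C++ image segmentation/object detection: writing an image out of a tensor" (元气少女缘结神's blog, CSDN). Hardly anyone, in China or abroad, seems to run neural-network segmentation through TensorFlow C++ batch inference, so nobody seems to know why, when a batch of images goes in, the output contains the segmentation result for the first image only. Why?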

I have checked and compared, and none of these is the cause.

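This is not the cause either. If I find the answer later, I will post the update here: "tensorflow C++ image segmentation/object detection: writing an image out of a tensor" (元气少女缘结神's blog, CSDN)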

https://github.com/tensorflow/tensorflow/issues/19909
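
For reference, this is how per-image results would be laid out if a segmentation graph did return a full batch. It is a sketch under a stated assumption, not a diagnosis of the problem above: the output is assumed to be a row-major buffer of shape `[batch, height, width, classes]`, which may not match the actual model. `LabelMask` is a hypothetical helper that takes the per-pixel argmax for image `b`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-pixel argmax over C class scores for image b of a row-major [N, H, W, C] buffer.
// The [N, H, W, C] output shape is an assumption about the segmentation model.
std::vector<int> LabelMask(const std::vector<float>& output, std::size_t b,
                           std::size_t H, std::size_t W, std::size_t C) {
    assert(C > 0);
    assert((b + 1) * H * W * C <= output.size());
    // Image b starts b * H * W * C floats into the buffer.
    const float* image = output.data() + b * H * W * C;
    std::vector<int> mask(H * W);
    for (std::size_t p = 0; p < H * W; ++p) {
        const float* scores = image + p * C;
        std::size_t best = 0;
        for (std::size_t c = 1; c < C; ++c) {
            if (scores[c] > scores[best]) best = c;
        }
        mask[p] = static_cast<int>(best);
    }
    return mask;
}
```

Under that assumption, checking the first dimension of the output tensor's shape against the batch size would show whether the graph itself returns one result or one per image.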

As always, a photo of 小不点 to round off the post.
