一、CUDA and OpenCL
From the older versions of TBB to the new OneAPI framework, parallel computing has been supported as broadly as possible. This includes CUDA and OpenCL. Most developers have probably heard of these two frameworks, but few have actually used them. They see much wider use in AI applications, where image processing is mostly done on the GPU.
This article does not attempt a detailed explanation of the two; if you are interested, consult the relevant documentation yourself. In brief, both are frameworks for parallel computing on heterogeneous platforms: CUDA is vendor-specific (NVIDIA), while OpenCL exists as an open standard and can, in theory, target any platform.
二、Application in TBB
OneAPI, of course, supports both as well; it would be strange for a parallelism framework not to interoperate with other parallelism frameworks. It is all about performance: if stacking them can produce a 1+1>2 effect, so much the better. This is reflected in Supra. Let's look at the relevant code:
1、Using CUDA
#include "ImageProcessingCuda.h"

#include <thrust/transform.h>
#include <thrust/execution_policy.h>

using namespace std;

namespace supra
{
	namespace ImageProcessingCudaInternal
	{
		typedef ImageProcessingCuda::WorkType WorkType;

		// here the actual processing happens!
		template <typename InputType, typename OutputType>
		__global__ void processKernel(const InputType* inputImage, vec3s size, WorkType factor, OutputType* outputImage)
		{
			size_t x = blockDim.x*blockIdx.x + threadIdx.x;
			size_t y = blockDim.y*blockIdx.y + threadIdx.y;
			size_t z = blockDim.z*blockIdx.z + threadIdx.z;

			size_t width = size.x;
			size_t height = size.y;
			size_t depth = size.z;

			if (x < width && y < height && z < depth)
			{
				// Perform a pixel-wise operation on the image

				// Get the input pixel value and cast it to our working type.
				// As this should in general be a type with wider range / precision, this cast does not lose anything.
				WorkType inPixel = inputImage[x + y*width + z*width*height];

				// Perform operation, in this case multiplication
				WorkType value = inPixel * factor;

				// Store the output pixel value.
				// Because this is templated, we need to cast from "WorkType" to "OutputType".
				// This should happen in a sane way, that is with clamping. There is a helper for that!
				outputImage[x + y*width + z*width*height] = clampCast<OutputType>(value);
			}
		}
	}
template