GraphicsMagick OpenCL Development Notes (36)

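This post describes in detail how to use OpenCL to hardware-accelerate the ScaleImage function for image scaling, including the Y-axis handling and the overall workflow. The author also mentions running into problems when porting the code from Linux to Windows, involving cross-platform compatibility and incorrect output.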

<2022-05-05 Thu>

How to write a hardware-accelerated ScaleImage() function (10)

Did I really just pull this off so easily?

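"How to write a hardware-accelerated ScaleImage() function (9)" built on "How to write a hardware-accelerated ScaleImage() function (8)" and fixed the problem of images becoming brighter when enlarged. Both, however, only handle the X direction; neither implements the Y-direction scaling of the original ScaleImage() function.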
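To make the Y-direction bookkeeping concrete, here is a minimal standalone sketch of how many source rows get folded into one destination row when shrinking. It mirrors the `y_span` / `y_scale` loop that computes `stop` in the listing further down; the helper name `rows_per_output_row` is hypothetical and is not part of GraphicsMagick.

```c
/* Hypothetical helper, for illustration only: counts the source rows that
   contribute to one destination row, following the same y_span / y_scale
   logic as the `stop` computation in the listing. */
static unsigned int rows_per_output_row(unsigned int rows,
  unsigned int scaled_rows)
{
  float y_scale = (float) scaled_rows / rows;  /* < 1 when shrinking */
  float y_span = 1.0f;   /* portion of the destination row still unfilled */
  unsigned int count = 0;

  /* Same size or enlarging: a single source row is enough. */
  if (scaled_rows >= rows)
    return 1;

  /* Each source row covers y_scale of a destination row; consume rows
     until the remaining span fits into the next one. */
  while (y_scale < y_span) {
    count++;
    y_span -= y_scale;
  }
  return count + 1;  /* the row that completes the span */
}
```

For example, shrinking 8 rows to 2 gives 4 source rows per destination row, while enlarging always gives 1. With ratios that are not exactly representable in floating point, the count can be off by one, so treat the result as an estimate.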

The code that handles the Y direction first and then the X direction is now in place:

static MagickBooleanType scaleFilter(MagickCLDevice device,
  cl_command_queue queue, const Image* image, Image* filteredImage,
  cl_mem imageBuffer, cl_uint matte_or_cmyk, cl_uint columns, cl_uint rows,
  cl_mem scaledImageBuffer, cl_uint scaledColumns, cl_uint scaledRows,
  ExceptionInfo* exception)
{
  cl_kernel
    scaleKernel;

  cl_int
    status;

  const unsigned int
    workgroupSize = 256;

  float
    scale;

  int
    numCachedPixels;

  MagickBooleanType
    outputReady;

  size_t
    gammaAccumulatorLocalMemorySize,
    gsize[2],
    i,
    imageCacheLocalMemorySize,
    pixelAccumulatorLocalMemorySize,
    pixelAccumulatorLocalMemorySize2,
    lsize[2],
    totalLocalMemorySize,
    weightAccumulatorLocalMemorySize;

  unsigned int
    chunkSize,
    pixelPerWorkgroup;

  scaleKernel = NULL;
  outputReady = MagickFalse;

  scale = (float)scaledColumns / columns; // TODO(ocl)

  /* stop: number of source rows accumulated for one destination row */
  unsigned int stop = 0;
  unsigned int next_row = 1;
  float y_span = 1.0;
  float y_scale = (float)scaledRows / rows;
  if (scaledRows == rows)
    stop++;
  else {
    while (y_scale < y_span) {
      if (next_row) {
        stop++;
      }
      y_span -= y_scale;
      y_scale = (float)scaledRows / rows;
      next_row = 1;
    }

    if (next_row) {
      stop++;
      next_row = 0;
    }
  }

  if (scaledColumns < workgroupSize)
  {
    chunkSize = 32;
    pixelPerWorkgroup = 32;
  }
  else
  {
    chunkSize = workgroupSize;
    pixelPerWorkgroup = workgroupSize;
  }

  DisableMSCWarning(4127)
  while (1)
  RestoreMSCWarning
  {
    /* calculate the local memory size needed per workgroup */
    numCachedPixels=(int) ceil((pixelPerWorkgroup-1)/scale+2*(0.5+MagickEpsilon)); // TODO(ocl)
    imageCacheLocalMemorySize = numCachedPixels * sizeof(CLQuantum) * 4 * stop;
    totalLocalMemorySize = imageCacheLocalMemorySize;

    /* local size for the pixel accumulator */
    pixelAccumulatorLocalMemorySize = chunkSize * sizeof(cl_float4);
    totalLocalMemorySize += pixelAccumulatorLocalMemorySize;

    pixelAccumulatorLocalMemorySize2 = pixelAccumulatorLocalMemorySize;
    totalLocalMemorySize += pixelAccumulatorLocalMemorySize2;

    /* local memory size for the weight accumulator */
    weightAccumulatorLocalMemorySize = chunkSize * sizeof(float);
    totalLocalMemorySize += weightAccumulatorLocalMemorySize;

    /* local memory size for the gamma accumulator */
    gammaAccumulatorLocalMemorySize = chunkSize * sizeof(float);
    totalLocalMemorySize += gammaAccumulatorLocalMemorySize;

    if (totalLocalMemorySize <= device->local_memory_size)
      break;
    else
    {
      pixelPerWorkgroup = pixelPerWorkgroup / 2;
      chunkSize = chunkSize / 2;
      if ((pixelPerWorkgroup == 0) || (chunkSize == 0))
      {
        /* quit, fallback to CPU */
        goto cleanup;
      }
    }
  }

  scaleKernel = AcquireOpenCLKernel(device, "ScaleFilter");
  if (scaleKernel == (cl_kernel)NULL)
  {
    (void)OpenCLThrowMagickException(device, exception, GetMagickModule(),
      ResourceLimitWarning, "AcquireOpenCLKernel failed.", ".");
    goto cleanup;
  }

  i = 0;
  status = SetOpenCLKernelArg(scaleKernel, i++, sizeof(cl_mem), (void*)&imageBuffer);
  status |= SetOpenCLKernelArg(scaleKernel, i++, sizeof(cl_uint
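
The workgroup-sizing loop in the listing can be isolated as a small standalone sketch: start from a candidate pixels-per-workgroup value, add up the local memory each buffer needs, and halve until the total fits the device limit. The function name `fit_pixels_per_workgroup`, the 2-byte quantum, and the 16-byte float4 are assumptions made for illustration; the sketch also drops the `MagickEpsilon` term and treats `chunkSize` as equal to `pixelPerWorkgroup`, which is how the listing always sets them.

```c
#include <stddef.h>

/* Assumed sizes, for illustration only (not taken from GraphicsMagick). */
#define SKETCH_QUANTUM_BYTES 2u   /* assumes a 16-bit quantum */
#define SKETCH_FLOAT4_BYTES  16u  /* four 32-bit floats */

/* Round a non-negative value up to the next integer without libm. */
static size_t ceil_to_size(double x)
{
  size_t c = (size_t) x;
  return ((double) c < x) ? c + 1 : c;
}

/* Hypothetical helper: returns the largest pixels-per-workgroup value
   (starting at `start`, halving each round) whose local-memory footprint
   fits in `local_limit` bytes, or 0 when nothing fits and the caller
   should fall back to the CPU path. */
static unsigned int fit_pixels_per_workgroup(unsigned int start, float scale,
  unsigned int stop, size_t local_limit)
{
  unsigned int pixels = start;

  while (pixels != 0) {
    /* source pixels that must be cached to produce `pixels` outputs */
    size_t cached = ceil_to_size((pixels - 1) / scale + 2 * 0.5);
    /* image cache: 4 channels per pixel, `stop` source rows */
    size_t total = cached * SKETCH_QUANTUM_BYTES * 4 * stop;
    /* two pixel accumulators */
    total += 2 * (size_t) pixels * SKETCH_FLOAT4_BYTES;
    /* weight accumulator + gamma accumulator */
    total += 2 * (size_t) pixels * sizeof(float);

    if (total <= local_limit)
      return pixels;
    pixels /= 2;
  }
  return 0;
}
```

With these assumed sizes, a 1:1 scale and one cached row need 12288 bytes at 256 pixels per workgroup, so a device with 32 KiB of local memory keeps 256 while one with 8 KiB drops to 128.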