<2022-03-15 Tue>
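Exception when calling clCreateBuffer() (Part 5)
This continues the analysis from the previous post, "GraphicsMagick OpenCL Development Notes (Part 6)". The address returned by clCreateBuffer() is the return value of GetAuthenticOpenCLBuffer(), which is called in ComputeResizeImage(). When ComputeResizeImage() finishes, it calls clReleaseMemObject(), which decrements the reference count of that memory object; the memory created by clCreateBuffer() is only freed once the count reaches 0.
To print the reference count of the memory object, I added the clGetMemObjectInfo() function: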
// opencl-private.h
typedef CL_API_ENTRY cl_int
(CL_API_CALL *MAGICKpfn_clGetMemObjectInfo)(cl_mem memobj,
cl_mem_info param_name,size_t param_value_size,void *param_value,
size_t *param_value_size_ret)
CL_API_SUFFIX__VERSION_1_0;
MAGICKpfn_clGetMemObjectInfo clGetMemObjectInfo;
For example, ReleaseOpenCLMemObject() was changed to:
MagickPrivate void ReleaseOpenCLMemObject(cl_mem memobj)
{
cl_uint refcnt=0;
openCL_library->clGetMemObjectInfo(memobj, CL_MEM_REFERENCE_COUNT, sizeof(cl_uint), &refcnt, NULL);
LogMagickEvent(UserEvent, GetMagickModule(),
"b4 ReleaseOpenCLMemObject(%p) refcnt: %d", memobj, refcnt);
cl_int ret=openCL_library->clReleaseMemObject(memobj);
openCL_library->clGetMemObjectInfo(memobj, CL_MEM_REFERENCE_COUNT, sizeof(cl_uint), &refcnt, NULL);
LogMagickEvent(UserEvent, GetMagickModule(),
"af ReleaseOpenCLMemObject(%p) refcnt: %d", memobj, refcnt);
}
This produces the following output:
[ysouyno@arch gm-ocl]$ gm display ~/temp/bg1a.jpg
13:51:33 0:1.510679 1.480u 47963 opencl.c/AcquireMagickCLCacheInfo/576/User:
clCreateBuffer -- req: 1, pixles: 0x5588e97712e0, len: 15728640
13:51:33 0:1.522527 1.570u 47963 opencl.c/AcquireMagickCLCacheInfo/584/User:
clCreateBuffer return: 0x5588e774f5c0, refcnt: 1
13:51:33 0:1.522589 1.570u 47963 opencl.c/AcquireMagickCLCacheInfo/576/User:
clCreateBuffer -- req: 2, pixles: 0x5588e88fc130, len: 1536000
13:51:33 0:1.524313 1.590u 47963 opencl.c/AcquireMagickCLCacheInfo/584/User:
clCreateBuffer return: 0x5588e77064b0, refcnt: 1
13:51:33 0:1.528273 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/509/User:
b4 ReleaseOpenCLMemObject(0x5588e774f5c0) refcnt: 2
13:51:33 0:1.528351 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/513/User:
af ReleaseOpenCLMemObject(0x5588e774f5c0) refcnt: 1
13:51:33 0:1.528410 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/509/User:
b4 ReleaseOpenCLMemObject(0x5588e77064b0) refcnt: 2
13:51:33 0:1.528465 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/513/User:
af ReleaseOpenCLMemObject(0x5588e77064b0) refcnt: 1
13:51:33 0:1.528520 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/509/User:
b4 ReleaseOpenCLMemObject(0x5588e87fead0) refcnt: 1
13:51:33 0:1.528582 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/513/User:
af ReleaseOpenCLMemObject(0x5588e87fead0) refcnt: 1
13:51:34 0:1.740225 2.900u 47963 opencl.c/AcquireMagickCLCacheInfo/576/User:
clCreateBuffer -- req: 2, pixles: 0x5588eb571300, len: 15728640
13:51:34 0:1.742399 2.910u 47963 opencl.c/AcquireMagickCLCacheInfo/584/User:
clCreateBuffer return: 0x5588e87fead0, refcnt: 1
13:51:34 0:1.742615 2.910u 47963 opencl.c/AcquireMagickCLCacheInfo/576/User:
clCreateBuffer -- req: 3, pixles: 0x5588e9a841e0, len: 1536000
13:51:34 0:1.744057 2.930u 47963 opencl.c/AcquireMagickCLCacheInfo/584/User:
clCreateBuffer return: 0x5588e87feda0, refcnt: 1
13:51:34 0:1.745895 2.930u 47963 opencl.c/ReleaseOpenCLMemObject/509/User:
b4 ReleaseOpenCLMemObject(0x5588e87fead0) refcnt: 2
13:51:34 0:1.745964 2.930u 47963 opencl.c/ReleaseOpenCLMemObject/513/User:
af ReleaseOpenCLMemObject(0x5588e87fead0) refcnt: 1
13:51:34 0:1.746004 2.930u 47963 opencl.c/ReleaseOpenCLMemObject/509/User:
b4 ReleaseOpenCLMemObject(0x5588e87feda0) refcnt: 2
13:51:34 0:1.746081 2.930u 47963 opencl.c/ReleaseOpenCLMemObject/513/User:
af ReleaseOpenCLMemObject(0x5588e87feda0) refcnt: 1
13:51:34 0:1.746126 2.930u 47963 opencl.c/ReleaseOpenCLMemObject/509/User:
b4 ReleaseOpenCLMemObject(0x5588e87756a0) refcnt: 1
13:51:34 0:1.746185 2.930u 47963 opencl.c/ReleaseOpenCLMemObject/513/User:
af ReleaseOpenCLMemObject(0x5588e87756a0) refcnt: 1
13:51:34 0:1.946106 4.270u 47963 opencl.c/AcquireMagickCLCacheInfo/576/User:
clCreateBuffer -- req: 3, pixles: 0x5588ea9841f0, len: 15728640
Abort was called at 250 line in file:
/build/intel-compute-runtime/src/compute-runtime-22.09.22577/shared/source/memory_manager/host_ptr_manager.cpp
Aborted (core dumped)
Checking for address overlap:
(> (+ #x5588ea9841f0 15728640) #x5588eb571300)
(- #x5588eb571300 #x5588ea9841f0)
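To explain: the first expression evaluates to t, and the second gives 12505360, which is smaller than the buffer length 15728640, so the two host ranges overlap. The clCreateBuffer() call for 0x5588ea9841f0 on the last line crashes, and its range overlaps with that of 0x5588eb571300. The cl_mem created for 0x5588eb571300 is 0x5588e87fead0, and after the last call to ReleaseOpenCLMemObject() its reference count is still 1. This means 0x5588eb571300 had not been released yet when the request for 0x5588ea9841f0 came in, which is what causes the overlap.
I noticed one problem. The output above contains the following: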
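The same overlap check can be written in C. This is only a minimal sketch: the helper names ranges_overlap() and overlap_bytes() are hypothetical and are not part of GraphicsMagick or OpenCL. The addresses are treated as plain 64-bit integers, exactly as in the Lisp expressions above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not GraphicsMagick or OpenCL API. */

/* True when the half-open ranges [a, a+a_len) and [b, b+b_len) share a byte. */
static bool ranges_overlap(uint64_t a, uint64_t a_len,
                           uint64_t b, uint64_t b_len)
{
  return a < b + b_len && b < a + a_len;
}

/* Number of bytes the two ranges have in common; 0 when they are disjoint. */
static uint64_t overlap_bytes(uint64_t a, uint64_t a_len,
                              uint64_t b, uint64_t b_len)
{
  uint64_t lo = a > b ? a : b;             /* later start */
  uint64_t end_a = a + a_len;
  uint64_t end_b = b + b_len;
  uint64_t hi = end_a < end_b ? end_a : end_b; /* earlier end */
  return hi > lo ? hi - lo : 0;
}
```

Called with the two host pointers from the log (0x5588ea9841f0 and 0x5588eb571300, both 15728640 bytes long), ranges_overlap() returns true and overlap_bytes() returns 3223280, i.e. 15728640 minus the 12505360-byte distance between the two start addresses.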
13:51:33 0:1.528520 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/509/User:
b4 ReleaseOpenCLMemObject(0x5588e87fead0) refcnt: 1
13:51:33 0:1.528582 1.610u 47963 opencl.c/ReleaseOpenCLMemObject/513/User:
af ReleaseOpenCLMemObject(0x5588e87fead0) refcnt: 1
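The reference count did not decrease across the ReleaseOpenCLMemObject(0x5588e87fead0) call. Did clReleaseMemObject() fail?
It turns out it did not: once the object has been destroyed, a further call to clGetMemObjectInfo() returns error -38, i.e. CL_INVALID_MEM_OBJECT. The second query therefore presumably never updates refcnt, and the log simply prints the stale value 1 again.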
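The pitfall can be reproduced without an OpenCL runtime. The sketch below uses a mock object in place of cl_mem: mock_mem, mock_get_refcount(), mock_release() and refcount_after_release() are all hypothetical names, and the mock assumes that a failed query leaves the output variable untouched, which is consistent with the log above but is not something the real API is known to guarantee. The point is only that the status code of the query has to be checked before the count is trusted.

```c
#include <stddef.h>

/* Mock stand-ins for illustration only; this is NOT the real OpenCL API. */
#define MOCK_SUCCESS 0
#define MOCK_INVALID_MEM_OBJECT (-38) /* same value as CL_INVALID_MEM_OBJECT */

typedef struct {
  unsigned refcnt; /* current reference count */
  int alive;       /* 0 once the count has dropped to 0 */
} mock_mem;

/* Query the count; on failure *out is left untouched (assumption of the mock). */
static int mock_get_refcount(const mock_mem *m, unsigned *out)
{
  if (m == NULL || !m->alive)
    return MOCK_INVALID_MEM_OBJECT;
  *out = m->refcnt;
  return MOCK_SUCCESS;
}

/* Decrement the count and destroy the object when it reaches 0. */
static int mock_release(mock_mem *m)
{
  if (m == NULL || !m->alive)
    return MOCK_INVALID_MEM_OBJECT;
  if (--m->refcnt == 0)
    m->alive = 0;
  return MOCK_SUCCESS;
}

/* Release, then report the remaining count, or -1 when the release fails
   or the object can no longer be queried (for example, it was destroyed). */
static long refcount_after_release(mock_mem *m)
{
  unsigned refcnt = 0;
  if (mock_release(m) != MOCK_SUCCESS)
    return -1;
  if (mock_get_refcount(m, &refcnt) != MOCK_SUCCESS)
    return -1;
  return (long) refcnt;
}
```

With a mock object whose count starts at 2, the first refcount_after_release() call returns 1 and the second returns -1, instead of repeating a stale 1 the way the unchecked logging code does.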