Appendix H: OpenCL 3.0 Backwards Compatibility
OpenCL 3.0 breaks backwards compatibility with earlier versions of OpenCL by making some features that were previously required for FULL_PROFILE or EMBEDDED_PROFILE devices optional. This appendix describes the features that were previously required that are now optional, how to detect whether an optional feature is supported, and expected behavior when an optional feature is not supported.
Informally, in the tables below, the first row usually describes a feature detection mechanism ("May return this value indicating that the feature is not supported") and subsequent rows usually describe behavior when a feature is not supported ("Returns this value if the feature is not supported").
Shared Virtual Memory
Shared Virtual Memory (SVM) is optional for devices supporting OpenCL 3.0. When Shared Virtual Memory is not supported:
API | Behavior |
---|---|
 | May return 0, indicating that device does not support Shared Virtual Memory. |
 | Returns CL_FALSE if no devices in the context associated with memobj support Shared Virtual Memory. |
 | Returns NULL if no devices in context support Shared Virtual Memory. |
 | Is a NOP if no devices in context support Shared Virtual Memory. |
clEnqueueSVMFree, | Returns CL_INVALID_OPERATION if the device associated with command_queue does not support Shared Virtual Memory. |
 | Returns CL_INVALID_OPERATION if no devices in the context associated with kernel support Shared Virtual Memory. |
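A host program would typically gate every SVM call on the capability query described in the first row. The following is a minimal sketch in C, assuming the capability bitfield has already been read with clGetDeviceInfo and CL_DEVICE_SVM_CAPABILITIES (a query name taken from the OpenCL headers, not from the table above). The names alloc_strategy and choose_alloc are hypothetical, and uint64_t stands in for cl_bitfield so that the snippet compiles without an OpenCL SDK.

```c
#include <stdint.h>

/* Hypothetical result type: which allocation path the host should use. */
typedef enum { USE_CL_BUFFER, USE_SVM } alloc_strategy;

/* Interprets the value reported for CL_DEVICE_SVM_CAPABILITIES.
 * uint64_t is a stand-in for cl_bitfield (assumption for this sketch). */
static alloc_strategy choose_alloc(uint64_t svm_caps)
{
    /* A value of 0 indicates that the device does not support SVM,
     * so the host should fall back to ordinary buffer objects. */
    return svm_caps != 0 ? USE_SVM : USE_CL_BUFFER;
}
```

Deciding once, up front, avoids having to handle the NULL and CL_INVALID_OPERATION results listed in the later rows at every call site.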
Memory Consistency Model
Some aspects of the OpenCL memory consistency model are optional for devices supporting OpenCL 3.0. New device queries were added to clGetDeviceInfo to allow capabilities to be precisely reported. When the full memory consistency model is not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return CL_DEVICE_ATOMIC_ORDER_RELAXED \| …, indicating that device does not support the full memory consistency model for atomic memory operations. Note that a device that provides the same level of capabilities as an OpenCL 2.x device would be expected to return CL_DEVICE_ATOMIC_ORDER_RELAXED \| … |
clGetDeviceInfo, passing | May return CL_DEVICE_ATOMIC_ORDER_RELAXED \| …, indicating that device does not support the full memory consistency model for atomic fence operations. Note that a device that provides the same level of capabilities as an OpenCL 2.x device would be expected to return CL_DEVICE_ATOMIC_ORDER_RELAXED \| … |
OpenCL C compilers supporting atomic orders or scopes beyond the mandated minimum will define some or all of the following feature macros as appropriate:
- __opencl_c_atomic_order_acq_rel: indicates that atomic operations support acquire-release orderings.
- __opencl_c_atomic_order_seq_cst: indicates that atomic operations and fences support sequentially consistent orderings.
- __opencl_c_atomic_scope_device: indicates that atomic operations and fences support device-wide memory ordering constraints.
- __opencl_c_atomic_scope_all_devices: indicates that atomic operations and fences support all-device memory ordering constraints, across any host threads and all devices that can share SVM memory with each other and the host process.
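Kernel code can use these feature macros to pick the strongest ordering the compiler advertises. The sketch below uses only the preprocessor, so it is valid in OpenCL C and in host C alike. The ORDER_* constants, STRONGEST_ATOMIC_ORDER and HAVE_DEVICE_SCOPE are hypothetical names introduced for illustration.

```c
/* Hypothetical ranking of memory orders, weakest to strongest. */
#define ORDER_RELAXED 0
#define ORDER_ACQ_REL 1
#define ORDER_SEQ_CST 2

/* Select the strongest order that the compiler reports as supported. */
#if defined(__opencl_c_atomic_order_seq_cst)
#  define STRONGEST_ATOMIC_ORDER ORDER_SEQ_CST
#elif defined(__opencl_c_atomic_order_acq_rel)
#  define STRONGEST_ATOMIC_ORDER ORDER_ACQ_REL
#else
#  define STRONGEST_ATOMIC_ORDER ORDER_RELAXED
#endif

/* Record whether device-wide scope is available. */
#if defined(__opencl_c_atomic_scope_device)
#  define HAVE_DEVICE_SCOPE 1
#else
#  define HAVE_DEVICE_SCOPE 0
#endif
```

A compiler that defines none of the macros selects the relaxed branch and reports no device-wide scope, which corresponds to the mandated minimum.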
Device-Side Enqueue
Device-side enqueue and on-device queues are optional for devices supporting OpenCL 3.0. When device-side enqueue is not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return 0, indicating that device does not support device-side enqueue and on-device queues. |
clGetDeviceInfo, passing | Returns 0 if device does not support device-side enqueue and on-device queues. |
clGetDeviceInfo, passing | Returns 0 if device does not support device-side enqueue and on-device queues. |
clGetCommandQueueInfo, passing | Returns CL_INVALID_COMMAND_QUEUE since command_queue cannot be a valid device command-queue. |
 | Returns NULL if the device associated with command_queue does not support on-device queues. |
clGetEventProfilingInfo, passing | Returns a value equivalent to passing CL_PROFILING_COMMAND_END if the device associated with event does not support device-side enqueue. |
 | Returns CL_INVALID_OPERATION if device does not support on-device queues. |
When device-side enqueue is supported but a replaceable default on-device queue is not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May omit CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT, indicating that device does not support a replaceable default on-device queue. |
 | Returns CL_INVALID_OPERATION if device does not support a replaceable default on-device queue. |
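The two tables above describe three possible levels of support, which a host program can derive from a single capability bitfield. The sketch below is one way to do that, assuming the bitfield was read with clGetDeviceInfo and CL_DEVICE_DEVICE_ENQUEUE_CAPABILITIES (a query name from the OpenCL headers, not from the tables). The enum and function names are hypothetical. The fallback definition of CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT is a stand-in so the snippet builds without an OpenCL SDK; real code should include CL/cl.h instead.

```c
#include <stdint.h>

/* Stand-in for the header constant (assumption); prefer <CL/cl.h>. */
#ifndef CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT
#define CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT (1u << 1)
#endif

/* Hypothetical classification of device-side enqueue support. */
typedef enum {
    ENQUEUE_NONE,                /* capability value was 0 */
    ENQUEUE_FIXED_DEFAULT,       /* supported, default queue not replaceable */
    ENQUEUE_REPLACEABLE_DEFAULT  /* supported, default queue replaceable */
} enqueue_level;

static enqueue_level classify_device_enqueue(uint64_t caps)
{
    if (caps == 0)
        return ENQUEUE_NONE;
    if (caps & CL_DEVICE_QUEUE_REPLACEABLE_DEFAULT)
        return ENQUEUE_REPLACEABLE_DEFAULT;
    return ENQUEUE_FIXED_DEFAULT;
}
```

Only the last level permits replacing the default on-device queue; the other calls in the first table are usable from ENQUEUE_FIXED_DEFAULT upward.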
OpenCL C compilers supporting device-side enqueue and on-device queues will define the feature macro __opencl_c_device_enqueue. OpenCL C compilers that define the feature macro __opencl_c_device_enqueue must also define the feature macro __opencl_c_generic_address_space because some OpenCL C functions for device-side enqueue accept pointers to the generic address space. OpenCL C compilers that define the feature macro __opencl_c_device_enqueue must also define the feature macro __opencl_c_program_scope_global_variables because an implementation of blocks may interact with program scope variables in the global address space as part of the ABI.
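The dependency between these macros can be checked at compile time. The following is a defensive sketch, not something the specification requires of applications: a conforming compiler already guarantees the rule, so the check only guards against a misconfigured toolchain. HAVE_DEVICE_ENQUEUE is a hypothetical name.

```c
/* Device-side enqueue implies the generic address space and
 * program scope global variables; fail the build if that is violated. */
#if defined(__opencl_c_device_enqueue)
#  if !defined(__opencl_c_generic_address_space) || \
      !defined(__opencl_c_program_scope_global_variables)
#    error "__opencl_c_device_enqueue defined without its required feature macros"
#  endif
#  define HAVE_DEVICE_ENQUEUE 1
#else
#  define HAVE_DEVICE_ENQUEUE 0
#endif
```

Code that enqueues kernels from the device can then be wrapped in a test of HAVE_DEVICE_ENQUEUE, with a host-driven path as the alternative.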
Pipes
Pipe memory objects are optional for devices supporting OpenCL 3.0. When pipes are not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return CL_FALSE, indicating that device does not support pipes. |
clGetDeviceInfo, passing | Returns 0 if device does not support pipes. |
 | Returns CL_INVALID_OPERATION if no devices in context support pipes. |
 | Returns CL_INVALID_MEM_OBJECT since pipe cannot be a valid pipe object. |
OpenCL C compilers supporting pipes will define the feature macro __opencl_c_pipes. OpenCL C compilers that define the feature macro __opencl_c_pipes must also define the feature macro __opencl_c_generic_address_space because some OpenCL C functions for pipes accept pointers to the generic address space.
Program Scope Global Variables
Program scope global variables are optional for devices supporting OpenCL 3.0. When program scope global variables are not supported:
API | Behavior |
---|---|
 | May return 0, indicating that device does not support program scope global variables. |
clGetDeviceInfo, passing | Returns 0 if device does not support program scope global variables. |
clGetProgramBuildInfo, passing | Returns 0 if device does not support program scope global variables. |
OpenCL C compilers supporting program scope global variables will define the feature macro __opencl_c_program_scope_global_variables.
Non-Uniform Work-groups
Support for non-uniform work-groups is optional for devices supporting OpenCL 3.0. When non-uniform work-groups are not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return CL_FALSE, indicating that device does not support non-uniform work-groups. |
 | Behaves as though non-uniform work-groups were not enabled for kernel, if the device associated with command_queue does not support non-uniform work-groups. |
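When non-uniform work-groups are unavailable, the global work size in each dimension generally has to be a multiple of the local work size. A common workaround, which this appendix does not itself prescribe, is to pad the global size on the host and have the kernel return early for out-of-range work-items. The helper below sketches the host-side padding; round_up_global_size is a hypothetical name.

```c
#include <stddef.h>

/* Rounds global_size up to the next multiple of local_size so that every
 * work-group is full-sized. local_size must be nonzero. */
static size_t round_up_global_size(size_t global_size, size_t local_size)
{
    size_t remainder = global_size % local_size;
    return remainder == 0 ? global_size
                          : global_size + (local_size - remainder);
}
```

For example, 1000 work-items with a local size of 64 are padded to 1024; the kernel would then compare its global ID against the true element count, passed as an extra argument, before touching memory.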
Read-Write Images
Read-write images, which may be read from and written to in the same kernel, are optional for devices supporting OpenCL 3.0. When read-write images are not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return 0, indicating that device does not support read-write images. |
clGetSupportedImageFormats, passing | Returns an empty set (such as num_image_formats equal to 0), indicating that no image formats are supported for reading and writing in the same kernel, if no devices in context support read-write images. |
OpenCL C compilers supporting read-write images will define the feature macro __opencl_c_read_write_images.
Creating 2D Images From Buffers
Creating a 2D image from a buffer is optional for devices supporting OpenCL 3.0. When creating a 2D image from a buffer is not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return 0, indicating that device does not support creating 2D images from buffers. |
clGetDeviceInfo, passing | Will not describe support for the cl_khr_image2d_from_buffer extension if device does not support creating a 2D image from a buffer. |
clCreateImage or | Returns CL_INVALID_OPERATION if no devices in context support creating a 2D image from a buffer. |
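Several tables in this appendix rely on an extension name being absent from the device's extension list. The sketch below shows one way to test for a name, assuming the list was obtained as the space-separated string returned by clGetDeviceInfo with CL_DEVICE_EXTENSIONS (a query name from the OpenCL headers, not from the table). has_extension is a hypothetical helper; it matches whole tokens only, so that one extension name is not mistaken for a longer name that merely starts with it.

```c
#include <string.h>

/* Returns 1 if name appears as a complete, space-delimited token in
 * extensions, and 0 otherwise. */
static int has_extension(const char *extensions, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = extensions;

    if (name_len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space. */
        int starts_token = (p == extensions) || (p[-1] == ' ');
        /* The match must end at the terminator or right before a space. */
        char after = p[name_len];
        int ends_token = (after == '\0') || (after == ' ');

        if (starts_token && ends_token)
            return 1;
        p += name_len;  /* keep searching past this partial match */
    }
    return 0;
}
```

The same helper applies to the cl_khr_subgroups and cl_khr_3d_image_writes rows in later sections.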
sRGB Images
All of the sRGB image channel orders (such as CL_sRGBA) are optional for devices supporting OpenCL 3.0. When sRGB images are not supported:
API | Behavior |
---|---|
 | Will not return any image formats with image_channel_order equal to an sRGB image channel order if no devices in context support sRGB images. |
Depth Images
The CL_DEPTH image channel order is optional for devices supporting OpenCL 3.0. When depth images are not supported:
API | Behavior |
---|---|
 | Will not return any image formats with image_channel_order equal to CL_DEPTH if no devices in context support depth images. |
Device and Host Timer Synchronization
Synchronizing the device and host timers is optional for platforms supporting OpenCL 3.0. When device and host timer synchronization is not supported:
API | Behavior |
---|---|
clGetPlatformInfo, passing | May return 0, indicating that platform does not support device and host timer synchronization. |
 | Returns CL_INVALID_OPERATION if the platform associated with device does not support device and host timer synchronization. |
Intermediate Language Programs
Creating programs from an intermediate language (such as SPIR-V) is optional for devices supporting OpenCL 3.0. When intermediate language programs are not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return an empty string and empty array, indicating that device does not support intermediate language programs. |
clGetProgramInfo, passing | Returns an empty buffer (such as param_value_size_ret equal to 0) if no devices in the context associated with program support intermediate language programs. |
 | Returns CL_INVALID_OPERATION if no devices in context support intermediate language programs. |
 | Returns CL_INVALID_OPERATION if no devices associated with program support intermediate language programs. |
clGetKernelSubGroupInfo, passing | Returns 0 if device does not support intermediate language programs, since there is currently no way to require a number of sub-groups per work-group for programs created from source. |
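The string form of the detection query in the first row can be inspected before attempting to create a program from an intermediate language. The sketch below assumes the string was read with clGetDeviceInfo and CL_DEVICE_IL_VERSION, and that it is a space-separated list of entries of the form prefix, underscore, version (for example SPIR-V_1.2). Both the query name and the string format are assumptions drawn from the wider OpenCL specification rather than from the table. supports_il_prefix is a hypothetical helper.

```c
#include <string.h>

/* Returns 1 if any space-separated entry in il_version begins with
 * prefix followed by '_', and 0 otherwise. An empty string, which
 * indicates no intermediate language support, always yields 0. */
static int supports_il_prefix(const char *il_version, const char *prefix)
{
    size_t prefix_len = strlen(prefix);
    const char *p = il_version;

    while (*p != '\0') {
        while (*p == ' ')                    /* skip separators */
            p++;
        if (strncmp(p, prefix, prefix_len) == 0 && p[prefix_len] == '_')
            return 1;
        while (*p != '\0' && *p != ' ')      /* skip the rest of this entry */
            p++;
    }
    return 0;
}
```

A host program that finds no matching entry would build from OpenCL C source instead, avoiding the CL_INVALID_OPERATION results listed above.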
Sub-groups
Sub-groups are optional for devices supporting OpenCL 3.0. When sub-groups are not supported:
API | Behavior |
---|---|
 | May return 0, indicating that device does not support sub-groups. |
clGetDeviceInfo, passing | Returns CL_FALSE if device does not support sub-groups. |
clGetDeviceInfo, passing | Will not describe support for the cl_khr_subgroups extension if device does not support sub-groups. |
 | Returns CL_INVALID_OPERATION if device does not support sub-groups. |
OpenCL C compilers supporting sub-groups will define the feature macro __opencl_c_subgroups.
Program Initialization and Clean-Up Kernels
Program initialization and clean-up kernels are not supported in OpenCL 3.0, and the APIs and queries for program initialization and clean-up kernels are deprecated in OpenCL 3.0. When program initialization and clean-up kernels are not supported:
API | Behavior |
---|---|
clGetProgramInfo, passing | Returns CL_FALSE if no devices in the context associated with program support program initialization and clean-up kernels. |
 | Returns CL_INVALID_OPERATION if no devices in the context associated with program support program initialization and clean-up kernels. |
3D Image Writes
Kernel built-in functions for writing to 3D image objects are optional for devices supporting OpenCL 3.0. When writing to 3D image objects is not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | Will not describe support for the cl_khr_3d_image_writes extension if device does not support writing to 3D image objects. |
clGetSupportedImageFormats, passing | Returns an empty set (such as num_image_formats equal to 0), indicating that no image formats are supported for writing to 3D image objects, if no devices in context support writing to 3D image objects. |
OpenCL C compilers supporting writing to 3D image objects will define the feature macro __opencl_c_3d_image_writes.
Work-group Collective Functions
Work-group collective functions for broadcasts, scans, and reductions are optional for devices supporting OpenCL 3.0. When work-group collective functions are not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return CL_FALSE, indicating that device does not support work-group collective functions. |
OpenCL C compilers supporting work-group collective functions will define the feature macro __opencl_c_work_group_collective_functions.
Generic Address Space
Support for the generic address space is optional for devices supporting OpenCL 3.0. When the generic address space is not supported:
API | Behavior |
---|---|
clGetDeviceInfo, passing | May return CL_FALSE, indicating that device does not support the generic address space. |
OpenCL C compilers supporting the generic address space will define the feature macro __opencl_c_generic_address_space.
Language Features That Were Already Optional
Some OpenCL C language features were already optional before OpenCL 3.0; the API mechanisms for querying these have not changed.
New feature macros for these optional features have been added to OpenCL C to provide a consistent mechanism for using optional features in OpenCL C 3.0. OpenCL C compilers supporting images will define the feature macro __opencl_c_images. OpenCL C compilers supporting the double type will define the feature macro __opencl_c_fp64. OpenCL C compilers supporting the long, unsigned long and ulong types will define the feature macro __opencl_c_int64. Note that compilers for FULL_PROFILE devices must support these types and define the macro unconditionally.
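A typical use of these macros is to select a floating-point type once and write the rest of the kernel code against it. The sketch below is valid in OpenCL C and in host C alike; real_t, REAL_IS_DOUBLE and scale_value are hypothetical names introduced for illustration.

```c
/* Use double where the compiler reports support for it, float otherwise. */
#if defined(__opencl_c_fp64)
typedef double real_t;
#  define REAL_IS_DOUBLE 1
#else
typedef float real_t;
#  define REAL_IS_DOUBLE 0
#endif

/* Example routine written against the selected type. */
real_t scale_value(real_t x, real_t factor)
{
    return x * factor;
}
```

A compiler that does not define __opencl_c_fp64 takes the float branch, so the same source builds on devices with and without double-precision support.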