OpenCL™ C 6.15.7. Vector Data Load and Store Functions

The Built-in Vector Data Load and Store Functions table lists the supported functions for reading and writing vector types through a pointer to memory.

The generic type name gentype indicates that the function can take any of

  • char, uchar, short, ushort, int, uint, long [57] or ulong

  • float or double [39]

  • half [58]

as the type for the arguments.

All functions taking or returning half types are supported only when the cl_khr_fp16 extension is supported.

The generic type name gentypen indicates an n-element vector of gentype elements.

The generic type name halfn indicates an n-element vector of half elements.

The suffix n is also used in the function names (for example vloadn and vstoren), where n = 2, 3 [59], 4, 8 or 16.

Table 19. Built-in Vector Data Load and Store Functions

Function

Description

gentypen vloadn(size_t offset, const __global gentype *p)
gentypen vloadn(size_t offset, const __local gentype *p)
gentypen vloadn(size_t offset, const __constant gentype *p)
gentypen vloadn(size_t offset, const __private gentype *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

gentypen vloadn(size_t offset, const gentype *p)

Return sizeof(gentypen) bytes of data, where the first (n * sizeof(gentype)) bytes are read from the address computed as (p + (offset * n)). The computed address must be 8-bit aligned if gentype is char or uchar; 16-bit aligned if gentype is half, short, or ushort; 32-bit aligned if gentype is int, uint, or float; and 64-bit aligned if gentype is long or ulong.
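The address arithmetic above can be sketched in plain host-side C. This is only a model of the described behavior, not OpenCL C device code: the names model_float4 and model_vload4 are hypothetical stand-ins for float4 and vload4, and the sketch assumes offset counts whole n-element groups, as the description states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side stand-in for the OpenCL C float4 type. */
typedef struct { float s[4]; } model_float4;

/* Models vload4(offset, p): reads 4 floats starting at element p + offset * 4.
   p only needs the alignment of a single float (32 bits), not of a float4. */
static model_float4 model_vload4(size_t offset, const float *p)
{
    model_float4 v;
    assert(p != NULL);
    memcpy(v.s, p + offset * 4, 4 * sizeof(float));
    return v;
}
```

Because only element alignment is required, the model also accepts a base pointer such as buf + 1, which would not be valid for a direct float4 pointer dereference.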

void vstoren(gentypen data, size_t offset, __global gentype *p)
void vstoren(gentypen data, size_t offset, __local gentype *p)
void vstoren(gentypen data, size_t offset, __private gentype *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

void vstoren(gentypen data, size_t offset, gentype *p)

Write n * sizeof(gentype) bytes given by data to the address computed as (p + (offset * n)). The computed address must be 8-bit aligned if gentype is char or uchar; 16-bit aligned if gentype is half, short, or ushort; 32-bit aligned if gentype is int, uint, or float; and 64-bit aligned if gentype is long or ulong.
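One consequence of writing exactly n * sizeof(gentype) bytes at (p + (offset * n)) is that 3-element vectors are stored tightly packed. The following host-side C sketch models that for n = 3; model_int3 and model_vstore3 are hypothetical names, and the code is an illustration of the stated formula rather than device code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in holding the three components of an int3. */
typedef struct { int s[3]; } model_int3;

/* Models vstore3(data, offset, p): writes 3 ints starting at p + offset * 3,
   so consecutive offsets leave no padding element between vectors. */
static void model_vstore3(model_int3 data, size_t offset, int *p)
{
    assert(p != NULL);
    int *dst = p + offset * 3;
    for (int i = 0; i < 3; ++i) {
        dst[i] = data.s[i];
    }
}
```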

float vload_half(size_t offset, const __global half *p)
float vload_half(size_t offset, const __local half *p)
float vload_half(size_t offset, const __constant half *p)
float vload_half(size_t offset, const __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

float vload_half(size_t offset, const half *p)

Read sizeof(half) bytes of data from the address computed as (p + offset). The data read is interpreted as a half value, which is converted to a float value and returned. The computed read address must be 16-bit aligned.
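To make the half-to-float step concrete, here is a host-side C sketch that decodes a 16-bit half bit pattern into a float. It assumes half uses the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits) and that the host float is IEEE 754 binary32; model_half_to_float and model_vload_half are hypothetical names, and half storage is modeled as uint16_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a binary16 bit pattern into a float. The conversion is exact
   because every half value is representable as a float. */
static float model_half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {                      /* infinity or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {                   /* normal: rebias 15 -> 127 */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {                   /* signed zero */
        bits = sign;
    } else {                                 /* subnormal: normalize first */
        uint32_t e = 113u;
        while ((man & 0x400u) == 0) { man <<= 1; --e; }
        bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Models vload_half(offset, p): reads the half at p + offset. */
static float model_vload_half(size_t offset, const uint16_t *p)
{
    assert(p != NULL);
    return model_half_to_float(p[offset]);
}
```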

floatn vload_halfn(size_t offset, const __global half *p)
floatn vload_halfn(size_t offset, const __local half *p)
floatn vload_halfn(size_t offset, const __constant half *p)
floatn vload_halfn(size_t offset, const __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

floatn vload_halfn(size_t offset, const half *p)

Read (n * sizeof(half)) bytes of data from the address computed as (p + (offset * n)). The data read is interpreted as a halfn value, which is converted to a floatn value and returned. The computed read address must be 16-bit aligned.

void vstore_half(float data, size_t offset, __global half *p)
void vstore_half_rte(float data, size_t offset, __global half *p)
void vstore_half_rtz(float data, size_t offset, __global half *p)
void vstore_half_rtp(float data, size_t offset, __global half *p)
void vstore_half_rtn(float data, size_t offset, __global half *p)

void vstore_half(float data, size_t offset, __local half *p)
void vstore_half_rte(float data, size_t offset, __local half *p)
void vstore_half_rtz(float data, size_t offset, __local half *p)
void vstore_half_rtp(float data, size_t offset, __local half *p)
void vstore_half_rtn(float data, size_t offset, __local half *p)

void vstore_half(float data, size_t offset, __private half *p)
void vstore_half_rte(float data, size_t offset, __private half *p)
void vstore_half_rtz(float data, size_t offset, __private half *p)
void vstore_half_rtp(float data, size_t offset, __private half *p)
void vstore_half_rtn(float data, size_t offset, __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

void vstore_half(float data, size_t offset, half *p)
void vstore_half_rte(float data, size_t offset, half *p)
void vstore_half_rtz(float data, size_t offset, half *p)
void vstore_half_rtp(float data, size_t offset, half *p)
void vstore_half_rtn(float data, size_t offset, half *p)

The float value given by data is first converted to a half value using the appropriate rounding mode. The _rte, _rtz, _rtp and _rtn suffixes select round to nearest even, round toward zero, round toward positive infinity and round toward negative infinity, respectively. The half value is then written to the address computed as (p + offset). The computed address must be 16-bit aligned.

vstore_half uses the default rounding mode, which is round to nearest even.
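The effect of the rounding-mode suffixes can be illustrated with a host-side C sketch of the float-to-half conversion. It is deliberately limited: it assumes IEEE 754 binary32 and binary16 layouts and only accepts finite inputs whose unbiased exponent lies in [-14, 15], so zero, subnormals, infinities and NaN are not handled. The names model_round, model_float_to_half and model_vstore_half are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { MODEL_RTE, MODEL_RTZ, MODEL_RTP, MODEL_RTN } model_round;

/* Convert a float to a binary16 bit pattern with an explicit rounding mode.
   Sketch only: the input exponent must lie in the normal half range. */
static uint16_t model_float_to_half(float f, model_round mode)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    uint32_t sign = (bits >> 16) & 0x8000u;
    int exp = (int)((bits >> 23) & 0xFFu) - 127;   /* unbiased exponent */
    uint32_t man = bits & 0x7FFFFFu;

    assert(exp >= -14 && exp <= 15);

    uint32_t h = ((uint32_t)(exp + 15) << 10) | (man >> 13); /* truncated */
    uint32_t rem = man & 0x1FFFu;                  /* 13 discarded bits */
    int up = 0;
    switch (mode) {
    case MODEL_RTE: up = rem > 0x1000u || (rem == 0x1000u && (h & 1u)); break;
    case MODEL_RTZ: up = 0; break;
    case MODEL_RTP: up = rem != 0 && sign == 0; break;
    case MODEL_RTN: up = rem != 0 && sign != 0; break;
    }
    /* Adding 1 carries from the fraction into the exponent when needed. */
    return (uint16_t)(sign | (h + (uint32_t)up));
}

/* Models vstore_half and its _rte/_rtz/_rtp/_rtn variants. */
static void model_vstore_half(float data, size_t offset, uint16_t *p,
                              model_round mode)
{
    assert(p != NULL);
    p[offset] = model_float_to_half(data, mode);
}
```

For instance, 1 + 2^-11 lies exactly halfway between the half values 1 and 1 + 2^-10, so in this model round to nearest even produces the pattern 0x3C00 while round toward positive infinity produces 0x3C01.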

void vstore_halfn(floatn data, size_t offset, __global half *p)
void vstore_halfn_rte(floatn data, size_t offset, __global half *p)
void vstore_halfn_rtz(floatn data, size_t offset, __global half *p)
void vstore_halfn_rtp(floatn data, size_t offset, __global half *p)
void vstore_halfn_rtn(floatn data, size_t offset, __global half *p)

void vstore_halfn(floatn data, size_t offset, __local half *p)
void vstore_halfn_rte(floatn data, size_t offset, __local half *p)
void vstore_halfn_rtz(floatn data, size_t offset, __local half *p)
void vstore_halfn_rtp(floatn data, size_t offset, __local half *p)
void vstore_halfn_rtn(floatn data, size_t offset, __local half *p)

void vstore_halfn(floatn data, size_t offset, __private half *p)
void vstore_halfn_rte(floatn data, size_t offset, __private half *p)
void vstore_halfn_rtz(floatn data, size_t offset, __private half *p)
void vstore_halfn_rtp(floatn data, size_t offset, __private half *p)
void vstore_halfn_rtn(floatn data, size_t offset, __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

void vstore_halfn(floatn data, size_t offset, half *p)
void vstore_halfn_rte(floatn data, size_t offset, half *p)
void vstore_halfn_rtz(floatn data, size_t offset, half *p)
void vstore_halfn_rtp(floatn data, size_t offset, half *p)
void vstore_halfn_rtn(floatn data, size_t offset, half *p)

The floatn value given by data is converted to a halfn value using the appropriate rounding mode. n * sizeof(half) bytes from the halfn value are then written to the address computed as (p + (offset * n)). The computed address must be 16-bit aligned.

vstore_halfn uses the default rounding mode, which is round to nearest even.

void vstore_half(double data, size_t offset, __global half *p)
void vstore_half_rte(double data, size_t offset, __global half *p)
void vstore_half_rtz(double data, size_t offset, __global half *p)
void vstore_half_rtp(double data, size_t offset, __global half *p)
void vstore_half_rtn(double data, size_t offset, __global half *p)

void vstore_half(double data, size_t offset, __local half *p)
void vstore_half_rte(double data, size_t offset, __local half *p)
void vstore_half_rtz(double data, size_t offset, __local half *p)
void vstore_half_rtp(double data, size_t offset, __local half *p)
void vstore_half_rtn(double data, size_t offset, __local half *p)

void vstore_half(double data, size_t offset, __private half *p)
void vstore_half_rte(double data, size_t offset, __private half *p)
void vstore_half_rtz(double data, size_t offset, __private half *p)
void vstore_half_rtp(double data, size_t offset, __private half *p)
void vstore_half_rtn(double data, size_t offset, __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

void vstore_half(double data, size_t offset, half *p)
void vstore_half_rte(double data, size_t offset, half *p)
void vstore_half_rtz(double data, size_t offset, half *p)
void vstore_half_rtp(double data, size_t offset, half *p)
void vstore_half_rtn(double data, size_t offset, half *p)

The double value given by data is first converted to a half value using the appropriate rounding mode. The half value is then written to the address computed as (p + offset). The computed address must be 16-bit aligned.

vstore_half uses the default rounding mode, which is round to nearest even.

void vstore_halfn(doublen data, size_t offset, __global half *p)
void vstore_halfn_rte(doublen data, size_t offset, __global half *p)
void vstore_halfn_rtz(doublen data, size_t offset, __global half *p)
void vstore_halfn_rtp(doublen data, size_t offset, __global half *p)
void vstore_halfn_rtn(doublen data, size_t offset, __global half *p)

void vstore_halfn(doublen data, size_t offset, __local half *p)
void vstore_halfn_rte(doublen data, size_t offset, __local half *p)
void vstore_halfn_rtz(doublen data, size_t offset, __local half *p)
void vstore_halfn_rtp(doublen data, size_t offset, __local half *p)
void vstore_halfn_rtn(doublen data, size_t offset, __local half *p)

void vstore_halfn(doublen data, size_t offset, __private half *p)
void vstore_halfn_rte(doublen data, size_t offset, __private half *p)
void vstore_halfn_rtz(doublen data, size_t offset, __private half *p)
void vstore_halfn_rtp(doublen data, size_t offset, __private half *p)
void vstore_halfn_rtn(doublen data, size_t offset, __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

void vstore_halfn(doublen data, size_t offset, half *p)
void vstore_halfn_rte(doublen data, size_t offset, half *p)
void vstore_halfn_rtz(doublen data, size_t offset, half *p)
void vstore_halfn_rtp(doublen data, size_t offset, half *p)
void vstore_halfn_rtn(doublen data, size_t offset, half *p)

The doublen value given by data is converted to a halfn value using the appropriate rounding mode. n * sizeof(half) bytes from the halfn value are then written to the address computed as (p + (offset * n)). The computed address must be 16-bit aligned.

vstore_halfn uses the default rounding mode, which is round to nearest even.

floatn vloada_halfn(size_t offset, const __global half *p)
floatn vloada_halfn(size_t offset, const __local half *p)
floatn vloada_halfn(size_t offset, const __constant half *p)
floatn vloada_halfn(size_t offset, const __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

floatn vloada_halfn(size_t offset, const half *p)

For n = 2, 4, 8 and 16, read sizeof(halfn) bytes of data from the address computed as (p + (offset * n)). The data read is interpreted as a halfn value, which is converted to a floatn value and returned. The computed address must be aligned to sizeof(halfn) bytes.

For n = 3, vloada_half3 reads a half3 from the address computed as (p + (offset * 4)) and returns a float3. The computed address must be aligned to sizeof(half) * 4 bytes.
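The difference between the packed and the aligned variants is easiest to see in the element index each one starts at. The host-side C sketch below computes that index and the required alignment; the function names are hypothetical, and sizeof(half) is assumed to be 2 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Valid vector widths named in the text. */
static int model_valid_n(unsigned n)
{
    return n == 2 || n == 3 || n == 4 || n == 8 || n == 16;
}

/* First half element read by vload_halfn(offset, p): always offset * n. */
static size_t model_vload_half_index(size_t offset, unsigned n)
{
    assert(model_valid_n(n));
    return offset * n;
}

/* First half element read by vloada_halfn(offset, p): offset * n,
   except that n = 3 uses a stride of 4 elements. */
static size_t model_vloada_half_index(size_t offset, unsigned n)
{
    assert(model_valid_n(n));
    return offset * (n == 3 ? 4u : n);
}

/* Required address alignment in bytes for vloada_halfn. */
static size_t model_vloada_half_alignment(unsigned n)
{
    assert(model_valid_n(n));
    return 2u * (n == 3 ? 4u : n);   /* 2 == assumed sizeof(half) */
}
```

With offset 2, the packed half3 load in this model starts at element 6, while the aligned half3 load starts at element 8 and expects an 8-byte aligned address.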

void vstorea_halfn(floatn data, size_t offset, __global half *p)
void vstorea_halfn_rte(floatn data, size_t offset, __global half *p)
void vstorea_halfn_rtz(floatn data, size_t offset, __global half *p)
void vstorea_halfn_rtp(floatn data, size_t offset, __global half *p)
void vstorea_halfn_rtn(floatn data, size_t offset, __global half *p)

void vstorea_halfn(floatn data, size_t offset, __local half *p)
void vstorea_halfn_rte(floatn data, size_t offset, __local half *p)
void vstorea_halfn_rtz(floatn data, size_t offset, __local half *p)
void vstorea_halfn_rtp(floatn data, size_t offset, __local half *p)
void vstorea_halfn_rtn(floatn data, size_t offset, __local half *p)

void vstorea_halfn(floatn data, size_t offset, __private half *p)
void vstorea_halfn_rte(floatn data, size_t offset, __private half *p)
void vstorea_halfn_rtz(floatn data, size_t offset, __private half *p)
void vstorea_halfn_rtp(floatn data, size_t offset, __private half *p)
void vstorea_halfn_rtn(floatn data, size_t offset, __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

void vstorea_halfn(floatn data, size_t offset, half *p)
void vstorea_halfn_rte(floatn data, size_t offset, half *p)
void vstorea_halfn_rtz(floatn data, size_t offset, half *p)
void vstorea_halfn_rtp(floatn data, size_t offset, half *p)
void vstorea_halfn_rtn(floatn data, size_t offset, half *p)

The floatn value given by data is converted to a halfn value using the appropriate rounding mode.

For n = 2, 4, 8 and 16, the halfn value is written to the address computed as (p + (offset * n)). The computed address must be aligned to sizeof(halfn) bytes.

For n = 3, the half3 value is written to the address computed as (p + (offset * 4)). The computed address must be aligned to sizeof(half) * 4 bytes.

vstorea_halfn uses the default rounding mode, which is round to nearest even.

void vstorea_halfn(doublen data, size_t offset, __global half *p)
void vstorea_halfn_rte(doublen data, size_t offset, __global half *p)
void vstorea_halfn_rtz(doublen data, size_t offset, __global half *p)
void vstorea_halfn_rtp(doublen data, size_t offset, __global half *p)
void vstorea_halfn_rtn(doublen data, size_t offset, __global half *p)

void vstorea_halfn(doublen data, size_t offset, __local half *p)
void vstorea_halfn_rte(doublen data, size_t offset, __local half *p)
void vstorea_halfn_rtz(doublen data, size_t offset, __local half *p)
void vstorea_halfn_rtp(doublen data, size_t offset, __local half *p)
void vstorea_halfn_rtn(doublen data, size_t offset, __local half *p)

void vstorea_halfn(doublen data, size_t offset, __private half *p)
void vstorea_halfn_rte(doublen data, size_t offset, __private half *p)
void vstorea_halfn_rtz(doublen data, size_t offset, __private half *p)
void vstorea_halfn_rtp(doublen data, size_t offset, __private half *p)
void vstorea_halfn_rtn(doublen data, size_t offset, __private half *p)

For OpenCL C 2.0, or OpenCL C 3.0 or newer with the __opencl_c_generic_address_space feature:

void vstorea_halfn(doublen data, size_t offset, half *p)
void vstorea_halfn_rte(doublen data, size_t offset, half *p)
void vstorea_halfn_rtz(doublen data, size_t offset, half *p)
void vstorea_halfn_rtp(doublen data, size_t offset, half *p)
void vstorea_halfn_rtn(doublen data, size_t offset, half *p)

The doublen value given by data is converted to a halfn value using the appropriate rounding mode.

For n = 2, 4, 8 and 16, the halfn value is written to the address computed as (p + (offset * n)). The computed address must be aligned to sizeof(halfn) bytes.

For n = 3, the half3 value is written to the address computed as (p + (offset * 4)). The computed address must be aligned to sizeof(half) * 4 bytes.

vstorea_halfn uses the default rounding mode, which is round to nearest even.

The results of vector data load and store functions are undefined if the address being read from or written to is not correctly aligned as described in Built-in Vector Data Load and Store Functions. For the store functions described in that table, the pointer argument p can be a pointer to global, local, or private memory. For the load functions, p can be a pointer to global, local, constant, or private memory.

Variants of the vector data load and store functions that take pointer arguments pointing to the generic address space are also supported.
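Since misaligned addresses give undefined results, it can help to read the alignment requirements in bytes: 8-bit, 16-bit, 32-bit and 64-bit alignment correspond to 1, 2, 4 and 8 bytes. The host-side C sketch below checks an address against such a requirement. The helper name is hypothetical, and the check relies on converting a pointer to uintptr_t, which behaves as expected on common flat-address platforms but is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when addr is a multiple of `bytes` (for example 2 for
   the 16-bit alignment required by vload_half and vstore_half). */
static int model_is_aligned(const void *addr, size_t bytes)
{
    assert(bytes != 0);
    return ((uintptr_t)addr % bytes) == 0;
}
```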
