Execution space
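__global__
1.
1) Executed on the device,
2) Callable from the host,
3) Callable from the device for devices of compute capability 3.2 or higher.
2. Must have a void return type.
3. Asynchronous: the call returns control to the host immediately, before the kernel has finished executing.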
__device__
1.
1) Executed on the device,
2) Callable from the device only.
2. __global__ and __device__ cannot be used together.
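The rules above for __global__ and __device__ can be sketched with a small program. This is a minimal illustration, not a reference implementation: the names square and squareAll are hypothetical, error checking is omitted for brevity, and a CUDA-capable device is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// __device__: executed on the device, callable from device code only.
__device__ float square(float x) { return x * x; }

// __global__: a kernel. Executed on the device, launched from the host,
// and required to return void.
__global__ void squareAll(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = square(data[i]);
}

int main() {
    constexpr int n = 8;
    float host[n];
    for (int i = 0; i < n; ++i) host[i] = static_cast<float>(i);

    float* dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // The launch is asynchronous: control returns to the host right away.
    squareAll<<<1, n>>>(dev, n);

    // Wait for the kernel to finish before using its results.
    cudaDeviceSynchronize();

    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    for (int i = 0; i < n; ++i) printf("%.1f ", host[i]);
    printf("\n");
    return 0;
}
```

Calling square directly from main would be rejected by the compiler, because a __device__ function is not callable from host code.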
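__host__
1.
1) Executed on the host,
2) Callable from the host only.
2. Optional: a function declared without any execution space specifier is compiled for the host only.
3. __global__ and __host__ cannot be used together.
4. __device__ and __host__ can be used together, in which case the function is compiled for both the host and the device.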
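As a rough sketch of the last rule, the function below is marked with both __host__ and __device__, so the same source is compiled once for the CPU and once for the GPU. The names clamp01 and clampKernel are hypothetical, error checking is omitted, and a CUDA-capable device is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Compiled for both host and device, so it can be called from either side.
__host__ __device__ inline float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

// Device-side caller of the shared function.
__global__ void clampKernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = clamp01(data[i]);
}

int main() {
    // Host-side call: uses the host compilation of clamp01.
    printf("host:   %.2f\n", clamp01(1.5f));

    // Device-side call: the kernel uses the device compilation.
    float value = -0.25f;
    float* dev = nullptr;
    cudaMalloc(&dev, sizeof(float));
    cudaMemcpy(dev, &value, sizeof(float), cudaMemcpyHostToDevice);
    clampKernel<<<1, 1>>>(dev, 1);
    cudaMemcpy(&value, dev, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    printf("device: %.2f\n", value);
    return 0;
}
```

Combining the two specifiers is a common way to share small helper functions without maintaining separate host and device copies.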
Execution configuration
kernel_name<<< Dg, Db, Ns, S >>>([kernel arguments]);
A kernel launch is configured through the <<< Dg, Db, Ns, S >>> expression placed between the kernel name and its argument list.
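Parameter | Description |
---|---|
Dg | Grid dimensions (dim3); number of blocks launched = Dg.x * Dg.y * Dg.z |
Db | Block dimensions (dim3); number of threads per block = Db.x * Db.y * Db.z |
Ns | Number of bytes of dynamically allocated shared memory per block (size_t); optional, defaults to 0 |
S | Stream the launch is issued to (cudaStream_t); optional, defaults to 0 |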
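The four parameters in the table can be seen together in the sketch below, which reverses the elements inside each block using dynamically sized shared memory and a non-default stream. The kernel name reverseInBlock and the sizes are hypothetical choices for illustration; error checking is omitted and a CUDA-capable device is assumed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Reverses the elements handled by each block.
// The shared array has no compile-time size: it is sized by Ns at launch.
__global__ void reverseInBlock(int* data) {
    extern __shared__ int tile[];
    int t = threadIdx.x;
    int base = blockIdx.x * blockDim.x;
    tile[t] = data[base + t];
    __syncthreads();  // all loads must finish before any thread reads tile
    data[base + t] = tile[blockDim.x - 1 - t];
}

int main() {
    constexpr int threads = 4;
    constexpr int blocks = 2;
    constexpr int n = threads * blocks;
    const size_t bytes = n * sizeof(int);

    int host[n];
    for (int i = 0; i < n; ++i) host[i] = i;

    int* dev = nullptr;
    cudaMalloc(&dev, bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    dim3 Dg(blocks);                   // grid: 2 x 1 x 1 blocks
    dim3 Db(threads);                  // block: 4 x 1 x 1 threads
    size_t Ns = threads * sizeof(int); // dynamic shared memory per block

    // Copy in, launch, and copy out are all ordered on the same stream.
    cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, stream);
    reverseInBlock<<<Dg, Db, Ns, stream>>>(dev);
    cudaMemcpyAsync(host, dev, bytes, cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    for (int i = 0; i < n; ++i) printf("%d ", host[i]);
    printf("\n");

    cudaStreamDestroy(stream);
    cudaFree(dev);
    return 0;
}
```

When Ns and S are left out, as in a two-argument launch, no dynamic shared memory is allocated and the launch goes to the default stream.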
Example:
Configuring a kernel launch
__global__ void Func(float* parameter);
Func<<< Dg, Db, Ns >>>(parameter);
References:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#function-declaration-specifiers
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#device-side-kernel-launch