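Sometimes we need to pass complex numbers to a PyCUDA kernel. On the C side, all it takes is one extra header, cuComplex.h, so it is really simple!
This is much simpler than the approach I used before, and you can also call the functions in cuComplex.h, for example to take the real or imaginary part.
Here is a small code example!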
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy as np
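mod = SourceModule("""
#include <cuComplex.h>
// Copy a row x row matrix of single-precision complex numbers from a to b.
__global__ void AHE(cuFloatComplex *a, cuFloatComplex *b, int row)
{
    int i = threadIdx.y + blockDim.y * blockIdx.y;
    int j = threadIdx.x + blockDim.x * blockIdx.x;
    // Skip threads that fall outside the matrix.
    if (i >= row || j >= row) return;
    const int idx = i + j * row;
    b[idx] = a[idx];
}
""")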
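AHE = mod.get_function("AHE")
# cuFloatComplex is single precision, so the host array must be complex64
# (complex128 would correspond to cuDoubleComplex instead).
img = np.random.randn(4, 4).astype(np.complex64)
print(img)
row = np.int32(img.shape[-1])
# Output buffer with the same shape and dtype, initialised to zero.
out = np.zeros_like(img)
# The kernel takes three arguments (a, b, row).
# One thread per element: a 4x4 block covers the whole 4x4 matrix.
AHE(cuda.In(img), cuda.InOut(out), row, block=(4, 4, 1), grid=(1, 1))
print(out)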
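A side note on the dtype before looking at the output: cuFloatComplex is a pair of 32-bit floats (real part first, then imaginary part), which is the same memory layout as NumPy's complex64. The small sketch below only uses NumPy, no GPU needed, to make that layout visible; the variable names `z` and `pairs` are just for illustration.

```python
import numpy as np

# Two single-precision complex numbers, as they would be handed to the kernel.
z = np.array([1 + 2j, 3 - 4j], dtype=np.complex64)

# Reinterpret the same bytes as float32: each complex value becomes (real, imag).
pairs = z.view(np.float32).reshape(-1, 2)
print(pairs)

# complex64 takes 8 bytes per element, complex128 takes 16.
print(z.itemsize, np.dtype(np.complex128).itemsize)
```

So a complex128 array handed to a kernel that expects cuFloatComplex would be read with the wrong element size; for double precision the matching type on the C side is cuDoubleComplex.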
Here is the result of running the AHE example:
The output is exactly right.
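As for calling the functions in cuComplex.h, here is a rough sketch of how taking the real and imaginary parts might look, using `cuCrealf` and `cuCimagf` from that header. The kernel name `split` and the helpers `split_reference` and `split_on_gpu` are made up for this illustration and are not part of the original example; the GPU part assumes a working PyCUDA install and only runs when `split_on_gpu` is called.

```python
import numpy as np

# Hypothetical kernel: split a complex array into real and imaginary parts.
KERNEL_SRC = r"""
#include <cuComplex.h>
__global__ void split(const cuFloatComplex *z, float *re, float *im, int n)
{
    int k = threadIdx.x + blockDim.x * blockIdx.x;
    if (k < n) {
        re[k] = cuCrealf(z[k]);   // real part
        im[k] = cuCimagf(z[k]);   // imaginary part
    }
}
"""

def split_reference(z):
    """CPU reference: what the kernel is expected to produce."""
    z = np.asarray(z, dtype=np.complex64).ravel()
    return z.real.copy(), z.imag.copy()

def split_on_gpu(z):
    """Run the kernel with PyCUDA (imports are local so the file loads without a GPU)."""
    import pycuda.autoinit  # creates the CUDA context
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    z = np.ascontiguousarray(z, dtype=np.complex64).ravel()
    re = np.empty(z.size, dtype=np.float32)
    im = np.empty(z.size, dtype=np.float32)
    split = SourceModule(KERNEL_SRC).get_function("split")
    threads = 128
    blocks = (z.size + threads - 1) // threads
    split(cuda.In(z), cuda.Out(re), cuda.Out(im), np.int32(z.size),
          block=(threads, 1, 1), grid=(blocks, 1))
    return re, im

# CPU-side check of the reference on two sample values.
ref_re, ref_im = split_reference([1 + 2j, 3 - 4j])
print(ref_re, ref_im)
```

On a machine with a GPU, comparing `split_on_gpu(img)` against `split_reference(img)` with `np.allclose` would be one way to confirm the result instead of checking it by eye.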