PyCUDA: pagelocked memory
In GPU programming, data has to be transferred from the CPU (host) to the GPU (device), and this transfer can take a noticeable amount of time.
The normal way of transferring data, using ordinary pageable host memory:
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy as np
a = np.random.randn(256, 256)
a = a.astype(np.float32)  # call astype; float32 is the usual dtype for GPU work
a_gpu = cuda.mem_alloc(a.nbytes)
cuda.memcpy_htod(a_gpu, a)
# following code ...
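
For comparison, here is a minimal sketch of the same transfer using page-locked (pinned) host memory, which usually copies faster because the driver can move the data without first staging it in a pinned buffer. It assumes a CUDA-capable GPU and a working PyCUDA install. The function name upload_pagelocked and the round-trip check are illustrative only and are not part of the original snippet.

```python
import numpy as np


def upload_pagelocked(shape=(256, 256)):
    """Fill a page-locked host buffer, copy it to the GPU, and copy it back."""
    # Imported here so this module still loads on a machine without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context on import)
    import pycuda.driver as cuda

    # Allocate pinned host memory instead of an ordinary NumPy array.
    a_pinned = cuda.pagelocked_empty(shape, dtype=np.float32)
    a_pinned[:] = np.random.randn(*shape)  # values are cast to float32 on assignment

    # Same device allocation and host-to-device copy as in the pageable case.
    a_gpu = cuda.mem_alloc(a_pinned.nbytes)
    cuda.memcpy_htod(a_gpu, a_pinned)

    # Copy back into a regular array so the round trip can be checked.
    result = np.empty_like(a_pinned)
    cuda.memcpy_dtoh(result, a_gpu)
    return a_pinned, result


if __name__ == "__main__":
    a_pinned, result = upload_pagelocked()
    print(np.array_equal(a_pinned, result))
```

Whether pinning pays off depends on the transfer size and on how much host memory can be locked. Timing both variants on the target machine is the safest way to decide.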