CUDA: Example of Using the Driver API from PTX Code

ptxjit_kernel.cu
ptxjit.cpp

ptxjit_kernel.cu

extern "C" __global__ void myKernel(int *data)
{
    int tid = blockIdx.x * blockDim.x + threadIdx