Graduation Project - CUDA - The cuFFT Library

Computing a batch of BATCH one-dimensional DFTs, each of size NX, with cuFFT typically looks like this:

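#include <cuda_runtime.h>
#include <cufft.h>

#define NX 256
#define BATCH 10
#define RANK 1
...
{
    cufftHandle plan;
    cufftComplex *data;
    int n[RANK] = { NX };   // cufftPlanMany() takes the transform sizes as an array
    ...
    // Device buffer holding BATCH signals of NX complex samples each.
    cudaMalloc((void**)&data, sizeof(cufftComplex)*NX*BATCH);
    // iembed/oembed, istride/ostride and idist/odist describe the data layout;
    // they are defined in the elided code above.
    cufftPlanMany(&plan, RANK, n, &iembed, istride, idist,
        &oembed, ostride, odist, CUFFT_C2C, BATCH);
    ...
    // In-place forward transform of all BATCH signals.
    cufftExecC2C(plan, data, data, CUFFT_FORWARD);
    // Wait for the transform to finish before using the results.
    cudaDeviceSynchronize();
    ...
    cufftDestroy(plan);
    cudaFree(data);
}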

2.1 Accessing cuFFT
   The cuFFT and cuFFTW libraries are available free of charge. They are already built, so we only need to link against them. cuFFT can be downloaded from http://developer.nvidia.com/cufft. By selecting Download CUDA Production Release, users can install the package containing the CUDA Toolkit, SDK code samples, and development drivers. The CUDA Toolkit contains cuFFT, and the samples include simplecuFFT, an example program.

2.2 Fourier Transform Setup

  • cufftPlan1d() / cufftPlan2d() / cufftPlan3d() - Create a simple plan for a 1D/2D/3D transform respectively.
  • cufftPlanMany() - Creates a plan supporting batched input and strided data layouts.
  • cufftXtMakePlanMany() - Creates a plan supporting batched input and strided data layouts for any supported precision.
    Among the plan-creation functions above, cufftPlanMany() is the one used in the example at the start of this post.

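2.3 Fourier Transform Types

  • cufftExecC2C() / cufftExecZ2Z() - complex-to-complex transforms for single/double precision.
  • cufftExecR2C() / cufftExecD2Z() - real-to-complex forward transform for single/double precision.
  • cufftExecC2R() / cufftExecZ2D() - complex-to-real inverse transform for single/double precision.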

Each of these functions requires a different input data layout (see Data Layout for details).

Functions cufftXtExec() and cufftXtExecDescriptor() can perform transforms on any of the supported types.


2.4 Data Layout

Here x is the transform size in elements.

FFT type    Input data size                 Output data size
C2C         x cufftComplex                  x cufftComplex
C2R         floor(x/2) + 1 cufftComplex     x cufftReal
R2C*        x cufftReal                     floor(x/2) + 1 cufftComplex
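
The sizes in the table above can be checked with a few lines of plain C. This is only a sketch: the helper names hermitian_length and inplace_r2c_reals are hypothetical and not part of cuFFT. The second helper assumes an in-place R2C transform, where the same buffer must be large enough to hold the complex output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a cuFFT function): number of complex elements
 * on the complex side of an R2C or C2R transform of a real signal of
 * length nx, i.e. floor(nx / 2) + 1. */
static size_t hermitian_length(size_t nx)
{
    assert(nx > 0);
    return nx / 2 + 1;   /* integer division rounds down */
}

/* Hypothetical helper: number of real-sized slots needed by a buffer that
 * is transformed in place from real to complex. Each complex element
 * occupies two reals, so the buffer is slightly longer than nx. */
static size_t inplace_r2c_reals(size_t nx)
{
    return 2 * hermitian_length(nx);
}
```

For NX = 256 this gives 129 complex output elements for R2C, and an in-place buffer of 258 reals rather than 256.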



2.5 Multidimensional Transforms
2.6 Advanced Data Layout
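
As a rough illustration of what the stride and distance arguments of cufftPlanMany() mean for a 1D batched transform, the sketch below computes which element of the input buffer holds sample x of signal b, using the 1D advanced-layout rule b * idist + x * istride. The helper names layout_index and layout_extent are hypothetical and not part of cuFFT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a cuFFT function): position of sample x of
 * signal b in a 1D batched buffer described by istride and idist. */
static size_t layout_index(size_t b, size_t x, size_t istride, size_t idist)
{
    assert(istride > 0);   /* a zero stride would alias every sample */
    return b * idist + x * istride;
}

/* Hypothetical helper: smallest buffer length, in elements, that holds
 * `batch` signals of `nx` samples each under that layout. This is one
 * past the index of the last sample of the last signal. */
static size_t layout_extent(size_t nx, size_t batch,
                            size_t istride, size_t idist)
{
    assert(nx > 0 && batch > 0);
    return layout_index(batch - 1, nx - 1, istride, idist) + 1;
}
```

With a contiguous layout (istride = 1, idist = NX), signal b starts at element b * NX. With istride = BATCH and idist = 1, the signals are interleaved sample by sample. Both layouts need NX * BATCH elements in total.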
   

2.7 Streamed cuFFT Transforms

   Every cuFFT plan may be associated with a CUDA stream.

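2.8 Multiple GPU cuFFT Transforms

   cuFFT supports using up to eight GPUs connected to one CPU to compute Fourier transforms. An API is defined that lets users modify existing code or write new code for this purpose.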

  • cufftCreate() - create an empty plan, as in the single GPU case
  • cufftXtSetGPUs() - define which GPUs are to be used
  • Optional: cufftEstimate{1d,2d,3d,Many}() - estimate the sizes of the work areas required. These are the same functions used in the single GPU case although the definition of the argument workSize reflects the number of GPUs used.
  • cufftMakePlan{1d,2d,3d,Many}() - create the plan. These are the same functions used in the single GPU case although the definition of the argument workSize reflects the number of GPUs used.
  • Optional: cufftGetSize{1d,2d,3d,Many}() - refined estimate of the sizes of the work areas required. These are the same functions used in the single GPU case although the definition of the argument workSize reflects the number of GPUs used.
  • Optional: cufftGetSize() - check workspace size. This is the same function used in the single GPU case although the definition of the argument workSize reflects the number of GPUs used.
  • Optional: cufftXtSetWorkArea() - do your own workspace allocation.
  • cufftXtMalloc() - allocate descriptor and data on the GPUs
  • cufftXtMemcpy() - copy data to the GPUs
  • cufftXtExecDescriptorC2C()/cufftXtExecDescriptorZ2Z() - execute the plan
  • cufftXtMemcpy() - copy data from the GPUs
  • cufftXtFree() - free any memory allocated with cufftXtMalloc()
  • cufftDestroy() - free cuFFT plan resources

 

Reposted from: https://www.cnblogs.com/luoqingyu/p/6337258.html

This document describes CUFFT, the NVIDIA® CUDA™ Fast Fourier Transform (FFT) library. The FFT is a divide-and-conquer algorithm for efficiently computing discrete Fourier transforms of complex or real-valued data sets. It is one of the most important and widely used numerical algorithms in computational physics and general signal processing. The CUFFT library provides a simple interface for computing parallel FFTs on an NVIDIA GPU, which allows users to leverage the floating-point power and parallelism of the GPU without having to develop a custom CUDA FFT implementation.

FFT libraries typically vary in terms of supported transform sizes and data types. For example, some libraries only implement radix-2 FFTs, restricting the transform size to a power of two. The CUFFT library aims to support a wide range of FFT options efficiently on NVIDIA GPUs. This version of the CUFFT library supports the following features:

  • Complex and real-valued input and output
  • 1D, 2D, and 3D transforms
  • Batch execution for doing multiple transforms of any dimension in parallel
  • Transform sizes up to 64 million elements in single precision and up to 128 million elements in double precision in any dimension, limited by the available GPU memory
  • In-place and out-of-place transforms
  • Double-precision (64-bit floating point) on compatible hardware (sm1.3 and later)
  • Support for streamed execution, enabling asynchronous computation and data movement
  • FFTW compatible data layouts
  • Arbitrary intra- and inter-dimension element strides
  • Thread-safe API that can be called from multiple independent host threads