Suitability of recent hardware accelerators (DSPs, FPGAs, and GPUs)

License: CC 4.0 BY-SA

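This article discusses the advantages and disadvantages of DSPs, FPGAs, and GPUs in different applications. DSPs suit low-power, low-cost embedded systems, but not complex computations with high data throughput. FPGAs suit high-performance, parallel-processing tasks and designs intended for mass production, particularly in image processing. GPUs offer a good price-to-performance ratio and suit image and video processing, but they consume more power and depend on efficient memory management.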

DSPs

Types of DSPs

Advantages and disadvantages of DSPs

Remarks

1. The type of arithmetic the device supports needs to be considered when selecting a DSP.

SYS/BIOS is the real-time operating system that TI provides for programming its DSPs.
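The remark on arithmetic support can be made concrete with a small sketch. It assumes the choice in question is between fixed-point and floating-point arithmetic, which the notes above do not state explicitly, and it uses the common Q15 format (16-bit signed values representing the range [-1, 1)) purely as an illustration. The function names are hypothetical and do not come from any DSP vendor library.

```python
# Illustrative sketch only: Q15 fixed-point arithmetic emulated in Python.
# Q15 stores a value x in [-1, 1) as the 16-bit integer round(x * 2**15).
Q15_SCALE = 1 << 15          # 32768
Q15_MAX = Q15_SCALE - 1      # 32767
Q15_MIN = -Q15_SCALE         # -32768


def saturate_q15(value: int) -> int:
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(Q15_MIN, min(Q15_MAX, value))


def float_to_q15(x: float) -> int:
    """Quantise a float to Q15, saturating values outside [-1, 1)."""
    return saturate_q15(int(round(x * Q15_SCALE)))


def q15_to_float(q: int) -> float:
    """Convert a Q15 integer back to a float."""
    return q / Q15_SCALE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values: Q30 product, round, shift back, saturate."""
    product = a * b                          # Q30 intermediate result
    rounded = (product + (1 << 14)) >> 15    # round to nearest, back to Q15
    return saturate_q15(rounded)


if __name__ == "__main__":
    half = float_to_q15(0.5)
    print(q15_to_float(q15_mul(half, half)))   # 0.5 * 0.5 in Q15
```

On hardware without floating-point support, every multiplication needs this kind of scaling, rounding, and saturation, which is one reason the arithmetic type affects both development effort and numerical accuracy.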

DSPs Summary

DSPs are not particularly suitable for high-performance applications. Despite recent advances in TI multicore DSPs [38], it is still not feasible to implement complex computer vision and image processing algorithms on DSPs, especially when the data throughput requirement is high. Furthermore, DSPs are not particularly suitable for PC-based systems since, apart from external interfaces, they do not provide significant advantages over GPUs. In contrast, DSPs are an energy-efficient and cost-effective solution for embedded systems, and for mobile or portable devices in which the computational demands are not high and the power consumption level is critical.

FPGAs

Xilinx and Altera

Features for selection of FPGAs

Advantages of using FPGAs

Remarks

1. An FPGA incorporates arrays of reprogrammable logic gates.

FPGAs are programmed with hardware description languages (HDLs), such as VHDL and Verilog. However, HDLs are low-level and complicated programming languages. Some high-level programming languages are also available.

It is challenging to manage data communication and computations between multiple GPUs.

FPGA Summary

Achieving acceptable performance on FPGAs is only possible when the algorithm is modified and optimised for parallel processing.

FPGAs are the best option for algorithms with high computational demands in a portable, PC-independent device. FPGAs are low-power, can be used in embedded systems, and are designed for high-performance tasks.

For designs that will be mass produced, FPGAs are suitable options, since an ASIC can easily be designed and produced based on an FPGA design. ASICs substantially reduce the costs for mass production.

FPGAs are the most suitable option for capturing and processing high-frame-rate data from high-speed cameras.

GPUs

Remarks

The most recent NVIDIA microarchitectures were named Tesla, Fermi, Kepler, Maxwell, Pascal, and Volta.

Advantages of using GPUs

GPUs are relatively inexpensive compared to FPGAs, and have the best processing-power-to-price ratio among hardware accelerators.

GPUs are designed for image and video processing.

Because GPUs are programmed in high-level programming languages, developing and debugging code for GPUs is faster and easier than for FPGAs.

The PCIe interface between the GPU and the CPU can be used easily.

Disadvantages of using GPUs

GPUs consume significantly more power than FPGAs.

The performance of GPUs will decrease considerably if they have to wait for data.

The main speed bottleneck in using GPUs in PC-based systems is the data transfer time between the host PC and the GPU.

The management of memory and the choice between shared memory, local memory, global memory, constant memory, and texture memory are not straightforward for high-performance applications.

Double-precision calculations are theoretically around two times slower than single-precision calculations.

The GPU’s hardware is pre-structured and has a lower flexibility compared to that of FPGAs.
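The host-to-GPU transfer bottleneck listed above can be sketched with a simple timing model. This is a rough illustration under stated assumptions, not a measurement: transfers and computation are assumed not to overlap, the PCIe link is assumed to sustain a constant bandwidth, and every number in the usage example is hypothetical.

```python
# Illustrative timing model: how transfer time erodes GPU speed-up.
# Assumes no overlap between data transfer and kernel execution.


def transfer_time_s(bytes_moved: float, bandwidth_bytes_per_s: float) -> float:
    """Time spent moving data between host and GPU over the bus."""
    return bytes_moved / bandwidth_bytes_per_s


def effective_speedup(cpu_time_s: float,
                      gpu_kernel_time_s: float,
                      bytes_moved: float,
                      bandwidth_bytes_per_s: float) -> float:
    """Speed-up over the CPU once transfer time is added to kernel time."""
    total_gpu_time = gpu_kernel_time_s + transfer_time_s(
        bytes_moved, bandwidth_bytes_per_s)
    return cpu_time_s / total_gpu_time


if __name__ == "__main__":
    # Hypothetical workload: 40 ms on the CPU, 2 ms of GPU kernel time,
    # 80 MB moved in total over a link sustaining 8 GB/s.
    kernel_only = 0.040 / 0.002
    with_transfer = effective_speedup(0.040, 0.002, 80e6, 8e9)
    print(f"kernel-only speed-up:   {kernel_only:.1f}x")
    print(f"speed-up with transfer: {with_transfer:.1f}x")
```

In this made-up case the transfer takes five times longer than the kernel itself, so most of the theoretical gain is lost. Keeping data on the GPU across several processing steps is the usual way to reduce this cost.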

Portability of software over different hardware

Transferring codes among FPGAs

Transferring code between Xilinx FPGAs and Altera FPGAs is not straightforward. Even though the FPGA code may be written in the VHDL language (which is the basic language for both), Xilinx Vivado and Altera Quartus II use different approaches. Furthermore, Xilinx and Altera have different IP cores that are specifically designed for their own FPGAs.

It is relatively simple to transfer code from a lower-performance FPGA to a higher-performance FPGA of the same series, for both Xilinx and Altera FPGAs. It is only required to re-synthesise the code for the new hardware. The re-synthesised code can take advantage of the new hardware capabilities of the target FPGA. Nevertheless, parts of the code need to be rewritten for sequential processing if they are to run on an embedded processor in the target FPGA.

Transferring code from a higher-performance FPGA to a lower-performance FPGA is not difficult if the lower-performance FPGA has sufficient logic units for the code, and the code does not use specific hardware units of the higher-performance FPGA (such as the embedded processor). The only required step is to re-synthesise the code for the new FPGA.
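The two conditions for moving a design to a lower-performance FPGA can be written down as a small check. This is only a sketch of the rule as stated above: the class names, field names, and device figures are hypothetical, and real tools report resource usage in far more detail (LUTs, block RAM, DSP slices, and so on).

```python
# Illustrative check: is re-synthesis alone enough to port a design?
from dataclasses import dataclass


@dataclass(frozen=True)
class FpgaDevice:
    name: str
    logic_units: int
    hard_blocks: frozenset = frozenset()   # e.g. {"embedded_processor"}


@dataclass(frozen=True)
class Design:
    logic_units_needed: int
    hard_blocks_used: frozenset = frozenset()


def resynthesis_is_enough(design: Design, target: FpgaDevice) -> bool:
    """True if the design fits and uses no hardware unit the target lacks."""
    fits = design.logic_units_needed <= target.logic_units
    blocks_available = design.hard_blocks_used <= target.hard_blocks
    return fits and blocks_available


if __name__ == "__main__":
    small = FpgaDevice("hypothetical-small", logic_units=50_000)
    plain = Design(logic_units_needed=40_000)
    with_cpu = Design(40_000, frozenset({"embedded_processor"}))
    print(resynthesis_is_enough(plain, small))
    print(resynthesis_is_enough(with_cpu, small))
```

A design that fails the check needs more than re-synthesis: either a larger device or a rewrite of the parts that depend on the missing hardware unit.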

Transferring codes among GPUs

The transfer of code from a lower-performance GPU to a higher-performance GPU only requires recompilation of the code. Nevertheless, the code needs to be rewritten to use any of the new capabilities of the target GPU.

Transferring code from a higher-performance GPU to a lower-performance GPU only requires recompilation of the code if the lower-performance GPU has sufficient memory blocks and supports the same CUDA compute capability.
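The condition for moving code to a lower-performance GPU can be sketched in the same spirit. The check below encodes the rule exactly as stated above (same compute capability, sufficient memory). The class and function names and all device figures are hypothetical, and a compute capability is represented as a (major, minor) tuple by assumption.

```python
# Illustrative check: is recompilation alone enough for a smaller GPU?
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuDevice:
    name: str
    compute_capability: tuple   # (major, minor), e.g. (6, 1)
    memory_bytes: int


def recompile_is_enough(built_for_cc: tuple,
                        memory_needed_bytes: int,
                        target: GpuDevice) -> bool:
    """True if the target matches the compute capability and has the memory."""
    same_cc = target.compute_capability == built_for_cc
    enough_memory = target.memory_bytes >= memory_needed_bytes
    return same_cc and enough_memory


if __name__ == "__main__":
    gib = 1024 ** 3
    small_gpu = GpuDevice("hypothetical-small", (6, 1), 4 * gib)
    print(recompile_is_enough((6, 1), 3 * gib, small_gpu))
    print(recompile_is_enough((6, 1), 6 * gib, small_gpu))
```

When the check fails on memory, the code typically has to process the data in smaller pieces; when it fails on compute capability, any feature the target lacks has to be replaced.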