【2014-11-23】Heterogeneous Parallel Programming – Section 1


License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/sjtujoe/p/4117512.html

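This article compares the design philosophies of CPUs and GPUs. A CPU emphasizes low-latency operation, with powerful arithmetic logic units and large caches that shorten memory access time. A GPU favors high throughput, using simple control logic and a large number of parallel processing units to process data efficiently.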


CPU: Latency Oriented Design

Powerful ALU
- Reduced operation latency
Large caches
- Convert long latency memory accesses to short latency cache accesses
Sophisticated control
- Branch prediction for reduced branch latency
- Data forwarding for reduced data latency
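
To make the "Large caches" point concrete, here is a minimal sketch using the standard average memory access time formula (hit time + miss rate × miss penalty). The cycle counts and the 5% miss rate are hypothetical values chosen only for illustration; they do not come from the course material or from any particular processor.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles.

    hit_time:     cycles for an access that hits in the cache
    miss_rate:    fraction of accesses that miss (0.0 to 1.0)
    miss_penalty: extra cycles needed to fetch the data from main memory
    """
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers, for illustration only.
MEMORY_LATENCY = 200   # assumed cycles to reach main memory
CACHE_HIT_TIME = 4     # assumed cycles for a cache hit
MISS_RATE = 0.05       # assumed fraction of accesses that miss

# Without a cache, every access pays the full memory latency.
no_cache = average_access_time(0, 1.0, MEMORY_LATENCY)

# With a cache, only the misses pay it.
with_cache = average_access_time(CACHE_HIT_TIME, MISS_RATE, MEMORY_LATENCY)

print(f"no cache:   {no_cache:.1f} cycles per access")
print(f"with cache: {with_cache:.1f} cycles per access")
```

Under these assumed numbers, the average access drops from 200 cycles to about 14, which is the sense in which a large cache turns long latency memory accesses into short latency cache accesses.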

GPU: Throughput Oriented Design

Small caches
- To boost memory throughput
Simple control
- No branch prediction
- No data forwarding
Energy efficient ALUs
- Many ALUs, each with long latency but heavily pipelined for high throughput
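
The trade-off of long latency for high throughput can be sketched with an idealized pipeline model. It assumes independent operations, one new operation issued per cycle, and no stalls. The pipeline depths (4 and 20 stages) are hypothetical and do not describe any real CPU or GPU.

```python
def pipelined_cycles(num_ops, depth):
    """Cycles to finish num_ops independent operations on an ideal pipeline.

    The first result appears after `depth` cycles; after that, one result
    completes every cycle (assumes no stalls or dependencies).
    """
    if num_ops == 0:
        return 0
    return depth + num_ops - 1


def throughput(num_ops, depth):
    """Completed operations per cycle for the ideal pipeline above."""
    return num_ops / pipelined_cycles(num_ops, depth)


# Hypothetical depths, for illustration only.
SHORT_LATENCY_DEPTH = 4    # a low-latency ALU
LONG_LATENCY_DEPTH = 20    # a long-latency, heavily pipelined ALU

for n in (1, 10_000):
    short = pipelined_cycles(n, SHORT_LATENCY_DEPTH)
    long_ = pipelined_cycles(n, LONG_LATENCY_DEPTH)
    print(f"{n:>6} ops: {short} vs {long_} cycles, "
          f"throughput {throughput(n, SHORT_LATENCY_DEPTH):.4f} vs "
          f"{throughput(n, LONG_LATENCY_DEPTH):.4f} ops/cycle")
```

In this model, a single operation is five times slower on the deep pipeline, but with many independent operations both pipelines approach one result per cycle. That is why a throughput-oriented design can accept long-latency ALUs as long as there is enough parallel work to keep them full.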

Scalability