Memory wall / Spatial locality / Temporal locality / Memory latency

This article discusses why concepts such as memory bandwidth, spatial locality, and temporal locality matter for system performance, and explains in detail how improving data access patterns reduces memory latency and raises cache efficiency.

 

    Generally speaking, memory bus bandwidth has not seen the same improvement as CPU performance (an observation sometimes referred to as the memory wall), and with multi-core and many-core systems, the available bandwidth is shared between all cores. This makes preservation of memory bandwidth one of the most important tasks in achieving top performance.
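As a rough illustration, the minimal sketch below (modeled on a STREAM-style triad; the function name and types are just illustrative) performs very little arithmetic per byte moved, so for arrays much larger than the caches its speed is set by memory bus bandwidth rather than by the core:

    #include <cstddef>
    #include <vector>

    // Bandwidth-bound kernel: roughly two loads and one store per iteration,
    // but only two floating-point operations, so memory traffic dominates.
    // Assumes a, b and c have the same length.
    void triad(std::vector<double>& a, const std::vector<double>& b,
               const std::vector<double>& c, double s) {
        for (std::size_t i = 0; i < a.size(); ++i)
            a[i] = b[i] + s * c[i];
    }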

 

Spatial locality

Spatial locality refers to the desirable property of accessing close memory locations. Poor spatial locality is penalized in a number of ways (a small traversal-order sketch follows the list):

    Accessing data very sparsely will have the negative side effect of transferring unused data over the memory bus (since data travels in cache-line-sized chunks). This raises the memory bandwidth requirements, and in practice imposes a limit on the application performance and scalability.

    Unused data will occupy the caches, reducing the effective cache size. This causes more frequent evictions and more round trips to memory.

    Unused data will reduce the likelihood of encountering more than one useful piece of data in a mapped cache line.
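To make the contrast concrete, here is a small sketch (names are illustrative) of the same reduction over a row-major 2D array with good and with poor spatial locality: the row-by-row walk uses every element of each fetched cache line, while the column-by-column walk touches only one element per line before moving on.

    #include <cstddef>

    // Row-major array of rows x cols doubles, stored contiguously.
    // Stride-1 walk: consecutive accesses fall in the same cache line.
    double sum_row_major(const double* a, std::size_t rows, std::size_t cols) {
        double s = 0.0;
        for (std::size_t i = 0; i < rows; ++i)
            for (std::size_t j = 0; j < cols; ++j)
                s += a[i * cols + j];
        return s;
    }

    // Same data, column by column: each access jumps a whole row ahead,
    // so most of every fetched cache line is transferred but never used.
    double sum_col_major(const double* a, std::size_t rows, std::size_t cols) {
        double s = 0.0;
        for (std::size_t j = 0; j < cols; ++j)
            for (std::size_t i = 0; i < rows; ++i)
                s += a[i * cols + j];
        return s;
    }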

 

Temporal locality

Temporal locality relates to reuse of data. Reusing data while it is still in the cache avoids sustaining memory fetch stalls and generally reduces the memory bus load.
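A simple way to picture this is loop fusion, sketched below with illustrative names: the two-pass version streams the whole array through memory twice, while the fused version reuses each element immediately, while it is still in cache (or still in a register).

    #include <cstddef>
    #include <vector>

    // Poor temporal locality: if the array is larger than the cache,
    // every element is fetched from memory twice.
    void two_passes(std::vector<double>& a, double scale, double offset) {
        for (std::size_t i = 0; i < a.size(); ++i) a[i] *= scale;
        for (std::size_t i = 0; i < a.size(); ++i) a[i] += offset;
    }

    // Better temporal locality: each element is loaded once and reused
    // right away, roughly halving the memory traffic of the version above.
    void fused(std::vector<double>& a, double scale, double offset) {
        for (std::size_t i = 0; i < a.size(); ++i) a[i] = a[i] * scale + offset;
    }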

 

Memory latency

The time it takes to complete a memory fetch is referred to as memory latency. During this time, the CPU is stalled. The latency penalty is on the order of 100 clock cycles.

Caches were invented to hide this latency by serving a limited amount of data from a small but fast memory. This is effective if the data set can be made to fit in the cache.

A different technique is prefetching, where the data transfer is initiated, explicitly or automatically, ahead of when the data is needed, so that ideally it has already reached the cache by the time it is used.
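As an example of the explicit variant, the sketch below assumes a GCC/Clang-style compiler that provides __builtin_prefetch; the prefetch distance of 16 elements is an arbitrary illustrative value that would need tuning in practice.

    #include <cstddef>

    double sum_with_prefetch(const double* a, std::size_t n) {
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            // Hint: bring a[i + 16] toward the cache for a future iteration
            // (read access, high expected temporal locality). Compiles to a
            // prefetch instruction where the target supports one.
            if (i + 16 < n)
                __builtin_prefetch(&a[i + 16], 0, 3);
            s += a[i];
        }
        return s;
    }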

 

Avoiding cache pollution

If a dataset has a footprint larger than the available cache, and there is no practical way to reorganize the access patterns to improve reuse, there is no benefit from storing that data in the cache in the first place. Some CPUs have special instructions for bypassing caches, exactly for this purpose.
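As one concrete example (an assumption about the target, not part of the original text): x86 CPUs with SSE2 expose non-temporal stores through intrinsics such as _mm_stream_pd. The sketch below writes an array straight to memory without displacing other data from the caches, assuming the destination is 16-byte aligned and will not be read again soon.

    #include <emmintrin.h>
    #include <cstddef>

    // Fill dst[0..n) with value using non-temporal (cache-bypassing) stores.
    // dst is assumed to be 16-byte aligned.
    void fill_streaming(double* dst, std::size_t n, double value) {
        __m128d v = _mm_set1_pd(value);
        std::size_t i = 0;
        for (; i + 2 <= n; i += 2)
            _mm_stream_pd(dst + i, v);   // store bypasses the cache hierarchy
        for (; i < n; ++i)
            dst[i] = value;              // scalar tail for odd n
        _mm_sfence();                    // order streaming stores before later stores
    }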


from wiki
