Cache Usage on the STM32H7

This article walks through the Cache configurations of the STM32H7 microcontroller: the non-cacheable, write-through/read-allocate, write-back/read-allocate and write-back/read-write-allocate modes, how to set them up through the MPU and the Cache APIs to improve performance, and the cache-coherency problems (and their remedies) that arise when several bus masters, such as the CPU and DMA, share the same memory.


Articles in this series

1. Introduction to Cache fundamentals

2. Cache usage on the STM32H7


Contents

Foreword

Reference material

1. Supported Cache configurations

1.1 Configuration overview

1.2 Performance comparison

2. The four configurations

2.1 Non-cacheable

2.2 Write-through, read allocate, no write allocate (WT, RA)

2.3 Write-back, read allocate, no write allocate (WB, RA)

2.4 Write-back, read allocate, write allocate (WB, RA, WA)

3. Cache-related APIs

4. Summary



Foreword

To get better performance out of an MCU built on the Cortex-M7 core, this series records my study of the MPU and Cache on the STM32H7.

This article focuses on the Cache configuration policies available on the STM32H7 and how to apply them.

Reference link: Cache-related reference manuals (CSDN library)


Reference material

AN4838 introduces the MPU, the related MPU register configuration, and the memory types.

AN4839 covers the Cache: the Cortex-M7 Cache API functions, the default configuration of the STM32H7 and STM32F7, and concepts such as WB and WT.

For configuring the Cache and MPU, the table in the "MPU access permission attributes" section of the STM32H7 programming manual (Table 71) gives a very complete description.

1. Supported Cache configurations

1.1 Configuration overview

The Cache policy is selected through the MPU. The STM32H7 programming manual contains the table shown below.

Meaning of each bit: TEX selects the Cache policy.

C enables/disables caching, and B (bufferable) works together with C to select the policy. S enables/disables sharing and is meant to solve synchronization problems when multiple buses or cores access the region (enabling sharing is effectively the same as disabling the Cache). TEX = 010 is almost never used and is not covered here.

The Cache policies selected by TEX are shown below; this is where the WB, WT, WA and RA modes are chosen:

(Figure: Cache policy configuration table)

1.2 Performance comparison

Memory types selectable through the MPU, ordered by access performance: Normal memory > Device memory > Strongly-ordered memory.

The three memory types are described in the "Memory types" section of AN4838:

(Figure: memory types description)

Note: on the Cortex-M7, read allocate is enabled whenever the Cache is enabled.

Highest-performance configuration: choose the WB, RA, WA policy (TEX = 0b001), set the bufferable bit B, and clear the shareable bit S. This caches both reads and writes, so data-coherency issues must be kept in mind.

Example: configure the AXI SRAM for maximum performance

    /* Configure the AXI SRAM MPU attributes as write-back, read allocate, write allocate */
    MPU_Region_InitTypeDef MPU_InitStruct = {0};

    MPU_InitStruct.Enable           = MPU_REGION_ENABLE;
    MPU_InitStruct.BaseAddress      = 0x24000000;
    MPU_InitStruct.Size             = MPU_REGION_SIZE_512KB;
    MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
    MPU_InitStruct.IsBufferable     = MPU_ACCESS_BUFFERABLE;
    MPU_InitStruct.IsCacheable      = MPU_ACCESS_CACHEABLE;
    MPU_InitStruct.IsShareable      = MPU_ACCESS_NOT_SHAREABLE;
    MPU_InitStruct.Number           = MPU_REGION_NUMBER0;
    MPU_InitStruct.TypeExtField     = MPU_TEX_LEVEL1;
    MPU_InitStruct.SubRegionDisable = 0x00;
    MPU_InitStruct.DisableExec      = MPU_INSTRUCTION_ACCESS_ENABLE;

    HAL_MPU_ConfigRegion(&MPU_InitStruct);

    // MPU_InitStruct.TypeExtField = MPU_TEX_LEVEL1;          corresponds to TEX
    // MPU_InitStruct.IsBufferable = MPU_ACCESS_BUFFERABLE;    corresponds to B
    // MPU_InitStruct.IsCacheable  = MPU_ACCESS_CACHEABLE;     corresponds to C
    // MPU_InitStruct.IsShareable  = MPU_ACCESS_NOT_SHAREABLE; corresponds to S

Lowest-performance configuration: disable the Cache (TEX = 000), clear the bufferable bit B, and set the shareable bit S. This gives the worst performance and the slowest accesses.

Example: configure the AXI SRAM for the lowest performance

    /* Configure the AXI SRAM MPU attributes as non-cacheable (Cache disabled) */
    MPU_Region_InitTypeDef MPU_InitStruct = {0};

    MPU_InitStruct.Enable           = MPU_REGION_ENABLE;
    MPU_InitStruct.BaseAddress      = 0x24000000;
    MPU_InitStruct.Size             = MPU_REGION_SIZE_512KB;
    MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
    MPU_InitStruct.IsBufferable     = MPU_ACCESS_NOT_BUFFERABLE;
    MPU_InitStruct.IsCacheable      = MPU_ACCESS_NOT_CACHEABLE;
    MPU_InitStruct.IsShareable      = MPU_ACCESS_SHAREABLE;
    MPU_InitStruct.Number           = MPU_REGION_NUMBER0;
    MPU_InitStruct.TypeExtField     = MPU_TEX_LEVEL0;
    MPU_InitStruct.SubRegionDisable = 0x00;
    MPU_InitStruct.DisableExec      = MPU_INSTRUCTION_ACCESS_ENABLE;

    HAL_MPU_ConfigRegion(&MPU_InitStruct);

2. The four configurations

Note: the configurations below all follow Table 91 in the programming manual.

2.1 Non-cacheable

This configuration disables the Cache: reads and writes go straight to memory, with no optimization at all.

The corresponding MPU settings are listed below (a configuration sketch follows the list):

TEX = 000, C = 0, B = 0, S = ignored (region forced shareable)
TEX = 000, C = 0, B = 1, S = ignored (region forced shareable)
TEX = 001, C = 0, B = 0, S = 0
TEX = 001, C = 0, B = 0, S = 1
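
As a minimal sketch (not from the original article), the fragment below sets up the same 512 KB AXI SRAM region used in the earlier examples as Normal, non-cacheable memory (TEX = 001, C = 0, B = 0). The region number and the HAL_MPU_Disable/HAL_MPU_Enable wrapper are assumptions that depend on the rest of your MPU setup.

    /* Sketch: declare the AXI SRAM region as Normal, non-cacheable (TEX = 001, C = 0, B = 0) */
    MPU_Region_InitTypeDef MPU_InitStruct = {0};

    HAL_MPU_Disable();                                   /* regions must be changed with the MPU off */

    MPU_InitStruct.Enable           = MPU_REGION_ENABLE;
    MPU_InitStruct.BaseAddress      = 0x24000000;        /* assumed target: AXI SRAM */
    MPU_InitStruct.Size             = MPU_REGION_SIZE_512KB;
    MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
    MPU_InitStruct.IsBufferable     = MPU_ACCESS_NOT_BUFFERABLE; /* B = 0 */
    MPU_InitStruct.IsCacheable      = MPU_ACCESS_NOT_CACHEABLE;  /* C = 0 */
    MPU_InitStruct.IsShareable      = MPU_ACCESS_NOT_SHAREABLE;  /* S = 0 */
    MPU_InitStruct.Number           = MPU_REGION_NUMBER0;        /* assumed free region slot */
    MPU_InitStruct.TypeExtField     = MPU_TEX_LEVEL1;            /* TEX = 001 */
    MPU_InitStruct.SubRegionDisable = 0x00;
    MPU_InitStruct.DisableExec      = MPU_INSTRUCTION_ACCESS_ENABLE;
    HAL_MPU_ConfigRegion(&MPU_InitStruct);

    HAL_MPU_Enable(MPU_PRIVILEGED_DEFAULT);              /* keep the default background map for privileged code */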

2.2 Write-through, read allocate, no write allocate (WT, RA)

(1) Write accesses to an SRAM area with this configuration
If the SRAM data the CPU wants to write already has a line allocated in the Cache, the data is written to both the Cache and the SRAM at the same time. If it does not, the "no write allocate" part applies: the CPU writes straight to the SRAM without allocating a Cache line (with write allocate, a line would be allocated instead).
When a write hits the Cache, the advantage of this scheme is that the Cache and the SRAM stay in sync, so there is no coherency problem from multiple bus masters. The obvious drawback is that the Cache cannot accelerate writes (no improvement for write accesses).

(2) Read accesses to an SRAM area with this configuration
If the SRAM data the CPU wants to read is already loaded in the Cache, it is read directly from the Cache. If it is not, the "read allocate" part applies: a Cache line is allocated and the SRAM data is loaded into it, so later accesses can be served straight from the Cache, which saves time.
Hazard: if a read hits the Cache after a DMA write has updated the same data in SRAM, the CPU reads stale data from the Cache.

(3) The two corresponding MPU settings (a configuration sketch follows):
TEX = 000, C = 1, B = 0, S = 1
TEX = 000, C = 1, B = 0, S = 0
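
A minimal sketch (my addition, reusing the AXI SRAM region and the MPU_InitStruct fields from the examples in section 1.2): write-through with read allocate corresponds to TEX = 000, C = 1, B = 0.

    /* Sketch: AXI SRAM as write-through, read allocate (TEX = 000, C = 1, B = 0) */
    MPU_InitStruct.Enable           = MPU_REGION_ENABLE;
    MPU_InitStruct.BaseAddress      = 0x24000000;
    MPU_InitStruct.Size             = MPU_REGION_SIZE_512KB;
    MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
    MPU_InitStruct.IsBufferable     = MPU_ACCESS_NOT_BUFFERABLE; /* B = 0 */
    MPU_InitStruct.IsCacheable      = MPU_ACCESS_CACHEABLE;      /* C = 1 */
    MPU_InitStruct.IsShareable      = MPU_ACCESS_NOT_SHAREABLE;  /* S = 0 */
    MPU_InitStruct.Number           = MPU_REGION_NUMBER0;
    MPU_InitStruct.TypeExtField     = MPU_TEX_LEVEL0;            /* TEX = 000 */
    MPU_InitStruct.SubRegionDisable = 0x00;
    MPU_InitStruct.DisableExec      = MPU_INSTRUCTION_ACCESS_ENABLE;
    HAL_MPU_ConfigRegion(&MPU_InitStruct);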

2.3 Write-back, read allocate, no write allocate (WB, RA)

(1) Write accesses to an SRAM area with this configuration
If the SRAM data the CPU wants to write already has a line allocated in the Cache, the data is written only to the Cache; the SRAM is not updated immediately. If it does not, the "no write allocate" part applies: the CPU writes straight to the SRAM without allocating a Cache line.
Hazard: when a write hits the Cache, only the Cache is updated while the SRAM is not, so a DMA reading directly from the SRAM gets stale data (a data-coherency problem).

(2) Read accesses to an SRAM area with this configuration
If the SRAM data the CPU wants to read is already loaded in the Cache, it is read directly from the Cache. If it is not, the "read allocate" part applies: a Cache line is allocated and the SRAM data is loaded into it, so later accesses can be served straight from the Cache, which saves time.
Hazard: if a read hits the Cache after a DMA write has updated the same data in SRAM, the CPU reads stale data from the Cache.

(3) The two corresponding MPU settings (a configuration sketch follows):
TEX = 000, C = 1, B = 1, S = 1
TEX = 000, C = 1, B = 1, S = 0
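
A minimal sketch (my addition, again reusing the AXI SRAM region from section 1.2): write-back with read allocate but no write allocate corresponds to TEX = 000 with both C and B set.

    /* Sketch: AXI SRAM as write-back, read allocate, no write allocate (TEX = 000, C = 1, B = 1) */
    MPU_InitStruct.Enable           = MPU_REGION_ENABLE;
    MPU_InitStruct.BaseAddress      = 0x24000000;
    MPU_InitStruct.Size             = MPU_REGION_SIZE_512KB;
    MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
    MPU_InitStruct.IsBufferable     = MPU_ACCESS_BUFFERABLE;     /* B = 1 */
    MPU_InitStruct.IsCacheable      = MPU_ACCESS_CACHEABLE;      /* C = 1 */
    MPU_InitStruct.IsShareable      = MPU_ACCESS_NOT_SHAREABLE;  /* S = 0 */
    MPU_InitStruct.Number           = MPU_REGION_NUMBER0;
    MPU_InitStruct.TypeExtField     = MPU_TEX_LEVEL0;            /* TEX = 000 */
    MPU_InitStruct.SubRegionDisable = 0x00;
    MPU_InitStruct.DisableExec      = MPU_INSTRUCTION_ACCESS_ENABLE;
    HAL_MPU_ConfigRegion(&MPU_InitStruct);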

2.4 Write-back, read allocate, write allocate (WB, RA, WA)

(1) Write accesses to an SRAM area with this configuration
If the SRAM data the CPU wants to write already has a line allocated in the Cache, the data is written only to the Cache; the SRAM is not updated immediately. If it does not, the "write allocate" part applies: a Cache line is allocated and the data written to that SRAM area is loaded into it, so an immediate read-back of the same area is very fast. (This is the difference from no write allocate: with write allocate, a line is also allocated on a write miss.)
Hazard: when a write hits the Cache, only the Cache is updated while the SRAM is not, so a DMA reading directly from the SRAM gets stale data.

(2) Read accesses to an SRAM area with this configuration
If the SRAM data the CPU wants to read is already loaded in the Cache, it is read directly from the Cache. If it is not, the "read allocate" part applies: a Cache line is allocated and the SRAM data is loaded into it, so later accesses can be served straight from the Cache, which saves time.
Hazard: if a read hits the Cache after a DMA write has updated the same data in SRAM, the CPU reads stale data from the Cache.
This configuration is reputed to extract the most performance from the Cache, although the right choice still depends on the specific application.

(3) The two corresponding MPU settings (this is the "maximum performance" setup already shown in section 1.2, i.e. MPU_TEX_LEVEL1 with both C and B set):
TEX = 001, C = 1, B = 1, S = 1
TEX = 001, C = 1, B = 1, S = 0

3. Cache-related APIs

(1) It is recommended to use the 128 KB TCM as the main RAM area and to dedicate the other RAM areas to large buffers, DMA operations and the like.
(2) Cache problems appear mainly when both the CPU and a DMA operate on the same buffer; be careful in those cases.
(3) When choosing a Cache policy, prefer WB, then WT, and only then disabling the Cache. With WB and WT, the ARM (CMSIS) functions mentioned below can work around the hazards described above. They are not a cure-all; when they do not help, fall back to the brute-force SCB_CleanInvalidateDCache. The problem is especially prominent when configuring the Ethernet MAC descriptor, transmit and receive buffers.

Before several masters access the same block of memory, a simple, brute-force option is to call SCB_CleanInvalidateDCache, which writes the dirty Cache contents back to main memory (and invalidates them), avoiding data inconsistency.

The WT policy also avoids inconsistency on writes, but then writes get no performance boost.

Keep this issue in mind during development: handle it with SCB_CleanInvalidateDCache or SCB_InvalidateDCache where necessary, and performance improves the more such calls can be avoided by design.
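
As a minimal sketch (my addition; the buffer names, sizes and the surrounding DMA code are assumptions), the fragment below shows the usual CMSIS cache-maintenance pattern for DMA buffers placed in a write-back cacheable region: clean the lines before the DMA reads a buffer the CPU filled, and invalidate the lines before the CPU reads a buffer the DMA wrote.

    #include "stm32h7xx.h"        /* pulls in the CMSIS core header with the SCB_* cache functions */

    #define BUF_SIZE 512u

    /* Cortex-M7 D-Cache lines are 32 bytes: keep DMA buffers 32-byte aligned (and ideally
       a multiple of 32 bytes long) so maintenance never touches neighbouring data. */
    static uint8_t dma_tx_buf[BUF_SIZE] __ALIGNED(32);   /* hypothetical buffer names */
    static uint8_t dma_rx_buf[BUF_SIZE] __ALIGNED(32);

    void prepare_dma_transmit(void)
    {
        /* The CPU has filled dma_tx_buf: clean (write back) those lines so the DMA
           reads up-to-date data from SRAM. */
        SCB_CleanDCache_by_Addr((uint32_t *)dma_tx_buf, BUF_SIZE);
        /* ...start the DMA transmit here... */
    }

    void handle_dma_receive_complete(void)
    {
        /* The DMA has written dma_rx_buf in SRAM: invalidate the stale Cache lines
           before the CPU parses the buffer. */
        SCB_InvalidateDCache_by_Addr((uint32_t *)dma_rx_buf, BUF_SIZE);
        /* ...the CPU can now safely read dma_rx_buf... */
    }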


4. Summary

The table below is taken from the Armfly (安富莱) Electronics forum.

(Image reproduced from Armfly Electronics)