What does an idle CPU do?

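This article explores how an idle CPU works inside an operating system. When no task is available to run, the CPU enters an idle state and executes the so-called "idle task." The article describes in detail how the idle task gets scheduled on Linux and explains the power-saving techniques modern operating systems use to improve energy efficiency.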


Original source: https://manybutfinite.com/post/what-does-an-idle-cpu-do/index.html

In the last post I said the fundamental axiom of OS behavior is that at any given time, exactly one and only one task is active on a CPU. But if there's absolutely nothing to do, then what?

It turns out that this situation is extremely common, and for most personal computers it's actually the norm: an ocean of sleeping processes, all waiting on some condition to wake up, while nearly 100% of CPU time is going into the mythical "idle task." In fact, if the CPU is consistently busy for a normal user, it's often a misconfiguration, bug, or malware.

Since we can't violate our axiom, some task needs to be active on a CPU. First because it's good design: it would be unwise to spread special cases all over the kernel checking whether there is in fact an active task. A design is far better when there are no exceptions. Whenever you write an if statement, Nyan Cat cries. And second, we need to do something with all those idle CPUs, lest they get spunky and, you know, create Skynet.

So to keep design consistency and be one step ahead of the devil, OS developers create an idle task that gets scheduled to run when there's no other work. We have seen in the Linux boot process that the idle task is process 0, a direct descendant of the very first instruction that runs when a computer is first turned on. It is initialized in rest_init, where init_idle_bootup_task initializes the idle scheduling class.

Briefly, Linux supports different scheduling classes for things like real-time processes, regular user processes, and so on. When it's time to choose a process to become the active task, these classes are queried in order of priority. That way, the nuclear reactor control code always gets to run before the web browser. Often, though, these classes return NULL, meaning they don't have a suitable process to run - they're all sleeping. But the idle scheduling class, which runs last, never fails: it always returns the idle task.
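To make the "query the classes in priority order" idea concrete, here is a minimal sketch in C. It is not the kernel's code: the struct layouts, the three class names, and the `rt_runnable` / `fair_runnable` variables are all hypothetical stand-ins chosen for illustration. The only point it tries to show is that the idle class sits last in the list and never returns NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types: not the kernel's real structures. */
struct task {
    const char *name;
};

struct sched_class {
    const char *name;
    /* Returns a runnable task, or NULL if this class has nothing to run. */
    struct task *(*pick_next)(void);
};

static struct task idle_task = { "idle" };
static struct task *rt_runnable;   /* NULL while every real-time task sleeps */
static struct task *fair_runnable; /* NULL while every regular task sleeps */

static struct task *pick_rt(void)   { return rt_runnable; }
static struct task *pick_fair(void) { return fair_runnable; }
static struct task *pick_idle(void) { return &idle_task; } /* never fails */

/* Classes listed from highest to lowest priority; idle comes last. */
static const struct sched_class classes[] = {
    { "rt",   pick_rt },
    { "fair", pick_fair },
    { "idle", pick_idle },
};

/* Ask each class in priority order; the first non-NULL answer wins. */
struct task *pick_next_task(void)
{
    size_t n = sizeof(classes) / sizeof(classes[0]);

    for (size_t i = 0; i < n; i++) {
        struct task *t = classes[i].pick_next();
        if (t != NULL)
            return t;
    }
    assert(!"unreachable: the idle class always returns a task");
    return NULL;
}
```

With both `rt_runnable` and `fair_runnable` left as NULL, `pick_next_task()` falls through to the idle task, which is exactly the "everyone is asleep" case described above. The caller never needs a special case for "no task at all."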

That's all good, but let's get down to just what exactly this idle task is doing. So here is cpu_idle_loop, courtesy of open source:

cpu_idle_loop
while (1) {
    while(!need_resched()) {
        cpuidle_idle_call();
    }

    /*
      [Note: Switch to a different task. We will return to this loop when the
      idle task is again selected to run.]
    */
    schedule_preempt_disabled();
}

I've omitted many details, and we'll look at task switching closely later on, but if you read the code you'll get the gist of it: as long as there's no need to reschedule, meaning change the active task, stay idle. Measured in elapsed time, this loop and its cousins in other OSes are probably the most executed pieces of code in computing history. For Intel processors, staying idle traditionally meant running the halt instruction:

native_halt
static inline void native_halt(void)
{
    asm volatile("hlt": : :"memory");
}

hlt stops code execution in the processor and puts it in a halted state. It's weird to think that across the world millions and millions of Intel-like CPUs are spending the majority of their time halted, even while they're powered up. It's also not terribly efficient, energy-wise, which led chip makers to develop deeper sleep states for processors, which trade lower power consumption for longer wake-up latency. The kernel's cpuidle subsystem is responsible for taking advantage of these power-saving modes.

Now once we tell the CPU to halt, or sleep, we need to somehow bring it back to life. If you've read the last post, you might suspect interrupts are involved, and indeed they are. Interrupts spur the CPU out of its halted state and back into action. So putting this all together, here's what your system mostly does as you read a fully rendered web page:

Other interrupts besides the timer interrupt also get the processor moving again. That's what happens if you click on a web page, for example: your mouse issues an interrupt, its driver processes it, and suddenly a process is runnable because it has fresh input. At that point need_resched() returns true, and the idle task is booted out in favor of your browser.
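The wake-up pattern can be sketched as a toy model in C. Everything here is an assumption made for illustration: the `irq` enum, `handle_irq`, and `idle_until_resched` are hypothetical names, and the model pretends that a timer tick never makes a process runnable while a mouse interrupt always does. Real interrupt handling is far more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum irq { IRQ_TIMER, IRQ_MOUSE };

/*
 * Hypothetical model: returns true when handling the interrupt made a
 * process runnable, i.e. when need_resched() would start returning true.
 */
static bool handle_irq(enum irq source)
{
    switch (source) {
    case IRQ_MOUSE:
        return true;  /* fresh input wakes a sleeping process */
    case IRQ_TIMER:
        return false; /* modelling assumption: a tick with nothing to do */
    }
    return false;
}

/*
 * Replay a sequence of interrupts against the idle loop. Returns how many
 * times the CPU was woken before the idle task had to be switched out
 * (or before the recorded interrupts ran out).
 */
size_t idle_until_resched(const enum irq *irqs, size_t count)
{
    size_t wakeups = 0;
    bool need_resched = false;

    assert(irqs != NULL || count == 0);

    while (!need_resched && wakeups < count) {
        /* "hlt": nothing happens until the next interrupt arrives. */
        need_resched = handle_irq(irqs[wakeups]);
        wakeups++;
    }
    return wakeups;
}
```

Feeding it three timer interrupts followed by a mouse interrupt yields four wake-ups, of which only the last one actually ends the idle stretch. The first three are the pointless naps the rest of this post is about.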

But let's stick to idleness in this post. Here's the idle loop over time:

In this example the timer interrupt was programmed by the kernel to happen every 4 milliseconds (ms). This is the tick period. That means we get 250 ticks per second, so the tick rate or tick frequency is 250 Hz. That's a typical value for Linux running on Intel processors, with 100 Hz being another crowd favorite. This is set by the CONFIG_HZ option when you build the kernel.
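The arithmetic relating tick rate and tick period is simple enough to write down. The two helpers below are illustrative names of my own, not kernel functions; they just restate period = 1 / rate in microseconds and count ticks over a span of seconds.

```c
#include <assert.h>

/* Tick period in microseconds for a given tick rate in Hz. */
unsigned long tick_period_us(unsigned int hz)
{
    assert(hz > 0);
    return 1000000UL / hz;
}

/* Number of timer ticks delivered over a span of whole seconds. */
unsigned long ticks_in(unsigned int hz, unsigned long seconds)
{
    return (unsigned long)hz * seconds;
}
```

At 250 Hz, `tick_period_us(250)` gives 4000 microseconds, the 4 ms period from the example, and `ticks_in(250, 60)` gives 15000 ticks for a single idle minute.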

Now that looks like an awful lot of pointless work for an idle CPU, and it is. Without fresh input from the outside world, the CPU will remain stuck in this hellish nap getting woken up 250 times a second while your laptop battery is drained. If this is running in a virtual machine, we're burning both power and valuable cycles from the host CPU.

The solution here is to have a dynamic tick so that when the CPU is idle, the timer interrupt is either deactivated or reprogrammed to happen at a point where the kernel knows there will be work to do (for example, a process might have a timer expiring in 5 seconds, so we must not sleep past that). This is also called tickless mode.
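Here is a rough sketch of the decision a dynamic tick has to make: how long can the CPU sleep without missing a timer? The function names, the millisecond units, and the `NO_TIMER` sentinel are assumptions for illustration, not the kernel's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel meaning "no pending timer": the tick can simply stay off. */
#define NO_TIMER UINT64_MAX

/*
 * How long the idle CPU may sleep, in milliseconds, given the absolute
 * expiry times of all pending timers. Returns 0 if a timer is already
 * due and NO_TIMER if nothing is pending.
 */
uint64_t idle_sleep_ms(uint64_t now_ms, const uint64_t *expiries_ms,
                       size_t count)
{
    uint64_t earliest = NO_TIMER;

    for (size_t i = 0; i < count; i++) {
        if (expiries_ms[i] < earliest)
            earliest = expiries_ms[i];
    }
    if (earliest == NO_TIMER)
        return NO_TIMER;
    if (earliest <= now_ms)
        return 0;
    return earliest - now_ms;
}

/* Wake-ups a fixed-rate tick would cause over the same idle stretch. */
uint64_t wakeups_fixed_tick(uint64_t idle_ms, uint64_t tick_period_ms)
{
    assert(tick_period_ms > 0);
    return idle_ms / tick_period_ms;
}
```

For the 5-second timer mentioned above, `idle_sleep_ms` answers 5000 ms, so the CPU is woken once. A fixed 4 ms tick would have woken it `wakeups_fixed_tick(5000, 4)` = 1250 times over the same stretch.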

Finally, suppose you have one active process in a system, for example a long-running CPU-intensive task. That's nearly identical to an idle system: these diagrams remain about the same, just substitute the one process for the idle task and the pictures are accurate. In that case it's still pointless to interrupt the task every 4 ms for no good reason: it's merely OS jitter slowing your work ever so slightly. Linux can also stop the fixed-rate tick in this one-process scenario, in what's called adaptive-tick mode. Eventually, a fixed-rate tick may be gone altogether.
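Put together, the three situations (nothing to run, one task, several tasks) boil down to a small decision. The sketch below is a deliberate simplification with made-up names; the real kernel weighs more conditions before it stops the tick.

```c
#include <assert.h>

enum tick_mode {
    TICK_STOPPED_IDLE,        /* dynamic tick: nothing to run */
    TICK_STOPPED_SINGLE_TASK, /* adaptive tick: exactly one runnable task */
    TICK_PERIODIC             /* several tasks compete: keep the fixed tick */
};

/* Simplified assumption: the runnable-task count alone decides the mode. */
enum tick_mode choose_tick_mode(int runnable_tasks)
{
    assert(runnable_tasks >= 0);

    if (runnable_tasks == 0)
        return TICK_STOPPED_IDLE;
    if (runnable_tasks == 1)
        return TICK_STOPPED_SINGLE_TASK;
    return TICK_PERIODIC;
}
```

The periodic tick only earns its keep in the last case, where the scheduler needs a regular chance to share the CPU among competing tasks.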

That's enough idleness for one post. The kernel's idle behavior is an important part of the OS puzzle, and it's very similar to other situations we'll see, so this helps us build the picture of a running kernel. More next week, RSS and Twitter.
