Working Set, Paged Pool, and Nonpaged Pool

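This article explains the working set, nonpaged pool, and paged pool and the role each plays in memory management. A code sample shows how to retrieve memory usage for the processes running on a system, to help developers understand and optimize memory usage.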

Working Set

The working set of a process is the set of pages in the virtual address space of the process that are currently resident in physical memory. The working set contains only pageable memory allocations; nonpageable memory allocations such as Address Windowing Extensions (AWE) or large page allocations are not included in the working set.

When a process references pageable memory that is not currently in its working set, a page fault occurs. The system page fault handler attempts to resolve the page fault and, if it succeeds, the page is added to the working set. (Accessing AWE or large page allocations never causes a page fault, because these allocations are not pageable.)

Looking at the working set shows how much physical memory a process needs when it is running in a steady state.
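
Because the working set is made up of whole pages, a byte count such as the WorkingSetSize value reported by GetProcessMemoryInfo can be turned into a page count with simple arithmetic. The sketch below is plain portable C and assumes the caller supplies the page size (on Windows it is available as dwPageSize from GetSystemInfo). The helper name working_set_pages is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: number of pages needed to hold `bytes` bytes,
 * rounding up to a whole page. Returns 0 if page_size is 0. */
size_t working_set_pages(size_t bytes, size_t page_size)
{
    if (page_size == 0)
        return 0;                       /* guard against division by zero */
    return bytes / page_size + (bytes % page_size != 0);
}
```

For example, a 10 MB working set with 4 KB pages corresponds to 2560 resident pages.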


Nonpaged Pool

The kernel and device drivers use nonpaged pool to store data that might be accessed when the system can’t handle page faults. The kernel enters such a state when it executes interrupt service routines (ISRs) and deferred procedure calls (DPCs), which are functions related to hardware interrupts. Page faults are also illegal while the kernel or a device driver holds a spin lock. Because spin locks are the only type of lock that can be used within ISRs and DPCs, they must be used to protect data structures that are accessed both from within ISRs or DPCs and from other ISRs, DPCs, or code executing on kernel threads. Failure by a driver to honor these rules results in the most common crash code, IRQL_NOT_LESS_OR_EQUAL.

Nonpaged pool is therefore always kept present in physical memory and nonpaged pool virtual memory is assigned physical memory. Common system data structures stored in nonpaged pool include the kernel and objects that represent processes and threads, synchronization objects like mutexes, semaphores and events, references to files, which are represented as file objects, and I/O request packets (IRPs), which represent I/O operations.

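Nonpaged pool is a critically important system resource, and running short of it causes all kinds of abnormal system behavior. For example, http.sys starts refusing requests, so IIS can no longer serve clients, and Connections_Refused is recorded in C:\Windows\System32\LogFiles\HTTPERR. That entry means kernel NonPagedPool memory has dropped below 20 MB and http.sys has stopped accepting new connections.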
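
One way to check for this condition is to scan an HTTPERR log for the Connections_Refused reason phrase. The sketch below is portable C and assumes only that the phrase appears as plain text on each affected line. The function name count_connections_refused is illustrative, not part of any Windows API.

```c
#include <stdio.h>
#include <string.h>

/* Counts the lines of an open log stream that contain the
 * "Connections_Refused" reason phrase. Lines longer than the buffer are
 * read in several chunks, so the count is approximate for such lines. */
long count_connections_refused(FILE *log)
{
    char line[4096];
    long hits = 0;

    while (fgets(line, sizeof line, log) != NULL) {
        if (strstr(line, "Connections_Refused") != NULL)
            hits++;
    }
    return hits;
}
```

To use it, open one of the log files in the HTTPERR directory with fopen in read mode, pass the stream to the function, and close it afterwards. A non-zero result suggests that nonpaged pool pressure is worth investigating.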


Paged Pool

Paged pool, on the other hand, gets its name from the fact that Windows can write the data it stores to the paging file, allowing the physical memory it occupies to be repurposed. Just as for user-mode virtual memory, when a driver or the system references paged pool memory that’s in the paging file, an operation called a page fault occurs, and the memory manager reads the data back into physical memory. The largest consumer of paged pool, at least on Windows Vista and later, is typically the Registry, since references to registry keys and other registry data structures are stored in paged pool. The data structures that represent memory mapped files, called sections internally, are also stored in paged pool.

Device drivers use the ExAllocatePoolWithTag API to allocate nonpaged and paged pool, specifying the type of pool desired as one of the parameters. Another parameter is a 4-byte Tag, which drivers are supposed to use to uniquely identify the memory they allocate, and that can be a useful key for tracking down drivers that leak pool.
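
The 4-byte tag is simply a 32-bit integer whose bytes are ASCII characters. The following portable sketch (ordinary user-mode C, not driver code) shows one way to pack and unpack such a value. The names make_pool_tag and pool_tag_to_string are illustrative and do not belong to any Windows API.

```c
#include <stdint.h>

/* Packs four ASCII characters into a 32-bit tag so that c0 occupies the
 * lowest byte. On a little-endian machine the bytes then appear in memory
 * in the order c0 c1 c2 c3. */
uint32_t make_pool_tag(char c0, char c1, char c2, char c3)
{
    return  (uint32_t)(unsigned char)c0
         | ((uint32_t)(unsigned char)c1 << 8)
         | ((uint32_t)(unsigned char)c2 << 16)
         | ((uint32_t)(unsigned char)c3 << 24);
}

/* Unpacks a tag into a NUL-terminated string; out must hold 5 chars. */
void pool_tag_to_string(uint32_t tag, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((tag >> (8 * i)) & 0xFF);   /* lowest byte first */
    out[4] = '\0';
}
```

Here make_pool_tag('T', 'a', 'g', '1') yields 0x31676154, and unpacking that value gives back Tag1. In driver source the tag is usually written as a multi-character literal in reverse order (for example '1gaT'), so that it reads as Tag1 when pool usage is inspected.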


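The simplest way to see this memory information for the processes on a system is to open Task Manager and select the working set, paged pool, and non-paged pool columns, which show the usage of each process.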


The following program also prints the working set, paged pool, and non-paged pool information for the processes running on the system.

#include <windows.h>
#include <stdio.h>
#include <psapi.h>

// To ensure correct resolution of symbols, add Psapi.lib to TARGETLIBS
// and compile with -DPSAPI_VERSION=1

void PrintMemoryInfo( DWORD processID )
{
    HANDLE hProcess;
    PROCESS_MEMORY_COUNTERS pmc;

    // Print the process identifier.

    printf( "\nProcess ID: %lu\n", processID );

    // Print information about the memory usage of the process.

    hProcess = OpenProcess(  PROCESS_QUERY_INFORMATION |
                                    PROCESS_VM_READ,
                                    FALSE, processID );
    if (NULL == hProcess)
        return;

    if ( GetProcessMemoryInfo( hProcess, &pmc, sizeof(pmc)) )
    {
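        // PageFaultCount is a DWORD; the other fields are SIZE_T, which is
        // 64 bits wide in 64-bit builds, so cast them and print with %llX.
        printf( "\tPageFaultCount: 0x%08lX\n", pmc.PageFaultCount );
        printf( "\tPeakWorkingSetSize: 0x%016llX\n",
                  (unsigned long long)pmc.PeakWorkingSetSize );
        printf( "\tWorkingSetSize: 0x%016llX\n",
                  (unsigned long long)pmc.WorkingSetSize );
        printf( "\tQuotaPeakPagedPoolUsage: 0x%016llX\n",
                  (unsigned long long)pmc.QuotaPeakPagedPoolUsage );
        printf( "\tQuotaPagedPoolUsage: 0x%016llX\n",
                  (unsigned long long)pmc.QuotaPagedPoolUsage );
        printf( "\tQuotaPeakNonPagedPoolUsage: 0x%016llX\n",
                  (unsigned long long)pmc.QuotaPeakNonPagedPoolUsage );
        printf( "\tQuotaNonPagedPoolUsage: 0x%016llX\n",
                  (unsigned long long)pmc.QuotaNonPagedPoolUsage );
        printf( "\tPagefileUsage: 0x%016llX\n",
                  (unsigned long long)pmc.PagefileUsage );
        printf( "\tPeakPagefileUsage: 0x%016llX\n",
                  (unsigned long long)pmc.PeakPagefileUsage );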
    }

    CloseHandle( hProcess );
}

int main( void )
{
    // Get the list of process identifiers.

    DWORD aProcesses[1024], cbNeeded, cProcesses;
    unsigned int i;

    if ( !EnumProcesses( aProcesses, sizeof(aProcesses), &cbNeeded ) )
    {
        return 1;
    }

    // Calculate how many process identifiers were returned.

    cProcesses = cbNeeded / sizeof(DWORD);

    // Print the memory usage for each process

    for ( i = 0; i < cProcesses; i++ )
    {
        PrintMemoryInfo( aProcesses[i] );
    }

    return 0;
}
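
The program prints every counter in hexadecimal, which is hard to read at a glance. As a small optional addition, the sketch below formats a byte count in binary units. The helper name format_bytes is hypothetical, and it is written as portable C so it can be tried outside Windows as well.

```c
#include <stdio.h>

/* Formats a byte count into buf using binary units (1 KB = 1024 bytes).
 * Values below 1024 are printed exactly; larger values use one decimal. */
void format_bytes(unsigned long long bytes, char *buf, size_t buf_len)
{
    static const char *units[] = { "B", "KB", "MB", "GB", "TB" };
    double value = (double)bytes;
    size_t unit = 0;

    while (value >= 1024.0 && unit < 4) {
        value /= 1024.0;
        unit++;
    }
    if (unit == 0)
        snprintf(buf, buf_len, "%llu B", bytes);
    else
        snprintf(buf, buf_len, "%.1f %s", value, units[unit]);
}
```

Inside PrintMemoryInfo, one could call format_bytes(pmc.WorkingSetSize, buf, sizeof buf) with a local char buf[32] and print the result with %s, so that a 10 MB working set is shown as 10.0 MB.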

