Memory-Related Debugging

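This article describes how to obtain the e820 table, the memblock information, the memory range of each zone, the distribution of memory across nodes, and the zonelist order from the Linux kernel. The boot log, together with a few debug options, shows how memory is laid out and used at each of these levels.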

Getting the e820 table

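The boot log contains the e820 information. It is printed by e820_print_map(), called from setup_memory_map() (newer kernels name these e820__print_table() and e820__memory_setup()).

dmesg | grep e820

This gives: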

[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000ba5b1fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000ba5b2000-0x00000000ba5b8fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000ba5b9000-0x00000000bad8dfff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bad8e000-0x00000000bafb5fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000bafb6000-0x00000000ca8a1fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000ca8a2000-0x00000000ca939fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ca93a000-0x00000000ca977fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000ca978000-0x00000000caa3efff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000caa3f000-0x00000000caffefff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000cafff000-0x00000000caffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000cb800000-0x00000000cf9fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000022f5fffff] usable

Note that only the ranges marked usable are available to the kernel as RAM.
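To make the "usable" rule concrete, here is a minimal Python sketch that totals the usable ranges found in dmesg-style text. The function name usable_bytes and the regular expression are illustrative assumptions, not part of any kernel tooling; the sample lines are copied from the log above.

```python
import re

# Matches lines such as:
# [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
E820_LINE = re.compile(
    r"BIOS-e820: \[mem 0x([0-9a-f]+)-0x([0-9a-f]+)\] (.+)$"
)

def usable_bytes(log_text):
    """Sum the sizes of all e820 ranges whose type is 'usable'."""
    total = 0
    for line in log_text.splitlines():
        m = E820_LINE.search(line.strip())
        if m and m.group(3) == "usable":
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            total += end - start + 1  # the end address is inclusive
    return total

# Three lines taken from the boot log shown above.
SAMPLE = """\
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000cafff000-0x00000000caffffff] usable
"""

if __name__ == "__main__":
    print(hex(usable_bytes(SAMPLE)))
```

On a real machine the input would be the saved output of the dmesg command rather than the embedded sample.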

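Getting the memblock information

memblock information is not printed by default, although a few related messages occasionally show up in the log. To see the complete memblock state, enable memblock_debug by adding "memblock=debug" to the kernel command line.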
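Before searching the log, it can help to confirm that the parameter actually reached the running kernel. Linux exposes the boot command line in /proc/cmdline; the sketch below checks it. Both helper names are hypothetical.

```python
def has_memblock_debug(cmdline):
    """Return True if the given kernel command line contains memblock=debug."""
    # Parameters are separated by whitespace, so compare whole tokens.
    return "memblock=debug" in cmdline.split()

def running_kernel_has_memblock_debug(path="/proc/cmdline"):
    """Check the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return has_memblock_debug(f.read())
```

Calling running_kernel_has_memblock_debug() after a reboot should return True once the boot loader entry has been updated.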

dmesg | grep -A 15 "MEMBLOCK configuration"

Note that on x86 the configuration is dumped twice.

The first dump comes from e820__memblock_setup().

[    0.000000] MEMBLOCK configuration:
[    0.000000]  memory size = 0x00000001f9c4e800 reserved size = 0x0000000014aab27c
[    0.000000]  memory.cnt  = 0x7
[    0.000000]  memory[0x0] [0x0000000000001000-0x000000000009cfff], 0x000000000009c000 bytes flags: 0x0
[    0.000000]  memory[0x1] [0x0000000000100000-0x00000000ba5b1fff], 0x00000000ba4b2000 bytes flags: 0x0
[    0.000000]  memory[0x2] [0x00000000ba5b9000-0x00000000bad8dfff], 0x00000000007d5000 bytes flags: 0x0
[    0.000000]  memory[0x3] [0x00000000bafb6000-0x00000000ca8a1fff], 0x000000000f8ec000 bytes flags: 0x0
[    0.000000]  memory[0x4] [0x00000000ca93a000-0x00000000ca977fff], 0x000000000003e000 bytes flags: 0x0
[    0.000000]  memory[0x5] [0x00000000cafff000-0x00000000caffffff], 0x0000000000001000 bytes flags: 0x0
[    0.000000]  memory[0x6] [0x0000000100000000-0x000000022f5fffff], 0x000000012f600000 bytes flags: 0x0
[    0.000000]  reserved.cnt  = 0x4
[    0.000000]  reserved[0x0]   [0x00000000000fd450-0x00000000000fd6bb], 0x000000000000026c bytes flags: 0x0
[    0.000000]  reserved[0x1]   [0x00000000000fd740-0x00000000000fd74f], 0x0000000000000010 bytes flags: 0x0
[    0.000000]  reserved[0x2]   [0x0000000010f18000-0x0000000024783fff], 0x000000001386c000 bytes flags: 0x0
[    0.000000]  reserved[0x3]   [0x000000002f000000-0x000000003023efff], 0x000000000123f000 bytes flags: 0x0
[    0.000000] memblock_reserve: [0x000000000009d800-0x00000000000fffff] reserve_bios_regions+0x56/0x58

The second dump comes from numa_register_memblks(). The most important difference is that by this point each memblock region carries its NUMA node.

[    0.000000] MEMBLOCK configuration:
[    0.000000]  memory size = 0x00000001f9c4e800 reserved size = 0x0000000014b29800
[    0.000000]  memory.cnt  = 0x7
[    0.000000]  memory[0x0] [0x0000000000001000-0x000000000009cfff], 0x000000000009c000 bytes on node 0 flags: 0x0
[    0.000000]  memory[0x1] [0x0000000000100000-0x00000000ba5b1fff], 0x00000000ba4b2000 bytes on node 0 flags: 0x0
[    0.000000]  memory[0x2] [0x00000000ba5b9000-0x00000000bad8dfff], 0x00000000007d5000 bytes on node 0 flags: 0x0
[    0.000000]  memory[0x3] [0x00000000bafb6000-0x00000000ca8a1fff], 0x000000000f8ec000 bytes on node 0 flags: 0x0
[    0.000000]  memory[0x4] [0x00000000ca93a000-0x00000000ca977fff], 0x000000000003e000 bytes on node 0 flags: 0x0
[    0.000000]  memory[0x5] [0x00000000cafff000-0x00000000caffffff], 0x0000000000001000 bytes on node 0 flags: 0x0
[    0.000000]  memory[0x6] [0x0000000100000000-0x000000022f5fffff], 0x000000012f600000 bytes on node 0 flags: 0x0
[    0.000000]  reserved.cnt  = 0x7
[    0.000000]  reserved[0x0]   [0x0000000000000000-0x000000000000ffff], 0x0000000000010000 bytes on node 0 flags: 0x0
[    0.000000]  reserved[0x1]   [0x0000000000097000-0x000000000009cfff], 0x0000000000006000 bytes on node 0 flags: 0x0
[    0.000000]  reserved[0x2]   [0x000000000009d800-0x00000000000fffff], 0x0000000000062800 bytes on node 0 flags: 0x0
[    0.000000]  reserved[0x3]   [0x0000000010f18000-0x0000000024783fff], 0x000000001386c000 bytes on node 0 flags: 0x0
[    0.000000]  reserved[0x4]   [0x000000002f000000-0x000000003023efff], 0x000000000123f000 bytes on node 0 flags: 0x0
...
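The two dumps are easier to compare once parsed. The Python sketch below extracts each region and its optional node from lines in the format shown above, which also allows checking that every reported size equals end - start + 1. The names REGION, parse_memblock and total_size are illustrative assumptions.

```python
import re

# Matches region lines of the dump, with or without the "on node N" part, e.g.
#  memory[0x4] [0x...-0x...], 0x... bytes on node 0 flags: 0x0
REGION = re.compile(
    r"(memory|reserved)\[0x[0-9a-f]+\]\s+"
    r"\[0x([0-9a-f]+)-0x([0-9a-f]+)\], 0x([0-9a-f]+) bytes"
    r"(?: on node (\d+))?"
)

def parse_memblock(log_text):
    """Return (kind, start, end, size, node) tuples; node is None if absent."""
    regions = []
    for line in log_text.splitlines():
        m = REGION.search(line)
        if not m:
            continue  # headers such as "memory.cnt = 0x7" are skipped
        kind, start, end, size, node = m.groups()
        regions.append((kind, int(start, 16), int(end, 16), int(size, 16),
                        None if node is None else int(node)))
    return regions

def total_size(regions, kind):
    """Sum the reported sizes of all regions of one kind."""
    return sum(r[3] for r in regions if r[0] == kind)

# Lines copied from the two dumps above.
SAMPLE = """\
[    0.000000]  memory.cnt  = 0x7
[    0.000000]  memory[0x5] [0x00000000cafff000-0x00000000caffffff], 0x0000000000001000 bytes flags: 0x0
[    0.000000]  reserved[0x1]   [0x00000000000fd740-0x00000000000fd74f], 0x0000000000000010 bytes flags: 0x0
[    0.000000]  memory[0x4] [0x00000000ca93a000-0x00000000ca977fff], 0x000000000003e000 bytes on node 0 flags: 0x0
"""
```

Feeding the first and second dump through parse_memblock separately makes it straightforward to list which reserved regions were added between the two calls.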

Getting the memory range of each zone

The boot log shows how the zones are laid out on the system. This information is printed by free_area_init_nodes().

dmesg | grep -A 4 "Zone ranges:"

The result:

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x000000022f5fffff]
[    0.000000] Movable zone start for each node

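Getting the distribution of memory across nodes

The boot log also shows how memory is distributed across nodes. This information is printed by free_area_init_nodes() as well.

dmesg | grep -A 8 "node ranges"

The result: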

[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009cfff]
[    0.000000]   node   0: [mem 0x0000000000100000-0x00000000ba5b1fff]
[    0.000000]   node   0: [mem 0x00000000ba5b9000-0x00000000bad8dfff]
[    0.000000]   node   0: [mem 0x00000000bafb6000-0x00000000ca8a1fff]
[    0.000000]   node   0: [mem 0x00000000ca93a000-0x00000000ca977fff]
[    0.000000]   node   0: [mem 0x00000000cafff000-0x00000000caffffff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x000000022f5fffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000022f5fffff]

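Note that the zone ranges and the node ranges do not match.

This is because a zone range is only a pair of boundaries and may contain holes, whereas the node ranges list the physical memory that is actually present.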
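The difference can be quantified by intersecting each zone span with the node ranges. The Python sketch below does this with the addresses from the two logs above; the helper names present_bytes and span_bytes are hypothetical, and this is a user-space illustration rather than the kernel's own accounting.

```python
# Zone boundaries from the "Zone ranges:" output (end addresses inclusive).
ZONES = {
    "DMA":    (0x0000000000001000, 0x0000000000ffffff),
    "DMA32":  (0x0000000001000000, 0x00000000ffffffff),
    "Normal": (0x0000000100000000, 0x000000022f5fffff),
}

# Memory actually present on node 0, from "Early memory node ranges".
NODE0_RANGES = [
    (0x0000000000001000, 0x000000000009cfff),
    (0x0000000000100000, 0x00000000ba5b1fff),
    (0x00000000ba5b9000, 0x00000000bad8dfff),
    (0x00000000bafb6000, 0x00000000ca8a1fff),
    (0x00000000ca93a000, 0x00000000ca977fff),
    (0x00000000cafff000, 0x00000000caffffff),
    (0x0000000100000000, 0x000000022f5fffff),
]

def span_bytes(zone):
    """Size of the address span covered by a zone, holes included."""
    return zone[1] - zone[0] + 1

def present_bytes(zone, ranges):
    """Bytes of real memory in a zone: intersect its span with each range."""
    z_start, z_end = zone
    total = 0
    for start, end in ranges:
        lo, hi = max(start, z_start), min(end, z_end)
        if lo <= hi:  # the range overlaps the zone
            total += hi - lo + 1
    return total

if __name__ == "__main__":
    for name, zone in ZONES.items():
        print(name, hex(span_bytes(zone)), hex(present_bytes(zone, NODE0_RANGES)))
```

For DMA and DMA32 the present size is smaller than the span because of the holes between node ranges, while Normal is fully populated on this machine.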

Getting the zonelist order

Current kernels no longer seem to print this information out of the box, so I wrote a small patch to do it:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 560eafe8234d..3eb3a00a0dd2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5050,6 +5050,28 @@ static void set_zonelist_order(void)
        current_zonelist_order = user_zonelist_order;
 }

+static void dump_zonelist(pg_data_t *pgdat)
+{
+   int i;
+   struct zonelist *zonelist;
+
+   pr_info("FALLBACK ZONELIST of node[%d]\n", pgdat->node_id);
+
+   zonelist = &pgdat->node_zonelists[ZONELIST_FALLBACK];
+   for (i = 0; zonelist->_zonerefs[i].zone != NULL; i++) {
+       struct zone *z = zonelist->_zonerefs[i].zone;
+       pr_info("Node[%d]: %s\n", z->zone_pgdat->node_id, z->name);
+   }
+
+   pr_info("NOFALLBACK ZONELIST of node[%d]\n", pgdat->node_id);
+
+   zonelist = &pgdat->node_zonelists[ZONELIST_NOFALLBACK];
+   for (i = 0; zonelist->_zonerefs[i].zone != NULL; i++) {
+       struct zone *z = zonelist->_zonerefs[i].zone;
+       pr_info("Node[%d]: %s\n", z->zone_pgdat->node_id, z->name);
+   }
+}
+
 static void build_zonelists(pg_data_t *pgdat)
 {
    int i, node, load;
@@ -5202,12 +5224,14 @@ static int __build_all_zonelists(void *data)

    if (self && !node_online(self->node_id)) {
        build_zonelists(self);
+       dump_zonelist(self);
    }

    for_each_online_node(nid) {
        pg_data_t *pgdat = NODE_DATA(nid);

        build_zonelists(pgdat);
+       dump_zonelist(pgdat);
    }

    /*

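After rebuilding and installing the kernel, the boot log shows the output below. The NOFALLBACK sections repeat the FALLBACK ones, which suggests this capture came from a build whose second loop also indexed ZONELIST_FALLBACK; with ZONELIST_NOFALLBACK the second list would contain only the node's own zones.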

[    0.000000] FALLBACK ZONELIST of node[0]
[    0.000000] Node[0]: DMA32
[    0.000000] Node[0]: DMA
[    0.000000] Node[1]: Normal
[    0.000000] Node[1]: DMA32
[    0.000000] NOFALLBACK ZONELIST of node[0]
[    0.000000] Node[0]: DMA32
[    0.000000] Node[0]: DMA
[    0.000000] Node[1]: Normal
[    0.000000] Node[1]: DMA32
[    0.000000] FALLBACK ZONELIST of node[1]
[    0.000000] Node[1]: Normal
[    0.000000] Node[1]: DMA32
[    0.000000] Node[0]: DMA32
[    0.000000] Node[0]: DMA
[    0.000000] NOFALLBACK ZONELIST of node[1]
[    0.000000] Node[1]: Normal
[    0.000000] Node[1]: DMA32
[    0.000000] Node[0]: DMA32
[    0.000000] Node[0]: DMA

This is a system with two nodes, so the output contains data for both of them. Interesting.
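The fallback order in this output can be reproduced with a simplified model: the local node comes first, the other nodes follow, and within each node the zones are listed from highest to lowest. The Python sketch below is only an approximation. The kernel's build_zonelists() picks the next node by distance and load, which is replaced here by the difference between node IDs, an assumption that is sufficient for two nodes. The populated zones per node are inferred from the output above, and the function names are hypothetical.

```python
# Populated zones per node, lowest zone first, as implied by the output above.
NODE_ZONES = {0: ["DMA", "DMA32"], 1: ["DMA32", "Normal"]}

def fallback_zonelist(local, node_zones):
    """Local node first, then the others by |id - local|; zones high to low."""
    order = sorted(node_zones, key=lambda nid: (abs(nid - local), nid))
    return [(nid, zone) for nid in order for zone in reversed(node_zones[nid])]

def nofallback_zonelist(local, node_zones):
    """Only the local node's zones, highest first (the __GFP_THISNODE case)."""
    return [(local, zone) for zone in reversed(node_zones[local])]

if __name__ == "__main__":
    for nid in NODE_ZONES:
        print("FALLBACK", nid, fallback_zonelist(nid, NODE_ZONES))
        print("NOFALLBACK", nid, nofallback_zonelist(nid, NODE_ZONES))
```

Listing zones from highest to lowest means an allocation tries Normal before DMA32 and DMA, so the scarce low zones are used only when the higher ones cannot satisfy the request.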
