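The kmalloc() function is very similar to the user-space malloc() family of functions, except that it takes an additional flags argument. It obtains a chunk of kernel memory whose size is given in bytes.
- In <slab.h (include/linux)>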
- /*
- * Fallback definitions for an allocator not wanting to provide
- * its own optimized kmalloc definitions (like SLOB).
- */
- /**
- * kmalloc - allocate memory
- * @size: how many bytes of memory are required.
- * @flags: the type of memory to allocate.
- *
- * kmalloc is the normal method of allocating memory
- * in the kernel.
- *
- * The @flags argument may be one of:
- *
- * %GFP_USER - Allocate memory on behalf of user. May sleep.
- *
- * %GFP_KERNEL - Allocate normal kernel ram. May sleep.
- *
- * %GFP_ATOMIC - Allocation will not sleep.
- * For example, use this inside interrupt handlers.
- *
- * %GFP_HIGHUSER - Allocate pages from high memory.
- *
- * %GFP_NOIO - Do not do any I/O at all while trying to get memory.
- *
- * %GFP_NOFS - Do not make any fs calls while trying to get memory.
- *
- * Also it is possible to set different flags by OR'ing
- * in one or more of the following additional @flags:
- *
- * %__GFP_COLD - Request cache-cold pages instead of
- * trying to return cache-warm pages.
- *
- * %__GFP_DMA - Request memory from the DMA-capable zone.
- *
- * %__GFP_HIGH - This allocation has high priority and may use emergency pools.
- *
- * %__GFP_HIGHMEM - Allocated memory may be from highmem.
- *
- * %__GFP_NOFAIL - Indicate that this allocation is in no way allowed to fail
- * (think twice before using).
- *
- * %__GFP_NORETRY - If memory is not immediately available,
- * then give up at once.
- *
- * %__GFP_NOWARN - If allocation fails, don't issue any warnings.
- *
- * %__GFP_REPEAT - If allocation fails initially, try once more before failing.
- */
- static inline void *kmalloc(size_t size, gfp_t flags)
- {
-     return __kmalloc(size, flags);
- }
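The function returns a pointer to a block of memory that is at least size bytes long, or NULL if the allocation fails. The allocated region is physically contiguous.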
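The calling pattern can be sketched as follows. In real kernel code you would include <linux/slab.h> and call kmalloc() and kfree() directly. So that the sketch builds outside the kernel, it stubs both with malloc() and free(); the gfp_t typedef, the GFP_KERNEL value, struct dog, dog_create(), and dog_destroy() are illustrative assumptions, not kernel definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-ins (assumptions) so the sketch builds outside the kernel. */
typedef unsigned int gfp_t;
#define GFP_KERNEL ((gfp_t)0xd0u) /* __GFP_WAIT | __GFP_IO | __GFP_FS */

static void *kmalloc(size_t size, gfp_t flags)
{
    (void)flags;            /* the stub ignores the flags; the kernel does not */
    return malloc(size);
}

static void kfree(const void *objp)
{
    free((void *)objp);     /* free(NULL) is a no-op, like kfree(NULL) */
}

/* Hypothetical object, used only for illustration. */
struct dog {
    char name[16];
    int age;
};

/* Allocate and initialize a dog; returns NULL if the allocation fails. */
static struct dog *dog_create(const char *name, int age)
{
    struct dog *d;

    assert(name != NULL);
    d = kmalloc(sizeof(*d), GFP_KERNEL);
    if (!d)
        return NULL;        /* always check the result of kmalloc() */

    strncpy(d->name, name, sizeof(d->name) - 1);
    d->name[sizeof(d->name) - 1] = '\0';
    d->age = age;
    return d;
}

/* Release a dog obtained from dog_create(); NULL is accepted. */
static void dog_destroy(struct dog *d)
{
    kfree(d);
}
```

Using sizeof(*d) rather than sizeof(struct dog) keeps the allocation size correct if the pointer's type changes later.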
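- gfp_mask flags
These flags fall into three categories: action modifiers, zone modifiers, and types.
Action modifiers specify how the kernel should allocate the requested memory.
Zone modifiers specify where the memory should be allocated from.
Type flags combine action modifiers and zone modifiers, grouping the commonly needed combinations into named types so that the modifiers are simpler to use.
1. Action modifiers
- In <gfp.h>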
- /*
- * Action modifiers - doesn't change the zoning
- *
- * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
- * _might_ fail. This depends upon the particular VM implementation.
- *
- * __GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
- * cannot handle allocation failures.
- *
- * __GFP_NORETRY: The VM implementation must not retry indefinitely.
- */
- #define __GFP_WAIT ((__force gfp_t)0x10u) /* Can wait and reschedule? */
- #define __GFP_HIGH ((__force gfp_t)0x20u) /* Should access emergency pools? */
- #define __GFP_IO ((__force gfp_t)0x40u) /* Can start physical IO? */
- #define __GFP_FS ((__force gfp_t)0x80u) /* Can call down to low-level FS? */
- #define __GFP_COLD ((__force gfp_t)0x100u) /* Cache-cold page required */
- #define __GFP_NOWARN ((__force gfp_t)0x200u) /* Suppress page allocation failure warning */
- #define __GFP_REPEAT ((__force gfp_t)0x400u) /* Retry the allocation. Might fail */
- #define __GFP_NOFAIL ((__force gfp_t)0x800u) /* Retry for ever. Cannot fail */
- #define __GFP_NORETRY ((__force gfp_t)0x1000u)/* Do not retry. Might fail */
- #define __GFP_NO_GROW ((__force gfp_t)0x2000u)/* Slab internal usage */
- #define __GFP_COMP ((__force gfp_t)0x4000u)/* Add compound page metadata */
- #define __GFP_ZERO ((__force gfp_t)0x8000u)/* Return zeroed page on success */
- #define __GFP_NOMEMALLOC ((__force gfp_t)0x10000u) /* Don't use emergency reserves */
- #define __GFP_HARDWALL ((__force gfp_t)0x20000u) /* Enforce hardwall cpuset memory allocs */
- #define __GFP_THISNODE ((__force gfp_t)0x40000u)/* No fallback, no policies */
- #define __GFP_BITS_SHIFT 20 /* Room for 20 __GFP_FOO bits */
- #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
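Most allocations specify these modifiers, though usually not directly as shown here but through the type flags.
2. Zone modifiers
An allocation can normally be satisfied from any zone. The kernel prefers ZONE_NORMAL, however, which ensures that the other zones still have free pages when they are needed.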
- /*
- * GFP bitmasks..
- *
- * Zone modifiers (see linux/mmzone.h - low three bits)
- *
- * Do not put any conditional on these. If necessary modify the definitions
- * without the underscores and use the consistently. The definitions here may
- * be used in bit comparisons.
- */
- #define __GFP_DMA ((__force gfp_t)0x01u) /* allocate from ZONE_DMA */
- #define __GFP_HIGHMEM ((__force gfp_t)0x02u) /* allocate from ZONE_HIGHMEM or ZONE_NORMAL */
- #define __GFP_DMA32 ((__force gfp_t)0x04u)
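Specifying one of these flags changes the zone the kernel tries to allocate from. If none is specified, the kernel allocates from ZONE_DMA or ZONE_NORMAL, with a strong preference for ZONE_NORMAL.
Do not pass __GFP_HIGHMEM to __get_free_pages() or kmalloc(). Both functions return a logical address rather than a page structure, and memory taken from high memory might not yet be mapped into the kernel's virtual address space, so it might have no logical address at all. Only alloc_pages() can allocate high memory.
3. Type flags
Type flags specify the action and zone modifiers required for a particular kind of allocation.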
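The zone rules can be sketched in user space as follows. The flag values are copied from the listing above; the enum, the helper names, and the order in which the modifiers are tested are assumptions made for illustration, and ZONE_DMA32 is assumed to be the zone behind __GFP_DMA32.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;   /* stand-in for the kernel type */

/* Zone modifier values copied from the listing above. */
#define __GFP_DMA     ((gfp_t)0x01u)
#define __GFP_HIGHMEM ((gfp_t)0x02u)
#define __GFP_DMA32   ((gfp_t)0x04u)

/* Illustrative zone identifiers (not the kernel's enum). */
enum zone { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM };

/* Hypothetical helper: the zone an allocation with this mask starts from. */
static enum zone preferred_zone(gfp_t mask)
{
    if (mask & __GFP_DMA)
        return ZONE_DMA;
    if (mask & __GFP_DMA32)
        return ZONE_DMA32;
    if (mask & __GFP_HIGHMEM)
        return ZONE_HIGHMEM;
    return ZONE_NORMAL;       /* no zone modifier given */
}

/*
 * Hypothetical check: kmalloc() and __get_free_pages() return a logical
 * address, so a mask that requests high memory is not valid for them.
 */
static bool valid_for_kmalloc(gfp_t mask)
{
    return (mask & __GFP_HIGHMEM) == 0;
}

/* Hypothetical wrapper showing where such a check would sit. */
static enum zone kmalloc_zone(gfp_t mask)
{
    assert(valid_for_kmalloc(mask));
    return preferred_zone(mask);
}
```

A caller that really needs high memory would go through alloc_pages() instead, which returns a page structure rather than an address.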
- /* if you forget to add the bitmask here kernel will crash, period */
- #define GFP_LEVEL_MASK (__GFP_WAIT|__GFP_HIGH|__GFP_IO|__GFP_FS| \
-             __GFP_COLD|__GFP_NOWARN|__GFP_REPEAT| \
-             __GFP_NOFAIL|__GFP_NORETRY|__GFP_NO_GROW|__GFP_COMP| \
-             __GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE)
- /* This equals 0, but use constants in case they ever change */
- #define GFP_NOWAIT (GFP_ATOMIC & ~__GFP_HIGH)
- /* GFP_ATOMIC means both !wait (__GFP_WAIT not set) and use emergency pool */
- #define GFP_ATOMIC (__GFP_HIGH)
- #define GFP_NOIO (__GFP_WAIT)
- #define GFP_NOFS (__GFP_WAIT | __GFP_IO)
- #define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
- #define GFP_USER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
- #define GFP_HIGHUSER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL | \
-             __GFP_HIGHMEM)
- #ifdef CONFIG_NUMA
- #define GFP_THISNODE (__GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY)
- #else
- #define GFP_THISNODE ((__force gfp_t)0)
- #endif
- /* Flag - indicates that the buffer will be suitable for DMA. Ignored on some
- platforms, used as appropriate on others */
- #define GFP_DMA __GFP_DMA
- /* 4GB DMA on some platforms */
- #define GFP_DMA32 __GFP_DMA32
The most commonly used flag in the kernel is GFP_KERNEL. An allocation made with it may sleep, and it runs at normal priority.
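To see how a type flag decomposes into action modifiers, the user-space sketch below copies four modifier values and four type definitions from the listings above and tests individual bits. The gfp_t typedef and all helper function names are assumptions introduced for illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;   /* stand-in for the kernel type */

/* Action modifier values copied from the listing above. */
#define __GFP_WAIT ((gfp_t)0x10u)
#define __GFP_HIGH ((gfp_t)0x20u)
#define __GFP_IO   ((gfp_t)0x40u)
#define __GFP_FS   ((gfp_t)0x80u)

/* Type flags, defined exactly as in the listing above. */
#define GFP_ATOMIC (__GFP_HIGH)
#define GFP_NOIO   (__GFP_WAIT)
#define GFP_NOFS   (__GFP_WAIT | __GFP_IO)
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)

/* Hypothetical helpers: each one tests a single action modifier. */
static bool gfp_may_sleep(gfp_t mask)      { return (mask & __GFP_WAIT) != 0; }
static bool gfp_may_do_io(gfp_t mask)      { return (mask & __GFP_IO) != 0; }
static bool gfp_may_call_fs(gfp_t mask)    { return (mask & __GFP_FS) != 0; }
static bool gfp_uses_emergency(gfp_t mask) { return (mask & __GFP_HIGH) != 0; }

/* Hypothetical guard for code paths that must not sleep. */
static void assert_atomic_safe(gfp_t mask)
{
    assert(!gfp_may_sleep(mask));
}
```

With these definitions GFP_KERNEL may sleep, start I/O, and call into the filesystem, while GFP_ATOMIC does none of the three and only sets the emergency-pool bit.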
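How to use these flags
Situation | Flag to use |
Process context, can sleep | Use GFP_KERNEL |
Process context, cannot sleep | Use GFP_ATOMIC, or perform the allocation with GFP_KERNEL at an earlier or later point where you can sleep |
Interrupt handler | Use GFP_ATOMIC |
Softirq | Use GFP_ATOMIC |
Tasklet | Use GFP_ATOMIC |
Need DMA-capable memory, can sleep | Use (GFP_DMA|GFP_KERNEL) |
Need DMA-capable memory, cannot sleep | Use (GFP_DMA|GFP_ATOMIC), or perform the allocation at an earlier point where you can sleep |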
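The table can be condensed into a small selection function. This is a user-space sketch: the flag values come from the listings above, while the gfp_t typedef, the alloc_context enum, and pick_gfp() are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;   /* stand-in for the kernel type */

/* Values copied from the listings above. */
#define __GFP_DMA  ((gfp_t)0x01u)
#define __GFP_WAIT ((gfp_t)0x10u)
#define __GFP_HIGH ((gfp_t)0x20u)
#define __GFP_IO   ((gfp_t)0x40u)
#define __GFP_FS   ((gfp_t)0x80u)

#define GFP_ATOMIC (__GFP_HIGH)
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
#define GFP_DMA    __GFP_DMA

/* The situations listed in the table (hypothetical enum). */
enum alloc_context {
    CTX_PROCESS_CAN_SLEEP,
    CTX_PROCESS_NO_SLEEP,
    CTX_INTERRUPT,
    CTX_SOFTIRQ,
    CTX_TASKLET
};

/* Hypothetical helper: pick the mask the table recommends. */
static gfp_t pick_gfp(enum alloc_context ctx, bool need_dma)
{
    /* Only a process context that may sleep gets GFP_KERNEL. */
    gfp_t mask = (ctx == CTX_PROCESS_CAN_SLEEP) ? GFP_KERNEL : GFP_ATOMIC;

    if (need_dma)
        mask |= GFP_DMA;

    /* Every other context must end up with a mask that cannot sleep. */
    assert(ctx == CTX_PROCESS_CAN_SLEEP || !(mask & __GFP_WAIT));
    return mask;
}
```

The function deliberately ignores the table's alternative of moving the allocation to a point where sleeping is allowed, since that is a restructuring of the caller rather than a choice of flags.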
- kfree()
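This function frees a block of memory previously allocated by kmalloc(). Calling it on memory that was not allocated by kmalloc(), or on memory that has already been freed, has serious consequences. Note that calling kfree(NULL) is safe.
- In <slab.c (mm)>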
- /**
- * kfree - free previously allocated memory
- * @objp: pointer returned by kmalloc.
- *
- * If @objp is NULL, no operation is performed.
- *
- * Don't free memory not originally allocated by kmalloc()
- * or you will run into trouble.
- */
- void kfree(const void *objp)
- {
-     struct kmem_cache *c;
-     unsigned long flags;
-     if (unlikely(!objp))
-         return;
-     local_irq_save(flags);
-     kfree_debugcheck(objp);
-     c = virt_to_cache(objp);
-     debug_check_no_locks_freed(objp, obj_size(c));
-     __cache_free(c, (void *)objp);
-     local_irq_restore(flags);
- }
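Because kfree(NULL) is a no-op while freeing the same block twice is not, one common defensive habit is to reset a pointer right after freeing it. The sketch below illustrates that habit in user space, under stated assumptions: kfree() is stubbed on top of free(), the free_calls counter exists only to make the behavior observable, and kfree_and_null() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdlib.h>

static int free_calls;      /* counts real frees, for demonstration only */

/* User-space stand-in that mirrors the early return of the kernel function. */
static void kfree(const void *objp)
{
    if (!objp)
        return;             /* NULL: no operation is performed */
    free_calls++;
    free((void *)objp);
}

/* Hypothetical helper: free *pp and reset it so a repeated call is harmless. */
static void kfree_and_null(void **pp)
{
    assert(pp != NULL);
    kfree(*pp);
    *pp = NULL;
}
```

After the first kfree_and_null(&buf), a second call sees NULL and returns without touching the allocator, so an accidental repeat does not turn into a double free.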