linux-kernel - What is the purpose of the GFP_HARDWALL flag in the Linux kernel?

Question

GFP flags are used for memory allocation. What is the purpose of the GFP_HARDWALL flag in the Linux kernel?

score 6 · Accepted Answer

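It restricts the allocation to the current cpuset, where a cpuset is (I'm guessing!) a set of CPUs and memory nodes.

Basically, your user process may be confined to a cpuset made up of CPU#1 and CPU#2, but not CPU#3 or CPU#4. Perhaps some memory, MEM#1, is local to CPU#1 and CPU#2 but not to the others, so that memory is part of the cpuset. There may also be some other memory, MEM#2, that is local to CPU#3 and CPU#4, so it is not part of the cpuset.

__GFP_HARDWALL ensures that you cannot allocate from MEM#2.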
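The scenario above can be modelled in a few lines of user-space C. This is only a toy sketch, not kernel code: the node numbering (MEM#1 as node 0, MEM#2 as node 1), the `toy_cpuset` type and both helper functions are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical node numbering for the example: MEM#1 is node 0, MEM#2 is node 1. */
enum { MEM1 = 0, MEM2 = 1, MAX_NODES = 64 };

/* Toy stand-in for a task's cpuset: one bit per allowed memory node. */
struct toy_cpuset {
    uint64_t mems_allowed;
};

/* Returns true if `node` is one of the cpuset's allowed memory nodes. */
static bool toy_node_in_cpuset(const struct toy_cpuset *cs, int node)
{
    assert(node >= 0 && node < MAX_NODES);
    return (cs->mems_allowed >> node) & 1u;
}

/* The cpuset from the example: CPU#1 and CPU#2 with their local memory MEM#1 only. */
static struct toy_cpuset toy_example_cpuset(void)
{
    struct toy_cpuset cs = { .mems_allowed = UINT64_C(1) << MEM1 };
    return cs;
}
```

With this model, `toy_node_in_cpuset()` accepts MEM1 and rejects MEM2 for the example cpuset, which is the restriction a hardwall request enforces.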

score 3 · Accepted Answer

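I can't say for sure. You probably mean __GFP_HARDWALL, which is an internal symbol you shouldn't really be looking at. Nevertheless, here is what I found.

From include/linux/gfp.h:

The comment on #define __GFP_HARDWALL: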

/* Enforce hardwall cpuset memory allocs */

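Not that I really understand what hardwall means in that sentence, but you might. __GFP_HARDWALL is used in the definitions of GFP_USER, GFP_HIGHUSER, GFP_HIGHUSER_MOVABLE and GFP_CONSTRAINT_MASK. Judging from the first three, I guess it has something to do with user space.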
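To make the "used in the definitions of" point concrete, here is a minimal user-space sketch of how a composite mask can carry the hardwall bit. Everything prefixed `TOY_` or `toy_` is hypothetical: the bit values are illustrative rather than the kernel's, and `TOY_GFP_BASE` is just a placeholder for whatever other bits the real masks combine.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int toy_gfp_t;

/* Illustrative bit values only; the real ones live in include/linux/gfp.h. */
#define TOY_GFP_BASE      0x01u  /* placeholder for the other bits both masks carry */
#define TOY_GFP_HARDWALL  0x02u  /* stands in for __GFP_HARDWALL */

static_assert((TOY_GFP_BASE & TOY_GFP_HARDWALL) == 0,
              "flag bits must not overlap");

/* A kernel-style mask without the hardwall bit, and a user-style mask with it. */
#define TOY_GFP_KERNEL  (TOY_GFP_BASE)
#define TOY_GFP_USER    (TOY_GFP_BASE | TOY_GFP_HARDWALL)

/* Returns true if the request asks for hardwall cpuset enforcement. */
static bool toy_is_hardwall_request(toy_gfp_t gfp_mask)
{
    return (gfp_mask & TOY_GFP_HARDWALL) != 0;
}
```

In this sketch `toy_is_hardwall_request(TOY_GFP_USER)` is true and `toy_is_hardwall_request(TOY_GFP_KERNEL)` is false, mirroring the idea that user-space allocations are the ones marked with the bit.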

From kernel/cpuset.c:

The comment on the function __cpuset_node_allowed_softwall (partially omitted):

* cpuset_node_allowed_softwall - Can we allocate on a memory node?
* ...
* If we're in interrupt, yes, we can always allocate. If __GFP_THISNODE is
* set, yes, we can always allocate. If node is in our task's mems_allowed,
* yes. If it's not a __GFP_HARDWALL request and this node is in the nearest
* hardwalled cpuset ancestor to this task's cpuset, yes. If the task has been
* OOM killed and has access to memory reserves as specified by the TIF_MEMDIE
* flag, yes.
* Otherwise, no.
*
* If __GFP_HARDWALL is set, cpuset_node_allowed_softwall() reduces to
* cpuset_node_allowed_hardwall(). Otherwise, cpuset_node_allowed_softwall()
* might sleep, and might allow a node from an enclosing cpuset.
*
* cpuset_node_allowed_hardwall() only handles the simpler case of hardwall
* cpusets, and never sleeps.
*
* <OMITTED>
*
* GFP_USER allocations are marked with the __GFP_HARDWALL bit,
* and do not allow allocations outside the current tasks cpuset
* unless the task has been OOM killed as is marked TIF_MEMDIE.
* GFP_KERNEL allocations are not so marked, so can escape to the
* nearest enclosing hardwalled ancestor cpuset.
*
* Scanning up parent cpusets requires callback_mutex. The
* __alloc_pages() routine only calls here with __GFP_HARDWALL bit
* _not_ set if it's a GFP_KERNEL allocation, and all nodes in the
* current tasks mems_allowed came up empty on the first pass over
* the zonelist. So only GFP_KERNEL allocations, if all nodes in the
* cpuset are short of memory, might require taking the callback_mutex
* mutex.
*
* The first call here from mm/page_alloc:get_page_from_freelist()
* has __GFP_HARDWALL set in gfp_mask, enforcing hardwall cpusets,
* so no allocation on a node outside the cpuset is allowed (unless
* in interrupt, of course).
*
* <OMITTED>
*
* Rule:
* Don't call cpuset_node_allowed_softwall if you can't sleep, unless you
* pass in the __GFP_HARDWALL flag set in gfp_flag, which disables
* the code that might scan up ancestor cpusets and sleep.
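The decision order described in that comment can be sketched as a small user-space model. This is an assumption-laden illustration, not the kernel function: the `sketch_` names, the flag values and the `sketch_task` struct are all invented here, and the nearest hardwalled ancestor's node mask is simply passed in as a field. The real code has to scan up the cpuset hierarchy under a mutex and may sleep at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GFP_THISNODE  0x1u  /* stands in for __GFP_THISNODE */
#define SKETCH_GFP_HARDWALL  0x2u  /* stands in for __GFP_HARDWALL */

/* Hypothetical snapshot of the state the comment talks about. */
struct sketch_task {
    bool in_interrupt;       /* running in interrupt context? */
    bool oom_killed;         /* models the TIF_MEMDIE flag */
    uint64_t mems_allowed;   /* nodes of the task's own cpuset, one bit each */
    uint64_t ancestor_mems;  /* nodes of the nearest hardwalled ancestor cpuset */
};

static bool sketch_node_isset(uint64_t mask, int node)
{
    assert(node >= 0 && node < 64);
    return (mask >> node) & 1u;
}

/* Can the task allocate on `node` for a request with `gfp_mask`? */
static bool sketch_node_allowed(const struct sketch_task *t, int node,
                                unsigned int gfp_mask)
{
    if (t->in_interrupt || (gfp_mask & SKETCH_GFP_THISNODE))
        return true;                 /* interrupt or this-node request */
    if (sketch_node_isset(t->mems_allowed, node))
        return true;                 /* node is in the task's own cpuset */
    if (t->oom_killed)
        return true;                 /* OOM-killed task may use reserves */
    if (gfp_mask & SKETCH_GFP_HARDWALL)
        return false;                /* hardwall request: never look at ancestors */
    /* Softwall case: the kernel would scan up the hierarchy here. */
    return sketch_node_isset(t->ancestor_mems, node);
}
```

For a task whose own cpuset holds node 0 and whose hardwalled ancestor holds nodes 0 and 1, a hardwall request for node 1 is refused, while the same request without the hardwall bit is allowed to escape to the ancestor.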

The same file also mentions hardwall cpusets and hardwalled memory. I'm still not sure what hardwall means exactly, but let's follow it to cpusets.

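(There are many mentions of hardwall in the tile architecture code, but since that is the only architecture that mentions it, I believe it is unrelated to what we are talking about here.)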

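I hit the jackpot. The documentation on cpusets says:

1.4 What are exclusive cpusets?

If a cpuset is cpu or mem exclusive, no other cpuset, other than a direct ancestor or descendant, may share any of the same CPUs or Memory Nodes.

A cpuset that is cpuset.mem_exclusive or cpuset.mem_hardwall is "hardwalled", i.e. it restricts kernel allocations for page, buffer and other data commonly shared by the kernel across multiple users. All cpusets, whether hardwalled or not, restrict allocations of memory for user space. This enables configuring a system so that several independent jobs can share common kernel data, such as file system pages, while isolating each job's user allocation in its own cpuset. To do this, construct a large mem_exclusive cpuset to hold all the jobs, and construct child, non-mem_exclusive cpusets for each individual job. Only a small amount of typical kernel memory, such as requests from interrupt handlers, is allowed to be taken outside even a mem_exclusive cpuset.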
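The layout the documentation suggests (one large mem_exclusive cpuset holding several non-exclusive child cpusets) can be illustrated by walking up a parent chain until a hardwalled cpuset is found. This is a hedged user-space sketch: the `model_cpuset` struct and both functions are hypothetical stand-ins, and only the two boolean fields correspond to the cpuset.mem_exclusive and cpuset.mem_hardwall settings quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down cpuset: only what the walk needs. */
struct model_cpuset {
    const struct model_cpuset *parent;  /* NULL for the root cpuset */
    bool mem_exclusive;                 /* models cpuset.mem_exclusive */
    bool mem_hardwall;                  /* models cpuset.mem_hardwall */
};

/* A cpuset is "hardwalled" if either setting is on. */
static bool model_is_hardwalled(const struct model_cpuset *cs)
{
    return cs->mem_exclusive || cs->mem_hardwall;
}

/*
 * Starting at `cs` itself, walk towards the root and stop at the first
 * hardwalled cpuset; if none is found, the root is returned.
 */
static const struct model_cpuset *
model_nearest_hardwall_ancestor(const struct model_cpuset *cs)
{
    assert(cs != NULL);
    while (!model_is_hardwalled(cs) && cs->parent != NULL)
        cs = cs->parent;
    return cs;
}
```

Given a root, a mem_exclusive "jobs" cpuset below it and a plain per-job child, the walk from the child stops at "jobs". That is the boundary within which, per the quoted text, kernel allocations for shared data may spread while each job's user allocations stay inside its own child cpuset.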

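I'll leave it to you from here, because any conclusion I draw may well be wrong. Hopefully someone more knowledgeable in this particular area will come along and enlighten us.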
