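This happens on a 2.6.26-2-amd64 Linux kernel when trying to map a 5GB file with copy-on-write semantics (PROT_READ | PROT_WRITE and MAP_PRIVATE). Mapping files smaller than 4GB, or using only PROT_READ, works fine. This is not a soft resource limit issue as reported in this question; the virtual memory limit is unlimited.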

Here is the code that reproduces the problem (the actual code is part of Boost.Interprocess).

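#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        struct stat b;
        void *base;
        int fd = open("foo.bin", O_RDWR);

        /* Bail out early if the file cannot be opened or stat'ed. */
        if (fd == -1 || fstat(fd, &b) == -1) {
                perror("foo.bin");
                return 1;
        }
        /* Private (copy-on-write), writable mapping of the whole file. */
        base = mmap(0, b.st_size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED) {
                perror("mmap");
                return 1;
        }
        return 0;
}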

And here is what happens:

dd if=/dev/zero of=foo.bin bs=1M seek=5000 count=1
./test-mmap
mmap: Cannot allocate memory

Here is the relevant strace output (from a freshly compiled 4.5.20), as requested by nos.

open("foo.bin", O_RDWR)                 = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=5243928576, ...}) = 0
mmap(NULL, 5243928576, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = -1 ENOMEM (Cannot allocate memory)
dup(2)                                  = 4
[...]
write(4, "mmap: Cannot allocate memory\n", 29mmap: Cannot allocate memory
) = 29

2 Answers

Try passing MAP_NORESERVE in the flags field, like this:

mmap(NULL, b.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE, fd, 0);

Your swap and physical memory combined are probably smaller than the 5GB requested.
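
A minimal sketch of how that flag could be wrapped, assuming you want to keep the kernel's normal accounting when it succeeds and only fall back to MAP_NORESERVE when mmap fails with ENOMEM. The helper name map_cow and its used_noreserve out-parameter are hypothetical and not part of Boost.Interprocess or any other library.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_NORESERVE under strict -std= modes on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map the first `len` bytes of `fd` privately
 * (copy-on-write) and writable.  The first attempt uses normal
 * accounting; on ENOMEM it retries once with MAP_NORESERVE.
 * Returns MAP_FAILED with errno set if both attempts fail.
 * If `used_noreserve` is non-NULL it is set to 1 when the fallback
 * produced the mapping, and to 0 otherwise.
 */
void *map_cow(int fd, size_t len, int *used_noreserve)
{
        void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE, fd, 0);

        if (used_noreserve)
                *used_noreserve = 0;

        if (base == MAP_FAILED && errno == ENOMEM) {
                /* Retry without reserving memory/swap for the mapping. */
                base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_NORESERVE, fd, 0);
                if (base != MAP_FAILED && used_noreserve)
                        *used_noreserve = 1;
        }
        return base;
}
```

In the test program from the question this would replace the direct mmap call, e.g. base = map_cow(fd, b.st_size, NULL). Keep in mind that a mapping obtained through the fallback gives no guarantee that later writes to it will succeed.
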

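Alternatively, for testing purposes you can tell the kernel to always overcommit (mode 1, the default is already 0); if the mapping then succeeds, make the code change above:

# echo 1 > /proc/sys/vm/overcommit_memory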

Here are the relevant excerpts from the manual pages.

mmap(2):

   MAP_NORESERVE
          Do  not reserve swap space for this mapping.  When swap space is
          reserved, one has the guarantee that it is  possible  to  modify
          the  mapping.   When  swap  space  is not reserved one might get
          SIGSEGV upon a write if no physical memory  is  available.   See
          also  the  discussion of the file /proc/sys/vm/overcommit_memory
          in proc(5).  In kernels before 2.6, this flag  only  had  effect
          for private writable mappings.

proc(5):

   /proc/sys/vm/overcommit_memory
          This file contains the kernel virtual  memory  accounting  mode.
          Values are:

                 0: heuristic overcommit (this is the default)
                 1: always overcommit, never check
                 2: always check, never overcommit

          In  mode 0, calls of mmap(2) with MAP_NORESERVE are not checked,
          and the default check is very weak, leading to the risk of  get‐
          ting a process "OOM-killed".  Under Linux 2.4 any non-zero value
          implies mode 1.  In mode 2  (available  since  Linux  2.6),  the
          total  virtual  address  space on the system is limited to (SS +
          RAM*(r/100)), where SS is the size of the swap space, and RAM is
          the  size  of  the physical memory, and r is the contents of the
          file /proc/sys/vm/overcommit_ratio.
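
To make the mode 2 formula from the excerpt concrete, here is a small sketch that evaluates SS + RAM*(r/100) in kB. The function name commit_limit_kb is hypothetical, and this is only the formula as the man page states it; the kernel does its own accounting in pages, so real limits may differ slightly.

```c
/*
 * Hypothetical helper: mode 2 commit limit as given in proc(5),
 *   SS + RAM * (r / 100)
 * with all sizes in kB and r taken from overcommit_ratio.
 * Multiplying before dividing avoids losing precision to integer division.
 */
unsigned long long commit_limit_kb(unsigned long long ram_kb,
                                   unsigned long long swap_kb,
                                   unsigned int ratio)
{
        return swap_kb + ram_kb * ratio / 100;
}
```

With the figures quoted elsewhere in this thread (MemTotal 4063428 kB, SwapTotal 514072 kB, overcommit_ratio 50), commit_limit_kb(4063428, 514072, 50) gives 2545786 kB, so under mode 2 the whole system could commit only about 2.4GB.
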
answered 2010-09-01T03:53:28.780

Quoting your memory, swap size and overcommit settings from your comments:

MemTotal: 4063428 kB
SwapTotal: 514072 kB
$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio 
50

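With overcommit_memory set to 0 ("heuristic overcommit"), you can't create a private, writable mapping that is larger than the current total of free memory and swap. Since you only have about 4.5GB of memory plus swap in total, a 5GB mapping can never pass that check.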
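
The arithmetic behind that can be sketched as follows. The function name shortfall_kb is hypothetical, and using MemTotal + SwapTotal is a deliberately generous assumption: the heuristic looks at memory that is currently free, so the real ceiling is lower than this upper bound.

```c
/*
 * Hypothetical helper: how many kB a mapping of `map_bytes` exceeds
 * RAM + swap by, or 0 if it fits.  The byte count is rounded up to
 * whole kB before comparing.
 */
unsigned long long shortfall_kb(unsigned long long map_bytes,
                                unsigned long long ram_kb,
                                unsigned long long swap_kb)
{
        unsigned long long map_kb = (map_bytes + 1023) / 1024;
        unsigned long long avail_kb = ram_kb + swap_kb;

        return map_kb > avail_kb ? map_kb - avail_kb : 0;
}
```

For the values in this thread, the mapping is 5243928576 bytes (5121024 kB) while MemTotal + SwapTotal is 4063428 + 514072 = 4577500 kB, so shortfall_kb(5243928576, 4063428, 514072) returns 543524: the request is roughly 530MB larger than all RAM and swap combined, even before anything else on the system uses memory.
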

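Your options are either to use MAP_NORESERVE (as Matt Joiner suggested), if you are sure that you will never dirty (write to) more pages in the mapping than you have free memory and swap for, or to significantly increase the size of your swap space.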

answered 2010-09-01T06:00:07.793