
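I am running a program in a k8s pod container that has the SYS_ADMIN capability. The program allocates a 2 MB huge page, which succeeds. It then calls mlock() on that memory, which fails.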

I went through the reasons for ENOMEM listed in the man page, and none of them explains this failure.

I tried running the program on the host, and it works fine.

I also tried running the program in a plain docker container with SYS_ADMIN, using the same image.

I compared the OCI config.json files for the direct docker case and the k8s case shown below, and did not see any interesting differences.

Versions

axe@axe-tester:~$ cat /proc/version
Linux version 4.15.0-29-generic (buildd@lgw01-amd64-057) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018
axe@axe-tester:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:23:09Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:14:56Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
axe@axe-tester:~$ docker version
Client:
 Version:           18.09.2
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        6247962
 Built:             Tue Feb 26 23:52:23 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.09.2
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       6247962
  Built:            Wed Feb 13 00:24:14 2019
  OS/Arch:          linux/amd64
  Experimental:     false

The following yaml file uses the test program located at /tmp/test

apiVersion: v1
kind: Pod
metadata:
  name: test
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: docker/default
spec:
  restartPolicy: Never
  containers:
    - name: t
      image: amazonlinux:2
      imagePullPolicy: Never
      command: ["sleep", "1200"]
      securityContext:
        capabilities:
          add: ["SYS_ADMIN", "IPC_LOCK"]
      volumeMounts:
      - mountPath: /test
        name: test
      resources:
        limits:
          hugepages-2Mi: 100Mi
          memory: 100Mi
        requests:
          memory: 100Mi
  volumes:
  - name: test
    hostPath:
      path: /tmp/test

Steps to reproduce:

 kubectl create -f /tmp/test.yml
 kubectl exec -it  test  -- /bin/bash
 # in the container..
 bash-4.2# /test 
 Previous limits: soft=16777216; hard=16777216
 mlock failed: Cannot allocate memory
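
Exceeding the soft RLIMIT_MEMLOCK limit is one of the documented ENOMEM causes for mlock(), so it is worth ruling out first. The sketch below compares a request against that limit. The helper names memlock_limit_allows and memlock_soft_limit are hypothetical, and the check ignores memory the process has already locked. With the 16777216 bytes printed above, a 2 MB request fits, and a process holding IPC_LOCK is not bound by this limit anyway, so the limit does not appear to be the cause here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: true if a soft RLIMIT_MEMLOCK of `soft` bytes
 * permits locking `size` bytes. Simplification: memory that is already
 * locked by the process is not taken into account. */
bool memlock_limit_allows(rlim_t soft, size_t size)
{
    return soft == RLIM_INFINITY || (rlim_t)size <= soft;
}

/* Hypothetical helper: fetch the calling process's soft RLIMIT_MEMLOCK.
 * Returns 0 on success, -1 if getrlimit() fails. */
int memlock_soft_limit(rlim_t *soft)
{
    struct rlimit lim;

    assert(soft != NULL);
    if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0)
        return -1;
    *soft = lim.rlim_cur;
    return 0;
}
```

Calling memlock_soft_limit() and passing the result to memlock_limit_allows() together with MMAP_MIN_SIZE would tell whether the 2 MB request is within the limit before mlock() is attempted.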

Test program

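#define _GNU_SOURCE /* for MAP_ANONYMOUS and MAP_HUGETLB */
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB (21 << 26) /* log2(2 MB) << MAP_HUGE_SHIFT, for older headers */
#endif

#define MMAP_FLAGS (MAP_PRIVATE | MAP_HUGETLB | MAP_HUGE_2MB | MAP_ANONYMOUS)
#define MMAP_MIN_SIZE (2 * 1024 * 1024)

/* Map `size` bytes backed by 2 MB huge pages and pin them.
 * Returns MAP_FAILED on error. */
void *dma_mp_mmap_hugetlb(size_t size)
{
    int err = 0;
    void *va = NULL;

    va = mmap(0, size, PROT_READ | PROT_WRITE, MMAP_FLAGS, -1, 0);
    if (va == MAP_FAILED) {
        perror("mmap failed");
        return MAP_FAILED;
    }

    /* Pin the memory */
    err = mlock(va, size);
    if (err) {
        perror("mlock failed");
        return MAP_FAILED;
    }

    return va;
}

int main(void)
{
    struct rlimit old;

    if (getrlimit(RLIMIT_MEMLOCK, &old) != 0) {
        perror("getrlimit failed");
        return 1;
    }
    printf("Previous limits: soft=%lld; hard=%lld\n",
           (long long) old.rlim_cur, (long long) old.rlim_max);

    /* The helper reports failure as MAP_FAILED, not NULL. */
    assert(dma_mp_mmap_hugetlb(MMAP_MIN_SIZE) != MAP_FAILED);
    return 0;
}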

1 Answer


Check the parent cgroup, for example /sys/fs/cgroup/hugetlb/kubepods.slice/hugetlb.2MB.limit_in_bytes. It may be 0 or smaller than what your pod requests, in which case you have to restart the kubelet to refresh the cgroup.
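
A minimal sketch of that check in C, assuming the cgroup v1 path quoted above (the exact directory depends on the cgroup driver and layout on the node). The names HUGETLB_LIMIT_PATH, read_cgroup_limit and hugetlb_limit_allows are made up for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Path taken from the answer; an assumption about the node's cgroup layout. */
#define HUGETLB_LIMIT_PATH \
    "/sys/fs/cgroup/hugetlb/kubepods.slice/hugetlb.2MB.limit_in_bytes"

/* Hypothetical helper: read one unsigned decimal value from a cgroup
 * control file. Returns 0 on success, -1 if the file cannot be opened
 * or does not start with a number. */
int read_cgroup_limit(const char *path, unsigned long long *limit)
{
    char buf[64];
    char *end = NULL;
    char *line;
    unsigned long long value;
    FILE *f;

    assert(path != NULL && limit != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    line = fgets(buf, sizeof buf, f);
    fclose(f);
    if (line == NULL)
        return -1;

    errno = 0;
    value = strtoull(buf, &end, 10);
    if (errno != 0 || end == buf)
        return -1;

    *limit = value;
    return 0;
}

/* Hypothetical helper: true if `request` bytes of huge pages fit in `limit`. */
bool hugetlb_limit_allows(unsigned long long limit, unsigned long long request)
{
    return request <= limit;
}
```

Run on the node, read_cgroup_limit(HUGETLB_LIMIT_PATH, &limit) followed by hugetlb_limit_allows(limit, 100ULL * 1024 * 1024) would show whether the parent cgroup can accommodate the 100Mi of hugepages-2Mi that the pod spec asks for.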

Answered 2020-12-21T13:48:42.580