Set vm.max_map_count on cluster nodes

Posted 2019-02-23 10:22

Question:

I am trying to install Elasticsearch (latest version) on the nodes of a Google Container Engine cluster, but Elasticsearch requires vm.max_map_count to be >= 262144.

If I SSH into every node and manually run:

sysctl -w vm.max_map_count=262144

everything works, but any new node will not have this setting.

So my question is:

Is there a way to load a system configuration on every node at boot time? A DaemonSet would not be a good solution because system variables are read-only inside a Docker container.

I'm using a freshly created cluster with the gci node image.

Answer 1:

You should be able to use a DaemonSet to emulate the behavior of a startup script. If the script needs to do root-level actions on the node, you can configure the DaemonSet pods to run in privileged mode.

For an example of how to do this, see https://github.com/kubernetes/contrib/tree/master/startup-script



Answer 2:

I found another solution while looking at this repository.

It relies on an init container. The advantage is that only the init container runs with privileges:

annotations:
    pod.beta.kubernetes.io/init-containers: '[
      {
        "name": "sysctl",
        "image": "busybox",
        "imagePullPolicy": "IfNotPresent",
        "command": ["sysctl", "-w", "vm.max_map_count=262144"],
        "securityContext": {
          "privileged": true
        }
      }
    ]'

A new syntax has been available since Kubernetes 1.6, and the annotation form above still works in 1.7. Starting with 1.8, the new syntax is required. Init containers are now declared in the initContainers field of the pod spec:

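  # Goes under the pod spec, next to "containers:"
  initContainers:
  - name: init-sysctl
    image: busybox
    command:
    - sysctl
    - -w
    - vm.max_map_count=262144
    imagePullPolicy: IfNotPresent
    # Privileged mode is needed to write the kernel parameter
    securityContext:
      privileged: true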
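
To show where the init container sits relative to the main container, here is a minimal sketch of a complete Pod manifest. The Pod name, the container name "elasticsearch", and the image tag are assumptions chosen for illustration and are not taken from the original answer. Because vm.max_map_count is not namespaced, the value written by the privileged init container applies to the whole node, so the unprivileged Elasticsearch container sees it too.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: es-example                  # hypothetical name
spec:
  initContainers:
  # Runs to completion before the main container starts
  - name: init-sysctl
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["sysctl", "-w", "vm.max_map_count=262144"]
    securityContext:
      privileged: true              # only this container is privileged
  containers:
  - name: elasticsearch             # hypothetical name
    # Image and tag are assumptions; substitute the ones you actually deploy
    image: docker.elastic.co/elasticsearch/elasticsearch:6.6.1
```

The same initContainers block can be placed in the pod template of a Deployment or StatefulSet.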


Answer 3:

As Robert pointed out, a DaemonSet can act as a startup script. Unfortunately, GKE will only let you run a DaemonSet with restartPolicy set to Always.

So, to prevent k8s from continually restarting the container after sysctl has run, the container has to sleep after the setup, and it should preferably run only on selected nodes. It isn't an elegant solution, but it's elastic at least.

Example:

es-host-setup Dockerfile:

FROM alpine
CMD sysctl -w vm.max_map_count=262144; sleep 365d

DaemonSet resource file:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: es-host-setup
spec:
  template:
    metadata:
      labels:
        name: es-host-setup
    spec:
      containers:
      - name: es-host-setup
        image: es-host-setup
        securityContext:
          privileged: true
      restartPolicy: Always
      nodeSelector:
        pool: elasticsearch
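
The manifest above uses the extensions/v1beta1 API group, which Kubernetes stopped serving for DaemonSets in version 1.16. On newer clusters, a rough equivalent under apps/v1 might look like the sketch below. It assumes the same es-host-setup image and the same pool: elasticsearch node label as above. The only structural change is the spec.selector field, which apps/v1 requires and which must match the template labels.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: es-host-setup
spec:
  # Required in apps/v1; must match the pod template labels below
  selector:
    matchLabels:
      name: es-host-setup
  template:
    metadata:
      labels:
        name: es-host-setup
    spec:
      # Only schedule onto nodes labeled pool=elasticsearch
      nodeSelector:
        pool: elasticsearch
      containers:
      - name: es-host-setup
        image: es-host-setup
        securityContext:
          privileged: true
```

restartPolicy is omitted here because Always is the default for pod templates.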