Docker socket crashes after stack deploy

Posted 2019-04-09 15:47

I am trying to deploy a Docker stack that contains my development environment, but at random I get the following error:

failed to create service < service_name >: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Then I restart the Docker daemon; sometimes I also have to kill the Docker processes and shims. I delete the old stack and deploy again. Sometimes Docker finishes the build successfully, but the socket crashes at the startup stage.
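
Before restarting the daemon it can help to record what state it was in. The following is only a sketch under assumptions: the helper names `socket_state` and `collect_docker_diagnostics` are made up for illustration, and the second one assumes a systemd host with a unit named `docker`.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path is a Unix socket.
# The default path is the one from the error message above.
socket_state() {
    path="${1:-/var/run/docker.sock}"
    if [ -S "$path" ]; then
        echo "socket present: $path"
    elif [ -e "$path" ]; then
        echo "not a socket: $path"
        return 1
    else
        echo "missing: $path"
        return 1
    fi
}

# Hypothetical helper: collect daemon state before restarting anything.
# Assumes systemd and a unit called "docker" (an assumption, adjust as needed).
collect_docker_diagnostics() {
    socket_state "${1:-/var/run/docker.sock}"
    systemctl is-active docker
    journalctl -u docker --no-pager -n 50
}
```

Calling `collect_docker_diagnostics` right after the error appears shows whether the socket file is gone or the daemon itself has stopped, and the last journal lines usually say why.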

All containers work properly when I start them in regular mode, without swarm or stack. The problem occurs only inside the swarm.

I used the following command to deploy the stack:

$ docker stack deploy dev-env-stc -c docker-compose.yml

The environment runs on Antergos Linux (Arch-based).

The layout is as shown in the diagram [image].

The Nginx container and the Docker networks were created with the following commands:

$ docker run --detach --name nginx-main \
    --net dev-env-ext --ip 10.20.20.10 \
    --publish 80:80 --publish 443:443 \
    --volume /env-vol/nginx/conf:/etc/nginx:ro \
    --volume /env-vol/nginx/www:/usr/var/www \
    --volume /env-vol/nginx/logs:/usr/var/logs \
    --volume /env-vol/nginx/run:/usr/var/run \
    --volume /env-vol/ssl:/usr/var/ssl:ro \
    nginx-webserver

$ docker network create --driver=bridge --attachable --ipv6 --subnet fd19:eb5a:3d2f:f15d::/48 --subnet 10.20.20.0/24 --gateway 10.20.20.1 dev-env-ext

$ docker network create --driver=bridge --attachable --ipv6 --subnet fd19:eb5a:3e30:f15d::/48 --subnet 10.20.30.0/24 --gateway 10.20.30.1 dev-env-int

$ docker network create --driver=overlay --attachable --ipv6 --subnet fd19:eb5a:3c1e:f15d::/48 --subnet 10.20.40.0/24 --gateway 10.20.40.1 dev-env-swarm

$ docker network connect dev-env-swarm --ip=10.20.40.10 nginx-main

$ docker network connect dev-env-int --ip=10.20.30.10 nginx-main

My docker-compose.yml file:

version: '3.6'
volumes:
  postgres-data:
    driver: local
  redis-data:
    driver: local
networks:
  dev-env-swarm:
    external: true
services:
  gitlab:
    image: gitlab/gitlab-ce:latest
    hostname: gitlab.testenv.top
    external_links: 
      - nginx-main
    ports:
      - 22:22
    healthcheck:
      test: ["CMD", "curl", "-f", "https://localhost:443"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      mode: global
      endpoint_mode: vip
      resources:
        limits:
          cpus: "0.50"
          memory: 4096M
        reservations:
          cpus: "0.10"
          memory: 512M
      restart_policy:
        condition: on-failure
        delay: 20s
        max_attempts: 3
        window: 300s
    networks:
      dev-env-swarm:
        aliases:
          - gitlab.testenv.top
    dns: 
      - 10.10.10.10
      - 8.8.8.8
    volumes:
      - /env-vol/gitlab/config:/etc/gitlab
      - /env-vol/gitlab/logs:/var/log/gitlab
      - /env-vol/gitlab/data:/var/opt/gitlab
  redis:
    env_file: .env
    image: redis:3.2.6-alpine
    hostname: redis.testenv.top
    external_links: 
      - nginx-main
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:6379"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      mode: global
      endpoint_mode: dnsrr
      resources:
        limits:
          cpus: "0.20"
          memory: 1024M
        reservations:
          cpus: "0.05"
          memory: 128M
      restart_policy:
        condition: on-failure
        delay: 20s
        max_attempts: 3
        window: 60s
    volumes:
      - redis-data:/var/lib/redis
    command: redis-server --appendonly yes
    networks:
      dev-env-swarm:
        aliases:
          - redis.testenv.top
    dns:
      - 10.10.10.10
      - 8.8.8.8
  redisco:
    image: rediscommander/redis-commander:latest
    hostname: redisco.testenv.top
    external_links: 
      - nginx-main
    depends_on:
      - redis
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      mode: global
      endpoint_mode: dnsrr
      resources:
        limits:
          cpus: "0.20"
          memory: 512M
        reservations:
          cpus: "0.05"
          memory: 256M
      restart_policy:
        condition: on-failure
        delay: 20s
        max_attempts: 3
        window: 60s
    networks:
      dev-env-swarm:
        aliases:
          - redisco.testenv.top
    dns: 
      - 10.10.10.10
      - 8.8.8.8
    environment: 
      REDIS_PORT: 6379
      REDIS_HOST: redis.testenv.top
  plantuml:
    image: plantuml/plantuml-server:tomcat
    hostname: plantuml.testenv.top
    external_links: 
      - nginx-main
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      mode: global
      endpoint_mode: dnsrr
      resources:
        limits:
          cpus: "0.20"
          memory: 1024M
        reservations:
          cpus: "0.05"
          memory: 256M
      restart_policy:
        condition: on-failure
        delay: 20s
        max_attempts: 3
        window: 60s
    networks:
      dev-env-swarm:
        aliases:
          - plantuml.testenv.top
    dns: 
      - 10.10.10.10
      - 8.8.8.8
  portainer-agent:
    image: portainer/agent
    external_links: 
      - nginx-main
    expose:
      - 9001
    deploy:
      mode: global
      endpoint_mode: dnsrr
      resources:
        limits:
          cpus: "0.20"
          memory: 1024M
        reservations:
          cpus: "0.05"
          memory: 256M
      restart_policy:
        condition: on-failure
        delay: 20s
        max_attempts: 3
        window: 60s
    environment:
      AGENT_CLUSTER_ADDR: tasks.portainer-agent
      AGENT_PORT: 9001
      LOG_LEVEL: debug
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      dev-env-swarm:
        aliases:
          - portainer-agent.testenv.top
  portainer:
    image: portainer/portainer
    command: -H tcp://tasks.portainer-agent:9001 --tlsskipverify
    depends_on:
      - portainer-agent
    external_links: 
      - nginx-main
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      mode: global
      endpoint_mode: dnsrr
      resources:
        limits:
          cpus: "0.20"
          memory: 2024M
        reservations:
          cpus: "0.05"
          memory: 512M
      restart_policy:
        condition: on-failure
        delay: 20s
        max_attempts: 3
        window: 60s
    volumes:
      - /env-vol/portainer/data:/data
    hostname: portainer.testenv.top
    networks:
      dev-env-swarm:
        aliases:
          - portainer.testenv.top
    dns: 
      - 10.10.10.10
      - 8.8.8.8 
  pgadmin4:
    image: dpage/pgadmin4:latest
    hostname: pgadmin.testenv.top
    external_links: 
      - nginx-main
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost"]
      interval: 1m30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      mode: global
      endpoint_mode: dnsrr
      resources:
        limits:
          cpus: "0.20"
          memory: 1024M
        reservations:
          cpus: "0.05"
          memory: 256M
      restart_policy:
        condition: on-failure
        delay: 20s
        max_attempts: 3
        window: 60s
    environment:
      PGADMIN_DEFAULT_EMAIL: email@example.com
      PGADMIN_DEFAULT_PASSWORD: PASWORD
    networks:
      dev-env-swarm:
        aliases:
          - pgadmin.testenv.top
    dns: 
      - 10.10.10.10
      - 8.8.8.8
    volumes:
      - /env-vol/pgadmin:/var/lib/pgadmin
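
In a Compose file of this size it is easy to end up with a repeated key under one service (a second `deploy:` or `external_links:` block, for example), and many YAML parsers silently keep only the last one. The sketch below is a rough check for that, not a YAML parser: `find_duplicate_keys` is a hypothetical helper that assumes space indentation and plain `key:` lines, and it does not look inside list items.

```shell
#!/bin/sh
# Hypothetical helper: print keys that repeat at the same indentation
# inside the same parent mapping. Exits non-zero if any are found.
find_duplicate_keys() {
    awk '
    /^ *[A-Za-z0-9_.-]+:( |$)/ {
        match($0, /^ */)
        indent = RLENGTH
        key = $1
        sub(/:.*/, "", key)
        # A line at this indent closes every deeper mapping,
        # so deeper indents start a fresh scope from here on.
        for (d = indent + 1; d <= maxdepth; d++) epoch[d] = epoch[d] + 1
        if (indent > maxdepth) maxdepth = indent
        id = indent SUBSEP (epoch[indent] + 0) SUBSEP key
        if (id in seen) {
            printf "line %d: duplicate key \"%s\" (first seen on line %d)\n", NR, key, seen[id]
            dup = 1
        } else {
            seen[id] = NR
        }
    }
    END { exit (dup + 0) }
    ' "$1"
}
```

Running `find_duplicate_keys docker-compose.yml` before `docker stack deploy` gives a quick sanity check. It is also worth knowing that `docker stack deploy` ignores some Compose options entirely, `external_links` and `depends_on` among them.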

1 answer
劫难
#2 · 2019-04-09 16:30

The socket problem was caused by a broken Python installation: I had built Python from source and installed libraries manually, and apparently ended up with incompatible versions. After I reinstalled Python from the repository, the problem did not appear again.

I have also concluded that using Docker Stack for a development environment is not a good decision; Docker Compose is better suited for that purpose. Docker Stack should be used for deployment configurations (such as deploying to Docker Swarm, Kubernetes, or another cloud environment).
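
As a rough illustration of that split, the sketch below prints the command for each case instead of running it. `deploy_cmd` is a hypothetical helper, and the file and stack names are taken from the question. It assumes the same Compose file is used for both, which may not hold as-is: outside swarm mode the `deploy:` section is ignored, and the overlay network from the question would not be available.

```shell
#!/bin/sh
# Hypothetical helper: print the command that fits the target.
#   dev   -> plain Docker Compose on a single host
#   swarm -> Docker Stack on a swarm
deploy_cmd() {
    case "$1" in
        dev)
            echo "docker-compose -f docker-compose.yml up -d"
            ;;
        swarm)
            echo "docker stack deploy -c docker-compose.yml dev-env-stc"
            ;;
        *)
            echo "usage: deploy_cmd dev|swarm" >&2
            return 2
            ;;
    esac
}
```

For example, `deploy_cmd dev` prints the Compose command; to execute it, pass the output to the shell with `sh -c "$(deploy_cmd dev)"`.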
