How to get secrets from a broken Docker swarm

Posted 2020-04-10 02:54

Question:

My swarm server is broken (a Linux system error), and sadly it was the only node in the swarm.

I read https://docs.docker.com/v17.09/engine/swarm/admin_guide/#back-up-the-swarm

So I backed up /var/lib/docker/swarm and restored it onto a newly set up Docker server.
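A minimal sketch of that kind of backup and restore, not necessarily the commands the poster ran. The function names and the archive path are hypothetical, and it assumes Docker is stopped on both hosts while the directory is copied, as the linked admin guide recommends.

```shell
#!/bin/sh
# Sketch only: archive a swarm state directory and unpack it elsewhere.
# Assumption: the Docker daemon is stopped before either function is called
# (for example with `systemctl stop docker` on a systemd host).

# backup_swarm_state SRC_DIR ARCHIVE
#   SRC_DIR  e.g. /var/lib/docker/swarm
#   ARCHIVE  hypothetical output path, e.g. /root/swarm-backup.tgz
backup_swarm_state() {
    src_dir="$1"
    archive="$2"
    [ -d "$src_dir" ] || { echo "no such directory: $src_dir" >&2; return 1; }
    # Store the directory under its own name so it can be unpacked into any parent.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# restore_swarm_state ARCHIVE PARENT_DIR
#   PARENT_DIR  e.g. /var/lib/docker on the new server
restore_swarm_state() {
    archive="$1"
    parent_dir="$2"
    mkdir -p "$parent_dir" && tar -xzf "$archive" -C "$parent_dir"
}
```

On a real host this would be called as `backup_swarm_state /var/lib/docker/swarm /root/swarm-backup.tgz` on the old machine and `restore_swarm_state /root/swarm-backup.tgz /var/lib/docker` on the new one, as root.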

The new Docker daemon works fine for everything that doesn't involve swarm, but swarm commands fail, for example:

$ docker service ls
Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.

I think I need to force the swarm manager to re-initialize:

docker swarm init --force-new-cluster

After that, every swarm-related command, such as

docker service ls

gets no response, and the Docker daemon hangs.

Then I tried to extract the data from the backup files, and I found https://medium.com/lucjuggery/raft-logs-on-swarm-mode-1351eff1e690, which seems useful. But I still can't recover the secrets.

I only get something like:

secrets: <
  secret_id: "6vtndjswxr4fe9kxjtmmtk6af"
  secret_name: "DATABASE_ADMIN_URL"
  file: <
    name: "_DATABASE_ADMIN_URL"
    uid: "0"
    gid: "0"
    mode: -r--r--r--
  >
>

which shows only the secret's ID, name, and file target, not the secret value itself.
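Even without the values, a dump like this can at least give an inventory of which secrets existed. A rough sketch, assuming the dump was saved to a text file and that every `secret_name` line is preceded by its `secret_id` line as in the excerpt above; `list_secret_refs` is a hypothetical helper name. It does not recover any secret data.

```shell
#!/bin/sh
# Sketch only: list "secret_id<TAB>secret_name" pairs found in a text dump.
# Assumption: fields look like  secret_id: "..."  and  secret_name: "..."
# with the id line appearing before the matching name line.
list_secret_refs() {
    dump_file="$1"
    awk -F'"' '
        /secret_id:/   { id = $2 }             # remember the most recent id
        /secret_name:/ { print id "\t" $2 }    # emit the pair when the name appears
    ' "$dump_file"
}
```

Piping the result through `sort -u` collapses duplicates when the same secret is referenced by several tasks.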

BTW: I'm not hacking the server. I just hope to recover the data instead of digging through all the configs of the bundled services.

Answer 1:

It took me a few hours today to figure out why the Docker daemon hangs after

docker swarm init --force-new-cluster

I believe there is one step missing from the official doc https://docs.docker.com/v17.09/engine/swarm/admin_guide/#restore-from-a-backup

I say that because, after I removed docker-state.json and then ran

docker swarm init --force-new-cluster --advertise-addr <the-server-ip>:2377

things worked as expected.
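Put together, the steps might look like the sketch below. It assumes docker-state.json sits at the top level of the restored /var/lib/docker/swarm directory and that Docker is stopped while the file is touched; neither detail is spelled out above. `remove_docker_state` is a hypothetical helper, and it renames the file rather than deleting it so the step can be undone.

```shell
#!/bin/sh
# Sketch only: move docker-state.json out of the way before re-initializing.
# Assumption: the file lives directly inside the swarm state directory.
remove_docker_state() {
    swarm_dir="${1:-/var/lib/docker/swarm}"
    state_file="$swarm_dir/docker-state.json"
    if [ -f "$state_file" ]; then
        # Keep a copy instead of deleting, in case it is needed again.
        mv "$state_file" "$state_file.bak"
    fi
}

# On the real host, as root (not executed by this sketch):
#   systemctl stop docker          # assumes a systemd host
#   remove_docker_state /var/lib/docker/swarm
#   systemctl start docker
#   docker swarm init --force-new-cluster --advertise-addr <the-server-ip>:2377
```

If the swarm comes back, `docker secret ls` should list the restored secrets, although Docker still offers no command that prints a secret's value directly.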