My swarm server is broken (a Linux system error), and sadly it was a single-node swarm.
I read https://docs.docker.com/v17.09/engine/swarm/admin_guide/#back-up-the-swarm
So I tried to back up /var/lib/docker/swarm
and restore it on a freshly set up Docker server.
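The exact commands used for the backup and restore are not shown here. As a rough sketch of what copying the swarm directory could look like, assuming a tar archive is used and the Docker daemon is stopped on both hosts while the files are touched (the function names and archive path are hypothetical, not from the docs):

```shell
#!/bin/sh
# Sketch only: function names and paths are illustrative assumptions.

# Archive a swarm state directory (normally /var/lib/docker/swarm).
# Run this only while the Docker daemon is stopped.
backup_swarm_dir() {
    src_dir="$1"    # e.g. /var/lib/docker/swarm
    archive="$2"    # e.g. /root/swarm-backup.tar.gz (hypothetical path)
    tar -czf "$archive" -C "$src_dir" .
}

# Unpack the archive into the swarm state directory on the new host.
# The target directory should be empty and the daemon stopped.
restore_swarm_dir() {
    archive="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir"
    tar -xzf "$archive" -C "$dest_dir"
}
```

On a real host the calls would be something like `backup_swarm_dir /var/lib/docker/swarm /root/swarm-backup.tar.gz` on the old machine and `restore_swarm_dir /root/swarm-backup.tar.gz /var/lib/docker/swarm` on the new one, run as root.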
The new Docker daemon works fine for everything that doesn't involve swarm, but the swarm features don't work, for example:
$ docker service ls
Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.
I think I need to force a re-init of the swarm manager:
docker swarm init --force-new-cluster
After that, every swarm-related command, such as
docker service ls
gets no response, and the Docker daemon hangs.
Then I tried to extract the data from the backup files, and I found https://medium.com/lucjuggery/raft-logs-on-swarm-mode-1351eff1e690 which seems useful. But I still can't recover the secrets.
I only get something like:
secrets: <
secret_id: "6vtndjswxr4fe9kxjtmmtk6af"
secret_name: "DATABASE_ADMIN_URL"
file: <
name: "_DATABASE_ADMIN_URL"
uid: "0"
gid: "0"
mode: -r--r--r--
>
>
which doesn't include any useful data.
BTW: I'm not hacking the server. I just hope to recover the data instead of having to go through all the configs for the bundled services.
It took me a few hours today to figure out why the Docker daemon hangs afterwards.
I believe there is one step missing from the official doc https://docs.docker.com/v17.09/engine/swarm/admin_guide/#restore-from-a-backup
After I removed docker-state.json and retried, things worked as expected.
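A minimal sketch of that extra step, assuming docker-state.json sits directly under the restored /var/lib/docker/swarm directory (the location and the .bak suffix are assumptions). It moves the file aside instead of deleting it, so it can be put back if this turns out not to help:

```shell
#!/bin/sh
# Sketch only: SWARM_DIR default and the .bak suffix are assumptions.
SWARM_DIR="${SWARM_DIR:-/var/lib/docker/swarm}"

# Move docker-state.json out of the way rather than deleting it outright.
set_aside_docker_state() {
    state_file="$SWARM_DIR/docker-state.json"
    if [ -f "$state_file" ]; then
        mv "$state_file" "$state_file.bak"
        echo "moved $state_file to $state_file.bak"
    else
        echo "no docker-state.json under $SWARM_DIR, nothing to do"
    fi
}
```

Presumably the daemon should be stopped before calling `set_aside_docker_state`, then started again before retrying `docker swarm init --force-new-cluster`. That ordering is a guess based on the restore procedure in the docs, not something confirmed above.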