We use docker swarm with service discovery for Backend REST application. The services in swarm are configured with endpoint_mode: vip
and are running in global
mode. Nginx is proxy passed with service discovery aliases. When we update Backend services sometimes nginx throws 502 as service discovery may point to the updating service.
In such case, We wanted to retry the same endpoint again. How can we achieve this?
According to this we added upstream with the host's private IP and used proxy_next_upstream error timeout http_502;
but still the problem persists.
nginx.conf
upstream servers {
server 192.168.1.2:443; #private ip of host machine
server 192.168.1.2:443 backup;
}
server {
listen 443 ssl http2 default_server;
listen [::]:443 ssl http2 default_server;
proxy_next_upstream http_502;
location /endpoint1 {
proxy_pass http://docker.service1:8080/endpoint1;
}
location /endpoint2 {
proxy_pass http://docker.service2:8080/endpoint2;
}
location /endpoint3 {
proxy_pass http://docker.service3:8080/endpoint3;
}
}
Here if http://docker.service1:8080/endpoint1 throws 502 we want to hit http://docker.service1:8080/endpoint1 again.
Additional queries:
- Is there any way in docker swarm to make it stop pointing to updating service in service discovery till that service is fully up?
- Is upstream necessary here since we directly use docker service discovery?
I suggest you add a health check directly at container level (here)
By doing so, docker pings periodically an endpoint you specified, if it's found unhealthy it will 1) stop routing traffic to it 2) kill the container and restart a new one. Therefore you upstream will be resolved to one of the healthy containers. No need to retry.
As for your additional questions, the first one, docker won't start routing til it's healthy. The second, nginx is still useful to distribute traffic according to endpoint url. But personally nginx + swarm vip mode is not a great choice because swarm load balancer is poorly documented, it doesn't support sticky session and you can't have proxy level health check, I would use
traefik
instead, it has its own load balancer.