I'm very new to docker and productionizing nodejs web apps. However, after some reading I've determined that a good setup would be:
- nginx container serving static files, ssl, proxying nodejs requests
- nodejs container
- postgresql container
However, I'm now trying to tackle scalability. Seeing as you can define multiple proxy_pass statements in an nginx config, could you not spin up a duplicate nodejs container (exactly the same but exposing a different port) and effectively "load balance" your web app? Is it a good architecture?
Also, how would this affect database writes? Are there race conditions I need to specifically architect for? Any guidance would be appreciated.
Yes, it's possible to use Nginx to load balance requests between different instances of your Node.js services. Each Node.js instance could be running in a different Docker container. Scaling out your configuration is then as easy as starting up another Docker container and ensuring it's registered in the Nginx config. (Depending on how often you have to update the Nginx config, a variety of tools/frameworks are available to do this last step automatically.)
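To make "exactly the same but exposing a different port" concrete, here is a minimal sketch of one such instance. It assumes the port is passed in through a `PORT` environment variable; that variable, the `createApp` helper and the instance naming are illustrative choices, not something your setup requires.

```javascript
// server.js: one instance of the app. Run several copies with different PORT values.
const http = require('node:http');

// Build the server without starting it, so the port stays a deployment detail.
function createApp(instanceName) {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    // Report which instance answered; handy when checking the load balancer.
    res.end(`hello from ${instanceName}\n`);
  });
}

// Start listening only when a port is supplied, e.g. `PORT=8081 node server.js`.
// PORT is an assumed environment variable, not something Nginx or Docker sets for you.
if (process.env.PORT) {
  const port = Number(process.env.PORT);
  createApp(`instance-${port}`).listen(port, () => {
    console.log(`listening on ${port}`);
  });
}
```

Starting it three times with `PORT=8081`, `PORT=8082` and `PORT=8083` gives three identical instances that differ only in the port they listen on.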
For example, below is an Nginx configuration to load balance incoming requests across different Node.js services. In our case, we have multiple Node.js services running on the same machine, but it's perfectly possible to use Docker containers instead.
File /etc/nginx/sites-enabled/apps:
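upstream apps-cluster {
    # Send each request to the instance with the fewest active connections.
    least_conn;
    server localhost:8081;
    server localhost:8082;
    server localhost:8083;
    # Keep up to 512 idle connections to the instances open per worker process.
    keepalive 512;
}
server {
    listen 8080;
    location "/" {
        # Retry on the next instance if one fails or returns a 5xx error.
        proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
        # Both settings are needed for keepalive connections to the upstream.
        proxy_set_header Connection "";
        proxy_http_version 1.1;
        proxy_pass http://apps-cluster;
    }
    access_log off;
}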
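If instances come and go often, the upstream block can be generated rather than edited by hand. The following is only a rough sketch of that idea: `renderUpstream` is a hypothetical helper, and the fourth backend on port 8084 is made up for the example.

```javascript
// Render an Nginx upstream block from a list of "host:port" backends.
function renderUpstream(name, backends, keepalive = 512) {
  if (backends.length === 0) {
    throw new Error('at least one backend is required');
  }
  const servers = backends.map((addr) => `    server ${addr};`);
  return [
    `upstream ${name} {`,
    '    least_conn;',
    ...servers,
    `    keepalive ${keepalive};`,
    '}',
  ].join('\n');
}

// Example: the three instances from the config above plus a newly started fourth one.
const block = renderUpstream('apps-cluster', [
  'localhost:8081',
  'localhost:8082',
  'localhost:8083',
  'localhost:8084',
]);
console.log(block);
```

After writing the generated block to the config file, `nginx -s reload` makes Nginx pick up the new list of instances.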
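Despite running multiple instances of your Node.js services, your database should not be negatively affected. PostgreSQL handles many open connections without trouble, and each statement or transaction is applied atomically, so concurrent writes from different instances do not corrupt each other. Multi-step logic (read a value, then write based on it) still needs a transaction, a row lock or a constraint to be safe, but that is already true for a single Node.js instance, since one instance also serves many requests concurrently. From a developer's point of view, the code for running 1 Node.js service is the same as for running x Node.js services.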
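As an illustration of a write that stays correct no matter how many instances run, here is a sketch of a read-then-update step wrapped in a transaction with a row lock. It assumes a `client` object exposing `query(text, params)` that resolves to `{ rows }`, as node-postgres clients do; the `items` table, its integer `stock` column and the `reserveStock` function are hypothetical.

```javascript
// Reserve `qty` units of an item without overselling, even under concurrent requests.
// `client` is assumed to behave like a node-postgres client: query(text, params) -> { rows }.
async function reserveStock(client, itemId, qty) {
  await client.query('BEGIN');
  try {
    // FOR UPDATE locks the row, so a concurrent reservation waits until we commit.
    const { rows } = await client.query('SELECT stock FROM items WHERE id = $1 FOR UPDATE', [itemId]);
    if (rows.length === 0 || rows[0].stock < qty) {
      throw new Error('not enough stock');
    }
    await client.query('UPDATE items SET stock = stock - $1 WHERE id = $2', [qty, itemId]);
    await client.query('COMMIT');
  } catch (err) {
    // Undo any partial work and release the lock before reporting the failure.
    await client.query('ROLLBACK');
    throw err;
  }
}
```

The same function works unchanged whether one container or ten are calling it, because the locking happens in the database rather than in the Node.js process.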
You can set a "Function Level Concurrent Execution Limit" on the Lambda function you are using to connect to RDS. This caps the number of concurrent RDS connections. The requests from Dynamo will be throttled, though.
Another option is to have this Lambda stream the records into Kinesis or SQS, and have another worker Lambda read them from there and pump the data into RDS. This is scalable and reliable, with no throttling.