Hi All,
We are trying to use NGINX as a load-balancing solution for our production servers. Our infrastructure contains a few application servers using SOAP/XML over HTTP, and NGINX is supposed to balance requests between them. Please see our config:
user nginx;
worker_processes 10;
error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
    worker_connections 2048;
}

http {
    upstream CS {
        server 70.70.0.31:80 max_fails=1 fail_timeout=1s;
        server 70.70.0.41:80 max_fails=1 fail_timeout=1s;
    }

    server {
        listen 54321;

        location / {
            proxy_pass http://CS;
            proxy_read_timeout 2s;
            # proxy_connect_timeout 200ms;
        }
    }
}
NGINX does the job well until one of the backend servers fails to answer in time, at which point it sends the request to the other server. Up to this point everything works as expected. The failed server should then return from inoperative to operative mode after 1 second, since I've set fail_timeout to 1 second, but it doesn't. (This might be my mistake: perhaps it does recover but fails again right away.) I end up at the point where NGINX says that there are no operative upstream servers available, because at some point the load on the servers is so high that neither of them is able to answer in time.
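To make the settings in question concrete: as far as I understand the documentation, fail_timeout is both the window in which max_fails failed attempts must occur and the time the server is then considered unavailable. With max_fails=1 and a 2s proxy_read_timeout, a single slow response would be enough to take a server out for 1 second. Below is a sketch of a more tolerant upstream block; the values 3 and 10s are purely illustrative assumptions, not tested settings.

```nginx
# Goes inside the http { } block, replacing the existing upstream CS.
upstream CS {
    # Illustrative values (assumptions, not tested): tolerate 3 failed
    # attempts within 10s before marking a server unavailable for 10s.
    server 70.70.0.31:80 max_fails=3 fail_timeout=10s;
    server 70.70.0.41:80 max_fails=3 fail_timeout=10s;
}
```

Whether this helps depends on how long the backends actually stay overloaded; it only changes how quickly a server is marked as failed.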
We are working on solving our service issue. However, I think it is crucial for a load balancer to be able to move a failed upstream server back to operative mode. Maybe I'm doing something wrong. Which log level will show the recovery?
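For context on the logging question: our error_log line has no level, so it uses the default level (error). A sketch of the same directive with an explicit, more verbose level follows. That "info" is enough to show upstream state changes is an assumption on my part, not something I have confirmed.

```nginx
# Same log file as in our config, with an explicit level.
# "info" is an assumption about where the relevant messages appear;
# "debug" is more verbose still, but needs nginx built with --with-debug.
error_log /var/log/nginx/error.log info;
```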
I would appreciate any help!
Regards,
Seva Feldman