DNS Load Balancing keeps getting upstream errors

August 30, 2017 09:04PM
Hello!

I was excited to learn that nginx is one of the few load balancers that support DNS. In my EC2 setup, nginx runs on an m4.large instance, and my DNS test load comes from a t2.micro instance. The two nameservers being load balanced each run on a t2.medium.

Here is my config:
$ cat /etc/nginx/nginx.conf
# For more information on configuration, see:
# * Official English Documentation: http://nginx.org/en/docs/
# * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
worker_rlimit_nofile 65536;

# Load dynamic modules. See /usr/share/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 4096;
}

http {
    server {
        listen 80 default_server;
        location / {
            stub_status on;
            access_log off;
        }
    }
}

stream {
    upstream dns_servers {
        server 10.67.32.10:53 max_fails=2000 fail_timeout=30;
        server 10.67.16.10:53 max_fails=2000 fail_timeout=30;
    }

    server {
        listen 53 udp;
        proxy_pass dns_servers;
        error_log /var/log/nginx/dns.log warn;
        proxy_responses 1;
        proxy_timeout 1s;
    }
}

For the test load, I use dnsperf as follows (on the other instance):
dnsperf -s <nginx_host_ip> -d query.txt -l 60 -c 100 -Q 10000

(This simulates 100 clients collectively making 10k requests/second to the nginx load balancer for 60 seconds.)
query.txt contains just a single CNAME managed in Route53, so the test basically asks to resolve this one CNAME over and over.

During the tests, nginx starts throttling the upstream servers and logs messages such as these:
2017/08/31 00:45:46 [warn] 31728#0: *605752 upstream server temporarily disabled while proxying connection, udp client: 10.67.15.238, server: 0.0.0.0:53, upstream: "10.67.16.10:53", bytes from/to client:43/0, bytes from/to upstream:0/43
2017/08/31 00:45:46 [warn] 31728#0: *605774 upstream server temporarily disabled while proxying connection, udp client: 10.67.15.238, server: 0.0.0.0:53, upstream: "10.67.16.10:53", bytes from/to client:43/0, bytes from/to upstream:0/43
2017/08/31 00:45:46 [warn] 31728#0: *605786 upstream server temporarily disabled while proxying connection, udp client: 10.67.15.238, server: 0.0.0.0:53, upstream: "10.67.16.10:53", bytes from/to client:43/0, bytes from/to upstream:0/43
2017/08/31 00:45:46 [error] 31728#0: *605805 no live upstreams while connecting to upstream, udp client: 10.67.15.238, server: 0.0.0.0:53, upstream: "dns_servers", bytes from/to client:43/0, bytes from/to upstream:0/0
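
To see which upstream is affected and how often, messages like the ones above could be tallied per upstream. This is only a rough Python sketch: the regular expression is an assumption based purely on the few log lines shown here, and `tally` is a made-up helper name, not part of nginx.

```python
import re
from collections import Counter

# Assumed pattern, derived only from the sample log lines in this post:
#   [level] pid#tid: *conn <message> while ..., upstream: "<name>", ...
LINE_RE = re.compile(
    r'\[(\w+)\] \S+ \*\d+ (.+?) while .*upstream: "([^"]+)"'
)

def tally(lines):
    """Count log messages grouped by (level, message, upstream)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            counts[m.groups()] += 1
    return counts
```

It could be run over the stream error log from the config above, for example `tally(open("/var/log/nginx/dns.log"))`, to check whether only one of the two nameservers is being disabled.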

dnsperf prints lots of request timeouts (the limit is 5 seconds), and the overall performance is bad:
Queries sent: 94790
Queries completed: 94450 (99.64%)
Queries lost: 340 (0.36%)

Response codes: NOERROR 94450 (100.00%)
Average packet size: request 43, response 106
Run time (s): 60.997054
Queries per second: 1548.435438

Average Latency (s): 0.043772 (min 0.000493, max 1.011284)
Latency StdDev (s): 0.202529

As you can see, the throughput is a mere 1.5k queries/second instead of the desired 10k/second.
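
For comparing runs, the numbers can be pulled out of the dnsperf summary and set against the target rate. This is a small Python sketch: the field names are taken from the output pasted above, while `parse_dnsperf_summary` and `throughput_report` are hypothetical helper names, not part of dnsperf.

```python
import re

# Field labels as they appear in the dnsperf summary shown above.
PATTERNS = {
    "sent": r"Queries sent:\s+(\d+)",
    "completed": r"Queries completed:\s+(\d+)",
    "lost": r"Queries lost:\s+(\d+)",
    "run_time": r"Run time \(s\):\s+([\d.]+)",
}

def parse_dnsperf_summary(text):
    """Extract the numeric fields listed in PATTERNS from summary text."""
    summary = {}
    for key, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            summary[key] = float(match.group(1))
    return summary

def throughput_report(summary, target_qps):
    """Compute achieved queries/second, share of the target, and loss rate."""
    qps = summary["completed"] / summary["run_time"]
    return {
        "qps": qps,
        "fraction_of_target": qps / target_qps,
        "loss_rate": summary["lost"] / summary["sent"],
    }
```

With the figures from the run through nginx (94450 completed in 60.997054 s, target 10000), this gives roughly 1548 queries/second, or about 15% of the target.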

I've verified that each nameserver itself can handle the traffic just fine (running the test against the nameserver directly from the same test instance):
dnsperf -s 10.67.16.10 -d query.txt -l 60 -c 100 -Q 10000
[...]
Queries sent: 599999
Queries completed: 599581 (99.93%)
Queries lost: 418 (0.07%)

Response codes: NOERROR 599581 (100.00%)
Average packet size: request 43, response 106
Run time (s): 60.000539
Queries per second: 9992.926897

Average Latency (s): 0.000794 (min 0.000645, max 0.026699)
Latency StdDev (s): 0.000750

From what I can tell, nginx seems to be throttling the nameservers because of perceived failures in getting responses from them. How can I troubleshoot this further?

Also, has anyone tried using nginx for DNS load balancing in production? I'd appreciate learning about your setup as well. Is there anything special to do to handle the possible TCP traffic when a response is large?

Thanks for reading! I greatly appreciate any reply. :")

Regards,
mangysushi
Subject Author Posted

DNS Load Balancing keeps getting upstream errors

mangysushi August 30, 2017 09:04PM

AW: DNS Load Balancing keeps getting upstream errors

Lukas Tribus August 31, 2017 03:42AM


