Welcome! Log In Create A New Profile

Advanced

nginx appears to hang.

Posted by scottsamms 
nginx appears to hang.
May 24, 2010 04:16AM
Hello All.

I was hoping someone would be able to provide some insight into a difficult problem we are seeing.

We use nginx as a reverse proxy. Every day at the same time the nginx server stops responding.

We have tried a variety of steps to solve this problem with no resolve.


Before I get into the symptoms, here is some relevant version information :

Here is the output of a nginx -V on our two servers

nginx version: nginx/0.8.32
built by gcc 4.3.3 (Ubuntu 4.3.3-5ubuntu4)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --with-http_realip_module --with-http_addition_module --with-http_stub_status_module --with-http_ssl_module --with-http_stub_status_module --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/log/nginx/nginx.pid --with-syslog

nginx version: nginx/0.7.65
built by gcc 4.3.3 (Ubuntu 4.3.3-5ubuntu4)
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --with-http_realip_module --with-http_addition_module --with-http_stub_status_module --with-http_ssl_module --with-http_stub_status_module --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/log/nginx/nginx.pid


Here is the linux kernel information : 2.6.28-11-server #42-Ubuntu SMP


There are two nginx servers connected directly to our router.

Each of these has its own real server IP address.

As an example lets say they are 100.100.10.20 and 100.100.10.21


We also have VIP addresses for websites running through the nginx proxies.

As example lets say we have 5 sites (3 VIP's), 100.100.20.1 , .2, .3, .4 ,.5 .


The problem seems to follow a particular website, but causes whatever other sites running through

that nginx server at the time to fail, as nginx is no longer responding.


tcptraceroutes to both the real server IP and the VIP's show that the port is open.

If we ssh into the server and attempt to connect locally via real IP or VIP, we get the same

result, port open but no repsonse to an http get or anything else. Normally nginx would serve a

static page with the line "<hostname> is up>" when we attempt to connect to the real server IP,

and again this is on the lcoal nginx server that failed, so it is not using the network to connect.

This is not a cronjob or other timed event on our end. we have changed the cron job time, and also changed the date/time on the server with no change.

In trying to solve this issue we have contact numerous people and we have collected data via

running "strace" with nginx. Checking /proc/net/ values at the time of failure, and running tshark on

that server to collect a traffic dump, which shows alot of tcp retransmissions.

Additionally we have tried other versions of nginx, and we have tried checking sysctl values as well

as making changes to those values as per below ,

(first line we check the current value, middle line we change the value, last line we verify the value has changed.) :


-----
cat /proc/sys/net/ipv4/tcp_rfc1337
0

echo 1 > /proc/sys/net/ipv4/tcp_rfc1337


cat /proc/sys/net/ipv4/tcp_rfc1337
1
----
cat /proc/sys/net/ipv4/tcp_max_syn_backlog
1024

echo 5120 > /proc/sys/net/ipv4/tcp_max_syn_backlog

cat /proc/sys/net/ipv4/tcp_max_syn_backlog
5120
----
cat /proc/sys/net/ipv4/tcp_max_tw_buckets
180000

echo 280000 > /proc/sys/net/ipv4/tcp_max_tw_buckets

cat /proc/sys/net/ipv4/tcp_max_tw_buckets
280000
----
cat /proc/sys/net/ipv4/tcp_frto
2

echo 0 > /proc/sys/net/ipv4/tcp_frto

cat /proc/sys/net/ipv4/tcp_frto
0

-----------------------------------------------

As my colleague suggested, here is a printout of sysctl -a , the sysctl current values.
I have excluded ipv6 , kernel , dev, and fs from this list to keep it smaller :

net.ipv4.route.gc_thresh = 32768
net.ipv4.route.max_size = 524288
net.ipv4.route.gc_min_interval = 0
net.ipv4.route.gc_min_interval_ms = 500
net.ipv4.route.gc_timeout = 300
net.ipv4.route.gc_interval = 60
net.ipv4.route.redirect_load = 2
net.ipv4.route.redirect_number = 9
net.ipv4.route.redirect_silence = 2048
net.ipv4.route.error_cost = 100
net.ipv4.route.error_burst = 500
net.ipv4.route.gc_elasticity = 8
net.ipv4.route.mtu_expires = 600
net.ipv4.route.min_pmtu = 552
net.ipv4.route.min_adv_mss = 256
net.ipv4.route.secret_interval = 600
error: permission denied on key 'net.ipv4.route.flush'
net.ipv4.neigh.default.mcast_solicit = 3
net.ipv4.neigh.default.ucast_solicit = 3
net.ipv4.neigh.default.app_solicit = 0
net.ipv4.neigh.default.retrans_time = 100
net.ipv4.neigh.default.base_reachable_time = 30
net.ipv4.neigh.default.delay_first_probe_time = 5
net.ipv4.neigh.default.gc_stale_time = 60
net.ipv4.neigh.default.unres_qlen = 3
net.ipv4.neigh.default.proxy_qlen = 64
net.ipv4.neigh.default.anycast_delay = 100
net.ipv4.neigh.default.proxy_delay = 80
net.ipv4.neigh.default.locktime = 100
net.ipv4.neigh.default.retrans_time_ms = 1000
net.ipv4.neigh.default.base_reachable_time_ms = 30000
net.ipv4.neigh.default.gc_interval = 30
net.ipv4.neigh.default.gc_thresh1 = 128
net.ipv4.neigh.default.gc_thresh2 = 512
net.ipv4.neigh.default.gc_thresh3 = 1024
net.ipv4.neigh.lo.mcast_solicit = 3
net.ipv4.neigh.lo.ucast_solicit = 3
net.ipv4.neigh.lo.app_solicit = 0
net.ipv4.neigh.lo.retrans_time = 100
net.ipv4.neigh.lo.base_reachable_time = 30
net.ipv4.neigh.lo.delay_first_probe_time = 5
net.ipv4.neigh.lo.gc_stale_time = 60
net.ipv4.neigh.lo.unres_qlen = 3
net.ipv4.neigh.lo.proxy_qlen = 64
net.ipv4.neigh.lo.anycast_delay = 100
net.ipv4.neigh.lo.proxy_delay = 80
net.ipv4.neigh.lo.locktime = 100
net.ipv4.neigh.lo.retrans_time_ms = 1000
net.ipv4.neigh.lo.base_reachable_time_ms = 30000
net.ipv4.neigh.eth0.mcast_solicit = 3
net.ipv4.neigh.eth0.ucast_solicit = 3
net.ipv4.neigh.eth0.app_solicit = 0
net.ipv4.neigh.eth0.retrans_time = 100
net.ipv4.neigh.eth0.base_reachable_time = 30
net.ipv4.neigh.eth0.delay_first_probe_time = 5
net.ipv4.neigh.eth0.gc_stale_time = 60
net.ipv4.neigh.eth0.unres_qlen = 3
net.ipv4.neigh.eth0.proxy_qlen = 64
net.ipv4.neigh.eth0.anycast_delay = 100
net.ipv4.neigh.eth0.proxy_delay = 80
net.ipv4.neigh.eth0.locktime = 100
net.ipv4.neigh.eth0.retrans_time_ms = 1000
net.ipv4.neigh.eth0.base_reachable_time_ms = 30000
net.ipv4.neigh.eth1.mcast_solicit = 3
net.ipv4.neigh.eth1.ucast_solicit = 3
net.ipv4.neigh.eth1.app_solicit = 0
net.ipv4.neigh.eth1.retrans_time = 100
net.ipv4.neigh.eth1.base_reachable_time = 30
net.ipv4.neigh.eth1.delay_first_probe_time = 5
net.ipv4.neigh.eth1.gc_stale_time = 60
net.ipv4.neigh.eth1.unres_qlen = 3
net.ipv4.neigh.eth1.proxy_qlen = 64
net.ipv4.neigh.eth1.anycast_delay = 100
net.ipv4.neigh.eth1.proxy_delay = 80
net.ipv4.neigh.eth1.locktime = 100
net.ipv4.neigh.eth1.retrans_time_ms = 1000
net.ipv4.neigh.eth1.base_reachable_time_ms = 30000
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_retrans_collapse = 1
net.ipv4.ip_default_ttl = 64
net.ipv4.ip_no_pmtu_disc = 0
net.ipv4.ip_nonlocal_bind = 0
net.ipv4.tcp_syn_retries = 5
net.ipv4.tcp_synack_retries = 5
net.ipv4.tcp_max_orphans = 32768
net.ipv4.tcp_max_tw_buckets = 280000
net.ipv4.ip_dynaddr = 0
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 15
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_abort_on_overflow = 0
net.ipv4.tcp_stdurg = 0
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_max_syn_backlog = 5120
net.ipv4.ip_local_port_range = 5000 65000
net.ipv4.igmp_max_memberships = 20
net.ipv4.igmp_max_msf = 10
net.ipv4.inet_peer_threshold = 65664
net.ipv4.inet_peer_minttl = 120
net.ipv4.inet_peer_maxttl = 600
net.ipv4.inet_peer_gc_mintime = 10
net.ipv4.inet_peer_gc_maxtime = 120
net.ipv4.tcp_orphan_retries = 0
net.ipv4.tcp_fack = 1
net.ipv4.tcp_reordering = 3
net.ipv4.tcp_ecn = 0
net.ipv4.tcp_dsack = 1
net.ipv4.tcp_mem = 77376 103168 154752
net.ipv4.tcp_wmem = 4096 16384 3301376
net.ipv4.tcp_rmem = 4096 87380 3301376
net.ipv4.tcp_app_win = 31
net.ipv4.tcp_adv_win_scale = 2
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_frto = 0
net.ipv4.tcp_frto_response = 0
net.ipv4.tcp_low_latency = 0
net.ipv4.tcp_no_metrics_save = 0
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_tso_win_divisor = 3
net.ipv4.tcp_congestion_control = reno
net.ipv4.tcp_abc = 0
net.ipv4.tcp_mtu_probing = 0
net.ipv4.tcp_base_mss = 512
net.ipv4.tcp_workaround_signed_windows = 0
net.ipv4.tcp_slow_start_after_idle = 1
net.ipv4.cipso_cache_enable = 1
net.ipv4.cipso_cache_bucket_size = 10
net.ipv4.cipso_rbm_optfmt = 0
net.ipv4.cipso_rbm_strictvalid = 1
net.ipv4.tcp_available_congestion_control = reno cubic
net.ipv4.tcp_allowed_congestion_control = reno cubic
net.ipv4.tcp_max_ssthresh = 0
net.ipv4.udp_mem = 388416 517888 776832
net.ipv4.udp_rmem_min = 4096
net.ipv4.udp_wmem_min = 4096
net.ipv4.conf.all.forwarding = 0
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 1
net.ipv4.conf.all.shared_media = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.proxy_arp = 0
net.ipv4.conf.all.medium_id = 0
net.ipv4.conf.all.bootp_relay = 0
net.ipv4.conf.all.log_martians = 0
net.ipv4.conf.all.tag = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.all.arp_announce = 0
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.all.arp_accept = 0
net.ipv4.conf.all.disable_xfrm = 0
net.ipv4.conf.all.disable_policy = 0
net.ipv4.conf.all.force_igmp_version = 0
net.ipv4.conf.all.promote_secondaries = 0
net.ipv4.conf.default.forwarding = 0
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.conf.default.accept_redirects = 1
net.ipv4.conf.default.secure_redirects = 1
net.ipv4.conf.default.shared_media = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.send_redirects = 1
net.ipv4.conf.default.accept_source_route = 1
net.ipv4.conf.default.proxy_arp = 0
net.ipv4.conf.default.medium_id = 0
net.ipv4.conf.default.bootp_relay = 0
net.ipv4.conf.default.log_martians = 0
net.ipv4.conf.default.tag = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.arp_announce = 0
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.default.arp_accept = 0
net.ipv4.conf.default.disable_xfrm = 0
net.ipv4.conf.default.disable_policy = 0
net.ipv4.conf.default.force_igmp_version = 0
net.ipv4.conf.default.promote_secondaries = 0
net.ipv4.conf.lo.forwarding = 0
net.ipv4.conf.lo.mc_forwarding = 0
net.ipv4.conf.lo.accept_redirects = 1
net.ipv4.conf.lo.secure_redirects = 1
net.ipv4.conf.lo.shared_media = 1
net.ipv4.conf.lo.rp_filter = 0
net.ipv4.conf.lo.send_redirects = 1
net.ipv4.conf.lo.accept_source_route = 1
net.ipv4.conf.lo.proxy_arp = 0
net.ipv4.conf.lo.medium_id = 0
net.ipv4.conf.lo.bootp_relay = 0
net.ipv4.conf.lo.log_martians = 0
net.ipv4.conf.lo.tag = 0
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.lo.arp_announce = 0
net.ipv4.conf.lo.arp_ignore = 0
net.ipv4.conf.lo.arp_accept = 0
net.ipv4.conf.lo.disable_xfrm = 1
net.ipv4.conf.lo.disable_policy = 1
net.ipv4.conf.lo.force_igmp_version = 0
net.ipv4.conf.lo.promote_secondaries = 0
net.ipv4.conf.eth0.forwarding = 0
net.ipv4.conf.eth0.mc_forwarding = 0
net.ipv4.conf.eth0.accept_redirects = 1
net.ipv4.conf.eth0.secure_redirects = 1
net.ipv4.conf.eth0.shared_media = 1
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth0.send_redirects = 1
net.ipv4.conf.eth0.accept_source_route = 1
net.ipv4.conf.eth0.proxy_arp = 0
net.ipv4.conf.eth0.medium_id = 0
net.ipv4.conf.eth0.bootp_relay = 0
net.ipv4.conf.eth0.log_martians = 0
net.ipv4.conf.eth0.tag = 0
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth0.arp_announce = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.eth0.arp_accept = 0
net.ipv4.conf.eth0.disable_xfrm = 0
net.ipv4.conf.eth0.disable_policy = 0
net.ipv4.conf.eth0.force_igmp_version = 0
net.ipv4.conf.eth0.promote_secondaries = 0
net.ipv4.conf.eth1.forwarding = 0
net.ipv4.conf.eth1.mc_forwarding = 0
net.ipv4.conf.eth1.accept_redirects = 1
net.ipv4.conf.eth1.secure_redirects = 1
net.ipv4.conf.eth1.shared_media = 1
net.ipv4.conf.eth1.rp_filter = 1
net.ipv4.conf.eth1.send_redirects = 1
net.ipv4.conf.eth1.accept_source_route = 1
net.ipv4.conf.eth1.proxy_arp = 0
net.ipv4.conf.eth1.medium_id = 0
net.ipv4.conf.eth1.bootp_relay = 0
net.ipv4.conf.eth1.log_martians = 0
net.ipv4.conf.eth1.tag = 0
net.ipv4.conf.eth1.arp_filter = 0
net.ipv4.conf.eth1.arp_announce = 0
net.ipv4.conf.eth1.arp_ignore = 0
net.ipv4.conf.eth1.arp_accept = 0
net.ipv4.conf.eth1.disable_xfrm = 0
net.ipv4.conf.eth1.disable_policy = 0
net.ipv4.conf.eth1.force_igmp_version = 0
net.ipv4.conf.eth1.promote_secondaries = 0
net.ipv4.ip_forward = 0
net.ipv4.ipfrag_high_thresh = 262144
net.ipv4.ipfrag_low_thresh = 196608
net.ipv4.ipfrag_time = 30
net.ipv4.icmp_echo_ignore_all = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.icmp_errors_use_inbound_ifaddr = 0
net.ipv4.icmp_ratelimit = 1000
net.ipv4.icmp_ratemask = 6168
net.ipv4.ipfrag_secret_interval = 600
net.ipv4.ipfrag_max_dist = 64
net.token-ring.rif_timeout = 60000

net.unix.max_dgram_qlen = 10
net.core.wmem_max = 131071
net.core.rmem_max = 131071
net.core.wmem_default = 111616
net.core.rmem_default = 111616
net.core.dev_weight = 64
net.core.netdev_max_backlog = 1000
net.core.message_cost = 5
net.core.message_burst = 10
net.core.optmem_max = 10240
net.core.xfrm_aevent_etime = 10
net.core.xfrm_aevent_rseqth = 2
net.core.xfrm_larval_drop = 1
net.core.xfrm_acq_expires = 30
net.core.netdev_budget = 300
net.core.warnings = 1
net.core.somaxconn = 128
crypto.fips_enabled = 0

-----------------------------------------

We are at a loss as to what might be causing the issue.

If anyone know what might be going on here or steps we have not already taken to try to unwrap

this mystery, please help.
Sorry, only registered users may post in this forum.

Click here to login

Online Users

Guests: 146
Record Number of Users: 8 on April 13, 2023
Record Number of Guests: 421 on December 02, 2018
Powered by nginx      Powered by FreeBSD      PHP Powered      Powered by MariaDB      ipv6 ready