AutoFS + Windows (Samba 1.0) down + NGINX = Freeze

Posted by newbeenginx 
February 12, 2020 10:08AM
We have autofs installed on CentOS Linux release 7.5.1804 (Core). The configuration is as follows (comments omitted):

[ autofs ]
timeout = 500
browse_mode = no
mount_nfs_default_protocol = 4
[ amd ]
dismount_interval = 300

/misc /etc/auto.misc
/net -hosts

/mnt/fs1 /etc/auto.conf.d/auto.fs1

* -fstype=cifs,echo_interval=15,cache=none,ro,noserverino,user=nginx,pass=mypassword ://

/mnt/fs2 /etc/auto.conf.d/auto.fs2

* -fstype=cifs,echo_interval=15,cache=none,ro,noserverino,user=nginx,pass=mypassword ://

On top of that we run NGINX:
root@localhost ~> nginx -V
nginx version: nginx/1.16.0
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)
built with OpenSSL 1.0.2k-fips 26 Jan 2017
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'

with the following (internal) site:

user nginx;
worker_processes auto;
error_log /mnt/nginx-cache/var/log/nginx/error.log warn;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/nginx/README.dynamic.
# include /usr/share/nginx/modules/*.conf;
load_module /etc/nginx/modules/ngx_http_cache_purge_module.so;

events {
worker_connections 1024;
}

http {
log_format main_ext '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'"$host" sn="$server_name" '
'rt=$request_time '
'ua="$upstream_addr" us="$upstream_status" '
'ut="$upstream_response_time" ul="$upstream_response_length" '
'cs=$upstream_cache_status kk=$scheme$proxy_host$uri$is_args$args';

access_log /mnt/nginx-cache/var/log/nginx/access.log main_ext buffer=64k flush=2s;
log_format upstream '$remote_addr - $upstream_addr - $request - $upstream_response_time - $request_time - $upstream_cache_status';
log_not_found off;

sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;

http2_max_requests 4294967295;

server_names_hash_max_size 256;
server_names_hash_bucket_size 256;

include /etc/nginx/mime.types;
default_type application/octet-stream;

proxy_cache_key $scheme$proxy_host$uri$is_args$args;

aio threads;
directio 512;

client_max_body_size 512m;

server {
# Note that it's listening on port 9000
listen 9000 default_server;
root /mnt;

server_name myorigin.mydomain.com;

aio on;
sendfile on;
directio 1;
location / {
try_files /fs1/myfs$uri /fs2/myfs$uri =404;
}
}
}

This normally works fine. However, as soon as one file server (e.g. FS2) is down and we repeatedly request a file that lives on it, NGINX hangs. The NGINX config also contains other servers, some of which only act as reverse proxies (proxy_pass), and even those hang. To me this indicates that all worker processes are blocked. Shouldn't enabling AIO and forcing directio for every file prevent this?
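
To tell a blocked mount apart from a problem inside NGINX itself, the paths can be probed from outside NGINX. Below is a minimal sketch, not something we run in production: the helper name probe_path and the 5-second timeout are made up for illustration, and the paths in the usage note are simply the ones from the try_files line above. It performs the stat in a child process so that a blocked filesystem call cannot stall the caller.

```python
import subprocess
import sys


def probe_path(path, timeout_s=5.0):
    """Return 'ok', 'error', or 'hung' for a path without blocking the caller.

    probe_path is a hypothetical helper; the timeout is an arbitrary choice.
    """
    # Run the stat in a child process: if the filesystem call blocks,
    # only the child is stuck, not this script.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: do not wait for the child afterwards, since it
        # may be stuck inside the kernel and never exit.
        proc.kill()
        return "hung"
    # A non-zero exit code means os.stat raised (missing path, permissions, ...).
    return "ok" if rc == 0 else "error"
```

Calling probe_path("/mnt/fs1/myfs") and probe_path("/mnt/fs2/myfs") while one file server is down should show whether the stat on that mount returns an error quickly or never returns at all.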

When the physical CentOS server is rebooted while one of the file servers is down, NGINX runs fine. Can autofs be configured so that, when a file server is down, it simply drops the mount instead of hanging on every file operation?

How do I make sure NGINX keeps serving even if a mount goes down / hangs?
