Hello again.
Maxim Dounin Wrote:
-------------------------------------------------------
> Hello!
>
> On Thu, Feb 25, 2016 at 05:46:32AM -0500, vergil wrote:
>
> > vedranf Wrote:
> > -------------------------------------------------------
> > > Hello,
> > >
> > > I'm having an issue where nginx (1.8) cache manager suddenly just
> > > stops deleting content thus the disk soon ends up being full until
> > > I restart it by hand. After it is restarted, it works normally for a
> > > couple of days, but then it happens again. Cache has some 30-40k
> > > files, nothing huge. Relevant config lines are:
>
> [...]
>
> > We have the same problem, but i'm not sure, that this is caused by
> > often nginx restarts.
>
> This particular case was traced to segmentation faults, likely
> caused by 3rd party modules.
>
> [...]
>
> > Also, i think it's somehow related to write connection leak. (see
> > image link)
> >
> > https://s3.eu-central-1.amazonaws.com/drive-public-eu/nginx/betelgeuse_nginx_connections.PNG
>
> [...]
>
> > As you see write connections continuously grows. (When we had to
> > power off the machine it's reached ~60k).
> >
> > For counting nginx connections we use standard
> > http_stub_status_module.
> >
> > I think that nginx "reference counter" could be broken, because
> > total established TCP connection remains the same all the time.
>
> Writing connections will grow due to segmentation faults as well,
> so you are likely have the same problem. See basic
> recommendations in my initial answer in this threads.
I've made a custom nginx build using the latest version (1.9.13) without 3rd-party modules:
nginx -V
nginx version: nginx/1.9.13
built by gcc 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.1)
built with OpenSSL 1.0.1f 6 Jan 2014
TLS SNI support enabled
configure arguments: --with-cc-opt='-g -O2 -fPIE -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-Bsymbolic-functions -fPIE -pie -Wl,-z,relro -Wl,-z,now' --prefix=/usr/share/nginx --conf-path=/etc/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --http-scgi-temp-path=/var/lib/nginx/scgi --http-uwsgi-temp-path=/var/lib/nginx/uwsgi --with-debug --with-pcre-jit --with-ipv6 --with-http_ssl_module --with-http_stub_status_module --with-http_realip_module --with-http_auth_request_module --with-http_addition_module --with-http_dav_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_v2_module --with-http_sub_module --with-http_xslt_module --with-stream --with-stream_ssl_module --with-mail --with-mail_ssl_module --with-threads
Nothing changed: connections keep growing. The cache manager is still working and the disk has not filled up yet, but I think it's only a matter of 2-3 days.
The PIDs haven't changed since the start, and the log doesn't contain any "worker process exited ..." messages.
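For anyone who wants to track the same counter, here is a minimal sketch of parsing the plain-text page served by http_stub_status_module and pulling out the "Writing" value. The function name `parse_stub_status` and the `SAMPLE` text are illustrative assumptions (the numbers are placeholder values, not measurements from this server); fetching the page over HTTP is left out on purpose.

```python
import re

# Illustrative stub_status page body; the numbers are placeholders,
# not values taken from the server discussed in this thread.
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse a stub_status page body into a dict of integer counters."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    # The line holding accepts / handled / requests contains only three numbers.
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and totals and states):
        raise ValueError("unexpected stub_status output")
    return {
        "active": int(active.group(1)),
        "accepts": int(totals.group(1)),
        "handled": int(totals.group(2)),
        "requests": int(totals.group(3)),
        "reading": int(states.group(1)),
        "writing": int(states.group(2)),
        "waiting": int(states.group(3)),
    }


if __name__ == "__main__":
    stats = parse_stub_status(SAMPLE)
    print("writing connections:", stats["writing"])
```

Logging the "writing" value at a fixed interval and comparing it with the number of established TCP connections would show whether the counter drifts away from the real connection count.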
Regards,
Alexander.