Hello,
I'm having an issue where the nginx (1.8) cache manager suddenly stops deleting content, so the disk soon fills up and stays full until I restart nginx by hand. After a restart it works normally for a couple of days, and then it happens again. The cache holds some 30-40k files, nothing huge. The relevant config lines are:
proxy_cache_path /home/cache/ levels=2:2 keys_zone=cache:25m inactive=7d max_size=2705g use_temp_path=on;
proxy_temp_path /dev/shm/temp; # reduces parallel writes on the disk
proxy_cache_lock on;
proxy_cache_lock_age 10s;
proxy_cache_lock_timeout 30s;
proxy_ignore_client_abort on;
The server gets roughly 100 requests per second, and normally the cache manager deletes a couple of files every few seconds. When it gets stuck, however, the following is all it does for 20-30 minutes or more, i.e. there are no unlink calls at all (until I restart it and it rereads the on-disk cache):
...
epoll_wait(14, {}, 512, 1000) = 0
epoll_wait(14, {}, 512, 1000) = 0
epoll_wait(14, {}, 512, 1000) = 0
epoll_wait(14, {}, 512, 1000) = 0
gettid() = 11303
write(24, "2016/02/18 08:22:02 [alert] 11303#11303: ignore long locked inactive cache entry 380d3f178017bcd92877ee322b006bbb, count:1\n", 123) = 123
gettid() = 11303
write(24, "2016/02/18 08:22:02 [alert] 11303#11303: ignore long locked inactive cache entry 7b9239693906e791375a214c7e36af8e, count:24\n", 124) = 124
epoll_wait(14, {}, 512, 1000) = 0
...
I assume the alert above is due to relatively frequent nginx restarts and is benign. There's nothing else in the error log (except for occasional upstream timeouts). I'm aware this likely isn't enough information to debug the issue, but do you at least have some ideas about what might be causing it, or where to look? My wild guess is that the cache manager is waiting for some lock to be released, but the lock never gets released, so it just waits indefinitely.
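In case it helps narrow things down, the alerts could be tallied per cache key to see whether the same entries keep showing up while the manager is stuck. Below is a rough sketch of that, assuming the error log lines look exactly like the two alerts quoted above; the function name `long_locked_entries` is made up for illustration, and I am only guessing that `count` is the entry's reference count.

```python
import re

# Matches the alert format seen in the strace output above.
# Assumption: the key is a hex string and "count:" is followed by an integer.
ALERT_RE = re.compile(
    r"\[alert\] \d+#\d+: ignore long locked inactive cache entry "
    r"(?P<key>[0-9a-f]+), count:(?P<count>\d+)"
)


def long_locked_entries(lines):
    """Return {cache_key: highest count reported} for all matching alerts."""
    entries = {}
    for line in lines:
        match = ALERT_RE.search(line)
        if match is None:
            continue  # not a "long locked" alert, skip it
        key = match.group("key")
        count = int(match.group("count"))
        entries[key] = max(count, entries.get(key, 0))
    return entries
```

It can be fed an open log file directly, e.g. `long_locked_entries(open(path))` with `path` pointing at the nginx error log. If the same few keys show up with a count that never drops, that would at least be consistent with entries that stay locked forever.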
Thanks,
Vedran