November 12, 2010 04:36AM
Hello community!

I'm currently using nginx as a proxy cache in front of a backend where large files are stored, and I have a big blocking issue.
I'm using the following structure:

Client --- Cache1 --- Cache2 --- HTTP with large files

The problem I'm experiencing is on Cache1 and happens whenever a large file that is not yet cached is requested by hundreds of users.

The proxy_cache module will buffer the response to disk, writing it to the temporary proxy cache path, but, because of the excessive number of concurrent requests, it writes the same file hundreds of times until it is fully cached (and only then is it served from the cache).

This also means having 2000 files open for writing 4 GB each at the same time, which blocks the workers, triggers kernel panics and literally brings the machine down.
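For reference, Cache1 is configured along these lines (the paths, zone name, sizes and cache key below are placeholders rather than my exact values):

    proxy_cache_path  /var/spool/nginx/cache  levels=1:2  keys_zone=big:64m
                      max_size=500g  inactive=7d;
    proxy_temp_path   /var/spool/nginx/proxy_temp;

    server {
        listen 80;

        location / {
            proxy_pass        http://cache2.example.com;
            proxy_cache       big;
            proxy_cache_key   "http://$host$uri";
            proxy_cache_valid 200 7d;
        }
    }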

What I wanted to ask you is: is there a way to make nginx buffer the file only once, so that all workers can serve the still-uncached requests from that same buffer? In other words, so that the temp file is written only one time?

To work around this, I wrote a small Perl script (which I can share with you if needed) that basically does the following.

In nginx, using the Embedded Perl module, I open a file in /dev/shm and read it line by line looking for the proxy_cache_key.
If the key is already there, the handler sets the variable passed to proxy_no_cache to 1. If the key is not there, proxy_no_cache is set to 0 and the corresponding proxy_cache_key is appended as the last line of the /dev/shm/cache/keys file.

This causes nginx to use the cache only for the first request and to fetch the file directly from Cache2 for all subsequent requests.
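Stripped down, the Embedded Perl part looks more or less like this (a minimal sketch; the module name, file locations and key format are illustrative and assume the placeholder proxy_cache_key above). On the nginx side:

    perl_modules  /etc/nginx/perl;
    perl_require  keycheck.pm;
    perl_set      $skip_cache keycheck::handler;

    location / {
        proxy_pass      http://cache2.example.com;
        proxy_cache     big;
        proxy_no_cache  $skip_cache;
    }

And the handler itself:

    # keycheck.pm -- returns "1" (skip caching) when another request is
    # already caching this key, "0" (cache it) when we are the first one
    # and have just recorded the key.
    package keycheck;

    use strict;
    use warnings;
    use Fcntl qw(:flock O_RDWR O_CREAT);

    my $keys_file = '/dev/shm/cache/keys';

    sub handler {
        my $r = shift;

        # Assumes proxy_cache_key "http://$host$uri".
        my $host = $r->header_in('Host') || '';
        my $key  = 'http://' . $host . $r->uri;

        sysopen(my $fh, $keys_file, O_RDWR | O_CREAT)
            or return "0";                      # fail open: cache as usual
        flock($fh, LOCK_EX) or do { close($fh); return "0" };

        while (my $line = <$fh>) {
            chomp $line;
            if ($line eq $key) {                # already being cached
                close($fh);
                return "1";
            }
        }

        seek($fh, 0, 2);                        # first request: append key
        print {$fh} "$key\n";
        close($fh);
        return "0";
    }

    1;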

Moreover, to avoid situations where cache keys stay there forever, possibly preventing files from ever being cached, I have another small Perl script that runs outside nginx, opens the /dev/shm/cache/keys file and scans it line by line.
For each line it computes the MD5 hash and checks whether the file is in the cache folder. If it is, the line is removed from the keys file. If not, it opens all temporary cache files (I know this might sound bad, but I can't find a better way right now) and reads the second line, which contains something like "KEY: http://www.example.com/file.bin". If the KEY matches the key in the /dev/shm/cache/keys file, the line is kept there. If no temporary file contains that key, the key is removed from the keys file, so that the file will be cached without problems at the next request.
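Roughly, it does this (again a simplified sketch; the paths and the levels=1:2 layout are the same placeholders as above and would need adjusting):

    #!/usr/bin/perl
    # Drop keys from /dev/shm/cache/keys that are either already cached
    # or no longer being fetched (no matching temporary file).
    use strict;
    use warnings;
    use Digest::MD5 qw(md5_hex);
    use Fcntl qw(:flock);
    use File::Find;

    my $keys_file = '/dev/shm/cache/keys';
    my $cache_dir = '/var/spool/nginx/cache';        # proxy_cache_path
    my $temp_dir  = '/var/spool/nginx/proxy_temp';   # proxy_temp_path

    # Collect the temporary files currently being written.
    my @tmp_files;
    find(sub { push @tmp_files, $File::Find::name if -f }, $temp_dir);

    open(my $fh, '+<', $keys_file) or die "open $keys_file: $!";
    flock($fh, LOCK_EX) or die "lock $keys_file: $!";

    my @keep;
    LINE: while (my $key = <$fh>) {
        chomp $key;
        next LINE unless length $key;

        # With levels=1:2 the cached file lives at <c>/<bc>/<md5 of key>.
        my $md5  = md5_hex($key);
        my $path = join('/', $cache_dir, substr($md5, -1),
                             substr($md5, -3, 2), $md5);

        # Already fully cached: drop the key.
        next LINE if -f $path;

        # Still in flight? Keep the key only if some temp file's second
        # line is "KEY: <our key>".
        for my $tmp (@tmp_files) {
            open(my $tfh, '<', $tmp) or next;
            my $skip   = <$tfh>;                 # header line
            my $second = <$tfh>;
            close($tfh);
            next unless defined $second;
            if ($second =~ /^KEY:\s*\Q$key\E\s*$/) {
                push @keep, $key;
                next LINE;
            }
        }
        # Neither cached nor being fetched: the key is stale, drop it.
    }

    # Rewrite the keys file with only the keys still in flight.
    seek($fh, 0, 0);
    truncate($fh, 0);
    print {$fh} "$_\n" for @keep;
    close($fh);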

Do you have any better suggestions or patches that might fix this problem?

Please consider that I get thousands of concurrent connections (2000+), requesting a mix of large (mostly) and small files. I have tried playing with all the proxy_cache module settings, but had no luck.

Thank you in advance for any help!

Kind regards,

Paolo Iannelli