Re: Debugging Nginx Cache Misses: Hitting high number of MISS despite high proxy valid

May 14, 2018 11:10AM
Quintin,

I don’t know anything about your context, but your setup looks overly simplistic. Here are some things that I learned
painfully over a few years of supporting a high-traffic retail website:

1. Is this a website that's on the internet, and thus exposed to random queries from bots and scrapers that you can’t control?

2. For your cache misses, how long does your back-end take to build the pages in the best case, the typical case and the worst case?

3. You need to log everything that could feasibly affect the status of the site. For example, here’s a log configuration from one gnarly site that I worked on:

log_format main '$http_x_forwarded_for $http_true_client_ip $remote_addr - $remote_user [$time_local] $host "$request" '
'$status $body_bytes_sent $upstream_cache_status $cookie_jsessionid $http_akamai_country $cookie_e4x_country $cookie_e4x_currency "$http_referer" '
'"$http_user_agent" "$request_time"';
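For the narrower question of why a given request was a MISS, a smaller format that records what the backend sent back can help. This is only a sketch: the format name cache_debug and the log path are made up for illustration, while the variables themselves are standard nginx ones. Be aware that logging Set-Cookie values may write session identifiers to disk.

```nginx
# http block. "cache_debug" and the log path are hypothetical names.
log_format cache_debug '$remote_addr [$time_local] "$request" $status '
                       'cache=$upstream_cache_status '
                       'upstream_time=$upstream_response_time '
                       'set_cookie="$upstream_http_set_cookie" '
                       'cache_control="$upstream_http_cache_control" '
                       'expires="$upstream_http_expires"';

# Write a second log next to the normal one while you investigate.
access_log /var/log/nginx/cache_debug.log cache_debug;
```

Filtering that log for cache=MISS and comparing the request lines and response headers usually shows whether the misses come from varying URLs or from headers the backend sends.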

4. The first problem is your cache key: it includes $request_uri, which is the original URI
including all arguments. So you are already exposed to DoS requests, which could be unintentional,
as anyone can bust your cache by adding an extra parameter.

> proxy_cache_key "$scheme://$host$request_uri$do_not_cache";
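One way to narrow the key, sketched under the assumption that the backend ignores any argument it does not know about: key on $uri (the normalized path without arguments) plus only the arguments that really change the page. The argument names page and sort are hypothetical.

```nginx
# Sketch: only whitelisted arguments take part in the key.
# "page" and "sort" are hypothetical argument names; substitute your own.
proxy_cache_key "$scheme://$host$uri|$arg_page|$arg_sort|$do_not_cache";
```

With a key like this, ?random=1 and ?random=2 map to the same entry, and the order of the arguments in the URL no longer matters.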


5. Not caching requests from logged-in users is a very blunt tool. Is this a site where only administrative users are logged in?

Imagine a retail site that sells clothing. It’s possible that a dynamic page that lists all the red dresses is something
a logged-in user sees. Perhaps the page can be cached? But if there is one version of the page that shows 30 entries and another
that shows 60, then they need to be disambiguated by the cache key. Perhaps users can choose to see prices in Euro instead of USD?
Then this also belongs in the key. If I am an American vacationing in Paris then perhaps the default behavior should be to show me
Euro prices, based on the value of a cookie that the CDN sets. In that situation the customer may want to override this default behavior
and insist he sees USD prices. You can see how complex this can get.
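As a rough illustration of putting such a preference into the key without letting clients invent unlimited key values, a map can reduce the cookie to a short whitelist. The cookie name e4x_currency is the one from my log format above; the values EUR and GBP and the variable name $key_currency are assumptions for the example.

```nginx
# http block. Any cookie value not listed here collapses to USD,
# so the number of cache variants per page stays bounded.
map $cookie_e4x_currency $key_currency {
    default USD;
    EUR     EUR;   # assumed cookie value
    GBP     GBP;   # assumed cookie value
}

# Then, inside the caching location:
# proxy_cache_key "$uri|$key_currency";
```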

7. The default behavior is to not cache responses that contain a Set-Cookie header. Cache pollution, where one person’s personal data stored in a cookie is sent to someone else, could be much worse than a cache miss. But there are also settings where your backend is some legacy software that you don’t control,
and there the correct behavior isn’t to skip caching but to remove the Set-Cookie from the response and cache the response without it.
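A minimal sketch of that second approach, reusing the backend address and cache zone from your config. The /catalog/ path is hypothetical and stands for pages you are sure are the same for every visitor.

```nginx
location /catalog/ {                       # hypothetical path for shareable pages
    proxy_pass http://127.0.0.1:8000;
    proxy_cache staticfilecache;
    proxy_cache_valid 200 15m;
    proxy_ignore_headers Set-Cookie;       # cache the response despite the cookie
    proxy_hide_header Set-Cookie;          # never pass the backend's cookie on to a client
}
```

Only do this where the page body itself contains nothing personal, otherwise you are back to the pollution problem.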

8. How you prime the cache, monitor the cache, and clear the cache are crucial. Perhaps you have a script that uses curl or wget to retrieve a series of pages from your site. If the script is written naively then each step might cause a new servlet session to be created on the backend, producing a memory issue.
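For refreshing individual entries, one common trick is to let a trusted client bypass the cache, since a bypassed request is fetched from the backend and the fresh response is stored again. The sketch below assumes the refresh script runs on the nginx host itself; the header name X-Cache-Refresh and the variable names are made up.

```nginx
# http block: only the local machine may force a refresh.
geo $refresh_allowed {
    default   0;
    127.0.0.1 1;
}

# X-Cache-Refresh is a hypothetical header sent by the refresh script.
map "$refresh_allowed:$http_x_cache_refresh" $force_refresh {
    default 0;
    "1:1"   1;
}

server {
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_cache staticfilecache;
        # Non-empty, non-zero value: skip the cached copy and store the new response.
        proxy_cache_bypass $force_refresh;
    }
}
```

A script on the same host could then request a page with the header X-Cache-Refresh: 1 to replace its cached copy.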

9. This script is very useful for tracking the health of your cache:

https://github.com/perusio/nginx-cache-inspector

10. The if directive in nginx has some issues (see https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/ ).
When I need to use complex configuration logic I use OpenResty. OpenResty is a bundle that
combines the standard nginx with some additional Lua modules. It’s still standard nginx,
not forked or customized in any way.
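For a simple flag like your $do_not_cache, stock nginx can also do the job without if by using a map. This is a sketch that reuses the regex from your config; nothing else about your setup is assumed.

```nginx
# http block: replaces the if block that sets $do_not_cache.
map $http_cookie $do_not_cache {
    default 0;
    "~*comment_author_|wordpress_(?!test_cookie)|wp-postpass_" 1;
}
```

The variable is evaluated lazily when proxy_cache_bypass or the cache key references it, so the rest of the location can stay as it is.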

11. A very cut-down version of a cache config for one page follows:

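# Product arrays get cached
location ~ /shop/ {
    rewrite "/(.*)/2];ord.*$" $1 ;
    # Skip the cache entirely when either of these query arguments is present
    proxy_no_cache $arg_mid $arg_siteID;
    proxy_cache_bypass $arg_mid $arg_siteID;
    # Serve a stale entry while a fresh copy is being fetched
    proxy_cache_use_stale updating;
    default_type text/html;
    proxy_cache_valid 200 302 301 15m;
    # Cache even when the backend sends Set-Cookie or Cache-Control
    proxy_ignore_headers Set-Cookie Cache-Control;
    # Note: proxy_pass_header takes a header field name, so as written
    # this refers to a header literally named "off"
    proxy_pass_header off;
    # Never pass the backend's Set-Cookie on to clients
    proxy_hide_header Set-Cookie;
    expires 900s;
    add_header Last-Modified "";
    add_header ETag "";
    # Build cache key
    # (set_if_empty comes from the set-misc module bundled with OpenResty)
    set $e4x_currency $cookie_e4x_currency;
    set_if_empty $e4x_currency 'USD';
    set $num_items $cookie_EndecaNumberOfItems;
    set_if_empty $num_items 'LOW';
    proxy_cache_key "$uri|$e4x_currency|$num_items";
    proxy_cache product_arrays;
    # Add Canonical URL string
    set $folder_id $arg_FOLDER%3C%3Efolder_id;
    set $canonical_url "http://$http_host$uri";
    add_header Link "<$canonical_url>; rel=\"canonical\"";
    proxy_pass http://apache$request_uri;
}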


This snippet shows a key made of three parts. The real version has seven parts.

Good luck!

Peter


> On 14 May 2018, at 12:06 AM, Quintin Par <quintinpar@gmail.com> wrote:
>
>
> Thanks all for the response. Michael, I am going to add those header ignores.
>
> Still puzzled by the large number of MISSEs and I’ve no clue why they are happening. Leads appreciated.
>
>
>
>
> - Quintin
>
> On Sun, May 13, 2018 at 6:12 PM, c0nw0nk <nginx-forum@forum.nginx.org> wrote:
> You know you can DoS sites with Cache MISS via switching up URL params and
> arguements.
>
> Examples :
>
> HIT :
> index.php?var1=one&var2=two
> MISS :
> index.php?var2=two&var1=one
>
> MISS :
> index.php?random=1
> index.php?random=2
> index.php?random=3
> etc etc
>
> Inserting random arguements to URL's will cause cache misses and changing
> the order of existing valid URL arguements will also cause misses.
>
> Cherian Thomas Wrote:
> -------------------------------------------------------
> > Thanks for this Michael.
> >
> >
> >
> > This is so surprising. If someone decides to Dos and crawls the website
> > with a rogue header, this will essentially bypass the cache and put a
> > strain on the website. In fact, I was hit by a dos attack that’s when I
> > started looking at logs and realized the large number of MISSes.
> >
> >
> >
> > Can someone please help?
> >
> >
> > - Cherian
> >
> > On Sat, May 12, 2018 at 12:01 PM, Friscia, Michael <michael.friscia@yale.edu> wrote:
> >
> > > I'm not sure if this will help, but I ignore/hide a lot, this is in my
> > > config
> > >
> > >
> > > proxy_ignore_headers X-Accel-Expires Expires Cache-Control Set-Cookie;
> > > proxy_hide_header X-Accel-Expires;
> > > proxy_hide_header Pragma;
> > > proxy_hide_header Server;
> > > proxy_hide_header Request-Context;
> > > proxy_hide_header X-Powered-By;
> > > proxy_hide_header X-AspNet-Version;
> > > proxy_hide_header X-AspNetMvc-Version;
> > >
> > >
> > > I have not experienced the problem you mention, I just thought I would
> > > offer my config.
> > >
> > >
> > > ___________________________________________
> > >
> > > Michael Friscia
> > >
> > > Office of Communications
> > >
> > > Yale School of Medicine
> > >
> > > (203) 737-7932 – office
> > >
> > > (203) 931-5381 – mobile
> > >
> > > http://web.yale.edu
> > >
> > >
> > > ------------------------------
> > > *From:* nginx <nginx-bounces@nginx.org> on behalf of Quintin Par <quintinpar@gmail.com>
> > > *Sent:* Saturday, May 12, 2018 1:32 PM
> > > *To:* nginx@nginx.org
> > > *Subject:* Re: Debugging Nginx Cache Misses: Hitting high number of MISS
> > > despite high proxy valid
> > >
> > >
> > > That’s the tricky part. These MISSes are intermittent. Whenever I run curl
> > > I get HITs but I end up seeing a lot of MISS in the logs.
> > >
> > >
> > >
> > > How do I log these MiSSes with the reason? I want to know what headers
> > > ended up bypassing the cache.
> > >
> > >
> > >
> > > Here’s my caching config
> > >
> > >
> > >
> > > proxy_pass http://127.0.0.1:8000;
> > >
> > > proxy_set_header X-Real-IP $remote_addr;
> > >
> > > proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
> > >
> > > proxy_set_header X-Forwarded-Proto https;
> > >
> > > proxy_set_header X-Forwarded-Port 443;
> > >
> > >
> > >
> > > # If logged in, don't cache.
> > >
> > > if ($http_cookie ~* "comment_author_|wordpress_(?!test_cookie)|wp-postpass_") {
> > >
> > > set $do_not_cache 1;
> > >
> > > }
> > >
> > > proxy_cache_key "$scheme://$host$request_uri$do_not_cache";
> > >
> > > proxy_cache staticfilecache;
> > >
> > > add_header Cache-Control public;
> > >
> > > proxy_cache_valid 200 120d;
> > >
> > > proxy_hide_header "Set-Cookie";
> > >
> > > proxy_ignore_headers "Set-Cookie";
> > >
> > > proxy_ignore_headers "Cache-Control";
> > >
> > > proxy_hide_header "Cache-Control";
> > >
> > > proxy_pass_header X-Accel-Expires;
> > >
> > >
> > >
> > > proxy_set_header Accept-Encoding "";
> > >
> > > proxy_ignore_headers Expires;
> > >
> > > add_header X-Cache-Status $upstream_cache_status;
> > >
> > > proxy_cache_use_stale timeout;
> > >
> > > proxy_cache_bypass $arg_nocache $do_not_cache;
> > > - Quintin
> > >
> > >
> > > On Sat, May 12, 2018 at 10:29 AM Lucas Rolff <lucas@lucasrolff.com> wrote:
> > >
> > > It can be as simple as doing a curl to your “origin” url (the one you
> > > proxy_pass to) for the files you see that gets a lot of MISS’s – if there’s
> > > odd headers such as cookies etc, then you’ll most likely experience a bad
> > > cache if your nginx is configured to not ignore those headers.
> > >
> > >
> > >
> > > *From: *nginx <nginx-bounces@nginx.org> on behalf of Quintin Par <quintinpar@gmail.com>
> > > *Reply-To: *"nginx@nginx.org" <nginx@nginx.org>
> > > *Date: *Saturday, 12 May 2018 at 18.26
> > > *To: *"nginx@nginx.org" <nginx@nginx.org>
> > > *Subject: *Debugging Nginx Cache Misses: Hitting high number of MISS
> > > despite high proxy valid
> > >
> > >
> > >
> > >
> > > My proxy cache path is set to a very high size
> > >
> > >
> > >
> > > proxy_cache_path /var/lib/nginx/cache levels=1:2
> > > keys_zone=staticfilecache:180m max_size=700m;
> > >
> > > and the size used is only
> > >
> > >
> > >
> > > sudo du -sh *
> > >
> > > 14M cache
> > >
> > > 4.0K proxy
> > >
> > > Proxy cache valid is set to
> > >
> > >
> > >
> > > proxy_cache_valid 200 120d;
> > >
> > > I track HIT and MISS via
> > >
> > >
> > >
> > > add_header X-Cache-Status $upstream_cache_status;
> > >
> > > Despite these settings I am seeing a lot of MISSes. And this is for pages
> > > I intentionally ran a cache warmer an hour ago.
> > >
> > >
> > >
> > > How do I debug why these MISSes are happening? How do I find out if the
> > > miss was due to eviction, expiration, some rogue header etc? Does Nginx
> > > provide commands for this?
> > >
> > >
> > >
> > > - Quintin
> > > _______________________________________________
> > > nginx mailing list
> > > nginx@nginx.org
> > > http://mailman.nginx.org/mailman/listinfo/nginx
>
> Posted at Nginx Forum: https://forum.nginx.org/read.php?2,279764,279771#msg-279771