Re: Failed disk + proxy_intercept_errors

Maxim Dounin
February 13, 2020 10:00AM
Hello!

On Wed, Feb 12, 2020 at 10:36:54AM -0500, chocholo3 wrote:

> Hi,
> In our deployment we do have configuration of proxy cache with multiple hard
> drives. Because of performance we don't have any RAID on these devices. That
> means we have to handle even a situation when drive dies, sometime.
>
> After disk failure of proxy_cache_path device nginx usually starts serving
> users with http500. So I've had an idea we may use proxy_intercept_errors
> but I end up with inconsistent state: ~60 files are handled as expected, but
> after that every connection is terminated prematurely without a single byte
> sent. In access.log there is http 200.
>
> I broke just ext4 FS (dd if=/dev/zero of=/dev/sdc bs=1k count=$((1024*100)))
> and I'm using nginx 1.17.7 on Linux

[...]

> Am I doing something wrong or is this a bug? Because of the inconsistency I
> tend to the 2nd. But I'm not sure at all :-)

First of all, the proxy_intercept_errors directive is only
relevant to errors returned by upstream servers. When the
error is generated by nginx itself, only the error_page directives
are relevant: as long as you have error_page 500 configured,
nginx will appropriately redirect processing of errors with code
500.
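
As a rough illustration of the distinction, here is a minimal
configuration sketch. The cache path, zone name, upstream address,
and error page location are all hypothetical and not taken from the
original configuration; only the directives themselves are standard
nginx.

```nginx
# Sketch only: paths, zone name, and upstream address are assumptions.
http {
    # One cache on a single (non-RAID) disk.
    proxy_cache_path /mnt/disk1/cache keys_zone=disk1:10m;

    server {
        listen 80;

        # Handles 500 errors, including those generated by nginx
        # itself, e.g. when reading a cache file fails early enough.
        error_page 500 /500.html;

        location / {
            proxy_pass http://127.0.0.1:8080;
            proxy_cache disk1;

            # Only affects responses with status >= 300 returned by
            # the upstream; it does not change how nginx-generated
            # errors are handled.
            proxy_intercept_errors on;
        }

        location = /500.html {
            # Serve the error page from a disk other than the cache
            # disk, so it is still readable when the cache disk fails.
            root /usr/share/nginx/html;
            internal;
        }
    }
}
```

With something like this, error_page can only take effect for
errors detected before the response headers have been processed.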

As for the inconsistency you observe, this depends on the exact
moment the error happens. For some errors nginx might be able to
generate a friendly 500, for some it won't and will close the
connection as soon as an error happens.

For example, if an error happens when reading the cache header,
nginx should be able to return 500. But if an error happens later,
when reading the response body from the cache file, when the
response headers are already processed (and either sent to the
client or buffered due to postpone_output), it certainly won't be
possible to return a friendly error page, so nginx will close the
connection.

Given the nature of your test, I suspect that the inconsistency
you observe is due to errors happening at different moments.

In real life, using "error_page 500" is certainly not enough
to protect users from broken responses due to failing disks.
Further, I don't think there is a way to fully protect users, except
by providing redundancy at the disk level. For example, consider
an error when reading some response body data from disk, with 1GB
of the response body already sent to the client. There is more or
less nothing to be done here, and the only option is to close the
connection.

--
Maxim Dounin
http://mdounin.ru/
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx