J Carter
April 03, 2023 01:16AM
Hello Maxim,

Thank you for the feedback.

I think the points you made are fair - a new directive is possibly
overkill for
this issue.

A single peer going through all of it's (many) cached connections
when there is is a non-asynchronous close connection error is where I've
personally seen the current behavior be most problematic - so this make
and your direction would still resolve this.

Point 3 would increase overhead/add latency for the average case, as a new
connection would need to be created (assuming additional cached connections
are likely to exist, and to not fail). I'd prefer to avoid that unless
there are strong
objections, although implementing this would be fairly trivial if there are.

However, for point 2 -  I do have one question that I'd like your
opinion on
regarding multi-peer upstreams - should a (suspected) asynchronous close
event count as a failure in terms of the logic in upstream_round_robin's

In lieu of double trying for multi-peer, it seems like it may be
desirable to avoid
counting these as 'real' failures given all the effects imparted through
health checks  - such as triggering a failure increment (and/or timeout)
as well as
adjusting weights downwards if the weighs are set for that upstream peer.

On the other hand, the ambiguity in the cause of the error means not
failures at all for connection errors that involve a cached connection.

On 03/04/2023 02:42, Maxim Dounin wrote:
> Hello!
> On Sun, Apr 02, 2023 at 06:57:03PM +0100, J Carter wrote:
>> I've also attached an example nginx.conf and test script that
>> simulates the asynchronous close events.
>> Two different test cases can be found within that, one with path
>> /1 for single peer upstream and /2 for multi-peer.
>> You should see 2 upstream addresses repeated in a row
>> per-upstream-server in the access log by default, as it fails
>> through the cached connections & next performs next upstream
>> tries.
>> Any feedback would be appreciated.
>> # HG changeset patch
>> # User jordanc.carter@outlook.com
>> # Date 1680457073 -3600
>> # Sun Apr 02 18:37:53 2023 +0100
>> # Node ID 9ec4d7a8cdf6cdab00d09dff75fa6045f6f5533f
>> # Parent 5f1d05a21287ba0290dd3a17ad501595b442a194
>> Added keepalive_async_fails command to keepalive load balancer module.
>> This value determines the number suspected keepalive race events
>> per-upstream-try that will be tolerated before a subsequent network connection
>> error is considered a true failure.
> It looks like you are trying to address issues with re-trying
> requests to upstream servers when errors not distinguishable from
> asynchronous close events happens, as outlined in ticket #2421
> (https://trac.nginx.org/nginx/ticket/2421).
> I don't think that introducing another directive for this is a
> good idea. Rather, I would consider modifying the behaviour to
> better behave in case of such errors. In particular, the
> following approaches look promising:
> - Allow at most one additional try.
> - Allow an additional try only if there is a single peer, so
> normally request is not going to be retried at all.
> - Don't use cached connections if an error considered to be an
> asynchronous close event happens.
> Given that since nginx 1.15.3 we have keepalive_timeout in the
> upstream blocks to mitigate potential asynchronous close events,
> even something simple like combination of (1) and (2) might be
> good enough. With (3), things are going to be as correct as it
> can be.
> [...]
nginx-devel mailing list
Subject Author Views Posted

[PATCH] Added keepalive_async_fails command

J Carter 412 April 02, 2023 01:58PM

Re: [PATCH] Added keepalive_async_fails command

J Carter 136 April 02, 2023 02:32PM

Re: [PATCH] Added keepalive_async_fails command

Maxim Dounin 109 April 02, 2023 09:44PM

Re: [PATCH] Added keepalive_async_fails command

J Carter 221 April 03, 2023 01:16AM

Sorry, you do not have permission to post/reply in this forum.

Online Users

Guests: 332
Record Number of Users: 8 on April 13, 2023
Record Number of Guests: 421 on December 02, 2018
Powered by nginx      Powered by FreeBSD      PHP Powered      Powered by MariaDB      ipv6 ready