Welcome! Log In Create A New Profile

Advanced

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Sergey Kandaurov
October 17, 2023 07:26AM
> On 11 Oct 2023, at 02:56, Maxim Dounin <mdounin@mdounin.ru> wrote:
>
> Hello!
>
> On Thu, Oct 05, 2023 at 10:51:26AM +0900, Yusuke Nojima wrote:
>
>> Thank you for your comment!
>>
>>> Could you please provide some more details about the use case,
>>> such as how locations are used, and what is the approximate number
>>> of locations being used?
>>
>> Our team provides development environments to our company's engineers and QA.
>> In this environment, engineers and QA can freely create VMs and deploy
>> applications on them.
>>
>> Our nginx has the role of routing requests from the internet to all
>> applications deployed in this environment.
>> Additionally, it allows setting IP address restrictions, BASIC
>> authentication, TLS client authentication, and other configurations
>> for each application.
>>
>> To implement these requirements, we generate a location for each application.
>> Currently, there are approximately 40,000 locations in our environment.
>
> Thank you for the details. Such configuration looks somewhat
> sub-optimal, but understandable for a development / test
> environment. And certainly 40k locations is a lot for the sorting
> algorithm currently used.
>
>>> Further, since each location contains configuration for
>>> all modules, such configurations are expected to require a lot of
>>> memory
>>
>> Each of our nginx processes was consuming 5GB of memory in terms of
>> resident size.
>> This is not a problem as our servers have sufficient memory.
>>
>>> Rather, I would suggest recursive top-bottom merge sort implementation
>>> instead, which is much simpler and uses stack as temporary storage
>>> (so it'll naturally die if there will be a queue which requires
>>> more space for sorting than we have).

For the record, in my tests on M1 sorting 26^3 locations fit into
32k stack size (16k stack size renders the environment unusable).
Judging by this (unscientific) test, running out of stack should
not be a practicable issue.

>>>
>>> Please take a look if it works for you:
>>
>> I think this implementation is simple and easy to understand.
>> Although the number of traversals of the list will increase compared
>> to bottom-up, it will not affect the order.
>> I believe this will provide sufficient optimization in terms of speed.
>
> Thanks for looking. In my limited testing, it is slightly faster
> than your bottom-up implementation (and significantly faster than
> the existing insertion sort when many locations are used).
>
> Below is the full patch (code unchanged), I'll commit it as soon
> as some other nginx developer will review it.
>
> # HG changeset patch
> # User Maxim Dounin <mdounin@mdounin.ru>
> # Date 1696977468 -10800
> # Wed Oct 11 01:37:48 2023 +0300
> # Node ID b891840852ee5cc823eee1769d092ab50928919f
> # Parent cdda286c0f1b4b10f30d4eb6a63fefb9b8708ecc
> Core: changed ngx_queue_sort() to use merge sort.
>
> This improves nginx startup times significantly when using very large number
> of locations due computational complexity of the sorting algorithm being

due to

> used (insertion sort is O(n*n) on average, while merge sort is O(n*log(n))).

nitpicking: E2MANYPARENS

Using a colon might looks better (I don't mind though):
used: insertion sort is O(n*n) on average, while merge sort is O(n*log(n)).

> In particular, in a test configuration with 20k locations total startup
> time is reduced from 8 seconds to 0.9 seconds.
>
> Prodded by Yusuke Nojima,
> https://mailman.nginx.org/pipermail/nginx-devel/2023-September/NUL3Y2FPPFSHMPTFTL65KXSXNTX3NQMK.html

I like the change, please commit.

The thing to keep in mind is that it pessimizes the best case
of sorted locations, which is O(n) with the insertion sort.
Though given that both old and new algorithms give relatively
good numbers for the best case (and it is hard to get a
noticeable startup delay with very large number of locations
in practice using merge sort), especially compared to the worst
case of sorting perfectly reverse sorted locations with the
insertion sort, I believe this is acceptable.

>
> diff --git a/src/core/ngx_queue.c b/src/core/ngx_queue.c
> --- a/src/core/ngx_queue.c
> +++ b/src/core/ngx_queue.c
> @@ -9,6 +9,10 @@
> #include <ngx_core.h>
>
>
> +static void ngx_queue_merge(ngx_queue_t *queue, ngx_queue_t *tail,
> + ngx_int_t (*cmp)(const ngx_queue_t *, const ngx_queue_t *));
> +
> +
> /*
> * find the middle queue element if the queue has odd number of elements
> * or the first element of the queue's second part otherwise
> @@ -45,13 +49,13 @@ ngx_queue_middle(ngx_queue_t *queue)
> }
>
>
> -/* the stable insertion sort */
> +/* the stable merge sort */
>
> void
> ngx_queue_sort(ngx_queue_t *queue,
> ngx_int_t (*cmp)(const ngx_queue_t *, const ngx_queue_t *))
> {
> - ngx_queue_t *q, *prev, *next;
> + ngx_queue_t *q, tail;
>
> q = ngx_queue_head(queue);
>
> @@ -59,22 +63,44 @@ ngx_queue_sort(ngx_queue_t *queue,
> return;
> }
>
> - for (q = ngx_queue_next(q); q != ngx_queue_sentinel(queue); q = next) {
> + q = ngx_queue_middle(queue);
> +
> + ngx_queue_split(queue, q, &tail);
> +
> + ngx_queue_sort(queue, cmp);
> + ngx_queue_sort(&tail, cmp);
> +
> + ngx_queue_merge(queue, &tail, cmp);
> +}
>
> - prev = ngx_queue_prev(q);
> - next = ngx_queue_next(q);
>
> - ngx_queue_remove(q);
> +static void
> +ngx_queue_merge(ngx_queue_t *queue, ngx_queue_t *tail,
> + ngx_int_t (*cmp)(const ngx_queue_t *, const ngx_queue_t *))
> +{
> + ngx_queue_t *q1, *q2;
> +
> + q1 = ngx_queue_head(queue);
> + q2 = ngx_queue_head(tail);
>
> - do {
> - if (cmp(prev, q) <= 0) {
> - break;
> - }
> + for ( ;; ) {
> + if (q1 == ngx_queue_sentinel(queue)) {
> + ngx_queue_add(queue, tail);
> + break;
> + }
> +
> + if (q2 == ngx_queue_sentinel(tail)) {
> + break;
> + }
>
> - prev = ngx_queue_prev(prev);
> + if (cmp(q1, q2) <= 0) {
> + q1 = ngx_queue_next(q1);
> + continue;
> + }
>
> - } while (prev != ngx_queue_sentinel(queue));
> + ngx_queue_remove(q2);
> + ngx_queue_insert_before(q1, q2);
>
> - ngx_queue_insert_after(prev, q);
> + q2 = ngx_queue_head(tail);
> }
> }
> diff --git a/src/core/ngx_queue.h b/src/core/ngx_queue.h
> --- a/src/core/ngx_queue.h
> +++ b/src/core/ngx_queue.h
> @@ -47,6 +47,9 @@ struct ngx_queue_s {
> (h)->prev = x
>
>
> +#define ngx_queue_insert_before ngx_queue_insert_tail
> +
> +
> #define ngx_queue_head(h) \
> (h)->next
>
>

[..]

--
Sergey Kandaurov
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel
Subject Author Views Posted

[PATCH] Improve performance when starting nginx with a lot of locations

Yusuke Nojima 458 September 22, 2023 03:00AM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Maxim Dounin 133 September 29, 2023 11:40PM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Yusuke Nojima 114 October 04, 2023 09:52PM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Илья Шипицин 138 October 05, 2023 02:18AM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Yusuke Nojima 110 October 05, 2023 10:40PM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Maxim Dounin 104 October 10, 2023 06:58PM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Yusuke Nojima 110 October 11, 2023 06:34AM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Sergey Kandaurov 126 October 17, 2023 07:26AM

Re: [PATCH] Improve performance when starting nginx with a lot of locations

Maxim Dounin 146 October 17, 2023 10:04PM



Sorry, you do not have permission to post/reply in this forum.

Online Users

Guests: 99
Record Number of Users: 8 on April 13, 2023
Record Number of Guests: 500 on July 15, 2024
Powered by nginx      Powered by FreeBSD      PHP Powered      Powered by MariaDB      ipv6 ready