Welcome! Log In Create A New Profile

Advanced

Re: [PATCH] Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Maxim Dounin
December 08, 2023 11:28PM
Hello!

Thank you for the patch.
Some comments and questions below.

On Wed, Dec 06, 2023 at 10:06:57AM -0600, julio.suarez@foss.arm.com wrote:

> # HG changeset patch
> # User Julio Suarez <julio.suarez@arm.com>
> # Date 1701877879 21600
> # Wed Dec 06 09:51:19 2023 -0600
> # Node ID 53d289b8676fc678ca90e02ece174300a3631f79
> # Parent f366007dd23a6ce8e8427c1b3042781b618a2ade
> Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Nitpicking:

Added ISB as ngx_cpu_pause() for aarch64.

>
> For aarch64 (A.K.A. Arm64), ngx_cpu_pause() evaluates to empty by the
> GCC preprocessor. This results in an empty for loop in the lock
> check code in ngx_rwlock.c/ngx_spinlock.c (a bug).
> Thus, on Arm CPUs, there is no wait between checks of a lock.

Could you please clarify what do you mean by "a bug"? An empty
ngx_cpu_pause() is certainly not a bug, it's just a lack of a more
optimal solution for the particular architecture.

> When a delay like the PAUSE that is there for x86 is added,
> there is a 2-5% increase in the number of requests/sec Arm CPUs
> can achieve under high load.

Could you please provide some details on how did you get these
numbers?

> Currently, options for a spin lock delay element
> are YIELD and ISB. YIELD is assembled into a NOP for most Arm CPUs.
> YIELD is for Arm implementations with SMT.
> Most Arm implementations are single core - single thread (non-SMT).
> Thus, this commit uses ISB (Instruction Barrier Sync) since it
> will provide a slightly longer delay than a NOP.
>
> Other projects that implement spin locks have used ISB as
> the delay element which has yielded performance gains as well.

Looking through various open source projects, I'm seeing the
following uses on arm64 CPUs:

FreeBSD, sys/arm64/include/cpu.h:

#define cpu_spinwait() __asm __volatile("yield" ::: "memory")

FreeBSD, lib/libthr/arch/aarch64/include/pthread_md.h:

#define CPU_SPINWAIT

Linux, arch/arm64/include/asm/vdso/processor.h:

static inline void cpu_relax(void)
{
asm volatile("yield" ::: "memory");
}

The only popular project I was able to find which uses ISB is
Rust:

https://github.com/rust-lang/rust/commit/c064b6560b7ce0adeb9bbf5d7dcf12b1acb0c807

From the commit log it looks like it mostly focuses on the delay
introduced by the instruction, ignoring other effects. In
particular, YIELD is expected to be more friendly for various
emulation environments, see Linux commit here:

https://github.com/torvalds/linux/commit/1baa82f48030f38d1895301f1ec93acbcb3d15db

Overall, the YIELD instruction seems to be better suited and
specifically designed for the task in question, as per the
documentation
(https://developer.arm.com/documentation/ddi0596/2021-12/Base-Instructions/YIELD--YIELD-):

: YIELD is a hint instruction. Software with a multithreading
: capability can use a YIELD instruction to indicate to the PE that
: it is performing a task, for example a spin-lock, that could be
: swapped out to improve overall system performance. The PE can use
: this hint to suspend and resume multiple software threads if it
: supports the capability.

Please also note that ngx_cpu_pause() is only used on
multiprocessor systems: the code checks if (ngx_ncpu > 1) before
using it.

>
> Last, ISB is not an architecturally defined solution
> for short spin lock delays.
> A candidate for a short delay solution is the WFET
> instruction (Wait For Event with Timeout).
> However, there aren't any Arm implementations in the market that
> support this instruction yet. When that occurs,
> WFET can be tested as a replacement for ISB. Until then,
> ISB will do.
>
> diff -r f366007dd23a -r 53d289b8676f src/os/unix/ngx_atomic.h
> --- a/src/os/unix/ngx_atomic.h Tue Nov 14 15:26:02 2023 +0400
> +++ b/src/os/unix/ngx_atomic.h Wed Dec 06 09:51:19 2023 -0600
> @@ -66,6 +66,8 @@
>
> #if ( __i386__ || __i386 || __amd64__ || __amd64 )
> #define ngx_cpu_pause() __asm__ ("pause")
> +#elif ( __aarch64__ )
> +#define ngx_cpu_pause() __asm__ ("isb" ::: "memory")

Could you please clarify reasons for the "memory" clobber here?

> #else
> #define ngx_cpu_pause()
> #endif

--
Maxim Dounin
http://mdounin.ru/
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel
Subject Author Views Posted

[PATCH] Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Anonymous User 333 December 06, 2023 11:08AM

Re: [PATCH] Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Maxim Dounin 63 December 08, 2023 11:28PM

Re: [PATCH] Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Julio Suarez 58 December 11, 2023 06:10PM

Re: [PATCH] Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Maxim Dounin 56 December 12, 2023 07:18PM

Re: [PATCH] Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Julio Suarez 60 December 13, 2023 05:18PM

Re: [PATCH] Added asm ISB as asm pause for ngx_cpu_pause() for aarch64

Maxim Dounin 77 December 13, 2023 11:56PM



Sorry, you do not have permission to post/reply in this forum.

Online Users

Guests: 211
Record Number of Users: 8 on April 13, 2023
Record Number of Guests: 421 on December 02, 2018
Powered by nginx      Powered by FreeBSD      PHP Powered      Powered by MariaDB      ipv6 ready