Welcome! Log In Create A New Profile

Advanced

Re: [PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3)

Alejandro Colomar
November 06, 2022 05:52PM
Hello Maxim!

On 11/5/22 03:39, Maxim Dounin wrote:
> The ngx_memcpy() macro was introduced in 107:b5be4b0448d3
> (nginx-0.0.1-2003-07-01-19:00:03 import) as an alias to memcmp(),
> with the following explanation why it's just an alias:
>
> +/* msvc and icc compile memcmp() to inline loop */
> +#define ngx_memcmp memcmp
>
> This was along with the first use of memcmp in nginx, so clearly
> the "ngx_" prefix is not an accident.

Hmm, interesting point.

>
> Indeed, the prefix is used in multiple places to provide optimized
> or instrumented functions, even if normally they map to generally
> available functions such as memcmp(). For example, ngx_memcpy()
> usually maps to memcpy(), but instrumented to detect large copies
> if NGX_MEMCPY_LIMIT is defined.

Yeah, I can understand a wrapper over memcpy() and some others for those reasons.

These days of heavily optimized assembly specializations in libc, it's not so
easy to outperform the libc implementation, but it might still be interesting to
check when there are large bottlenecks.

memcmp(3) is likely to be less of a candidate for instrumenting and even less
for optimization, but might be useful to have it around if its already there. I
would not write a wrapper if it didn't exist, but probably I wouldn't remove it
now that it exists. That makes some sense to me.


> Even if the particular function
> is not instrumented, the prefix is important to make it possible
> to change things without huge modifications of the code, like the
> one in your patch. Further, the prefix is also important for code
> consistency.

You may be interested in replacing these few that I just found, for added
consistency:

src/core/ngx_proxy_protocol.c:397:
memcpy(&src_sockaddr.sockaddr_in.sin_addr, in->src_addr, 4);
src/core/ngx_proxy_protocol.c:401:
memcpy(&dst_sockaddr.sockaddr_in.sin_addr, in->dst_addr, 4);
src/core/ngx_proxy_protocol.c:424:
memcpy(&src_sockaddr.sockaddr_in6.sin6_addr, in6->src_addr, 16);
src/core/ngx_proxy_protocol.c:428:
memcpy(&dst_sockaddr.sockaddr_in6.sin6_addr, in6->dst_addr, 16);

> And, last but not least, it makes it possible to add
> casts if needed on some platforms
> (or remove them if they are
> considered wrong).
>
> In particular, casts to (const char *) for ngx_memcmp() were added
> in 2007 by this commit (nginx 0.6.18):
>
> changeset: 1648:89a47f19b9ec
> user: Igor Sysoev <igor@sysoev.ru>
> date: Fri Nov 23 17:00:11 2007 +0000
> summary: update ngx_memcmp()
>
> Unfortunately, this commit contains no details about the
> particular system which needed the cast, though it looks like
> there were some at least at that time.

That date can make sense. 'void *' implementations were added after ANSI/ISO
C89. Systems designed before that year, would mostly have 'char *'. Since some
systems live up to 20+ years, by 2007 there would still be some pre-C89 systems
around, which nginx had to support with the casts.

Now in 2022 it's been 33 years after ISO C89, so all systems that lack 'void *'
are officially dead. Nginx might still support some dead systems, though, so it
merits a bit more investigation to confirm.

Since memcmp(3) comes implemented together with the other mem...(3) functions,
checking those should help:

$ grepc ngx_memset
./src/core/ngx_string.h:89:
#define ngx_memset(buf, c, n) (void) memset(buf, c, n)
$ grepc ngx_cpymem
./src/core/ngx_string.h:97:
#define ngx_cpymem(dst, src, n) (((u_char *) ngx_memcpy(dst, src, n)) + (n))


./src/core/ngx_string.h:107:
#define ngx_cpymem(dst, src, n) (((u_char *) memcpy(dst, src, n)) + (n))
$ grepc ngx_memmove
./src/core/ngx_string.h:143:
#define ngx_memmove(dst, src, n) (void) memmove(dst, src, n)


These have no casts, so we can bet that if those work without a cast, it's
because they are implemented in the system with 'void *'. However, let's
confirm that; after all, it could be that we only call them with types
compatible with 'char *'.

`grep -rn ngx_memcpy.*data.*len` should find cases where ngx_str_t is being used
with ngx_str_t. It finds, among others, the following line:

src/stream/ngx_stream_upstream_zone_module.c:300:
ngx_memcpy(dst->server.data, src->server.data, src->server.len);

ngx_str_t uses the incompatible 'u_char *', so a 'char *' memcpy(3) wouldn't
work without a cast, so memcpy(3) definitely must be implemented with 'void *',
which implies that all mem...(3) functions, including memcmp(3), are, with a
very strong certainty, implemented following ISO C prototypes (i.e., they use
'void *').

Just to confirm, both 'dst' and 'src' are of type
'ngx_stream_upstream_rr_peer_t', which is implemented as:

$ grepc ngx_stream_upstream_rr_peer_t
./src/stream/ngx_stream_upstream_round_robin.h:17:
typedef struct ngx_stream_upstream_rr_peer_s ngx_stream_upstream_rr_peer_t;

$ grepc ngx_stream_upstream_rr_peer_s
./src/stream/ngx_stream_upstream_round_robin.h:19:
struct ngx_stream_upstream_rr_peer_s {
[...]
ngx_str_t server;

[...]
};

So that 'server' is definitely a ngx_str_t, which confirms that there's u_char*
involved.

>
> Similar casts are used by almost all nginx string macro
> definitions, since most C library functions accept "char *", while
> nginx uses "u_char *" for strings, and this results in useless
> warnings.

That's true of str...(3) functions from <string.h>, but the mem...(3) functions
from <string.h> use 'void *', so conversions are automatic, and casts are
unnecessary (after ANSI/ISO C89; previously, K&R C did use 'char *').

> While it might be seen as something better to avoid,

Even if it's a bit off-topic, I'm very curious about the reason for using
u_char. It definitely requires a lot of extra work compared to 'char *': casts,
type-safety, reviewing that code just works when workarounding/disabling the
compiler warnings. I'm guessing it was also some workaround for broken old
implementations and it has just continued like that for consistency, but am
curious if there are other better reasons. Certainly, ASCII characters behave
well (at least nowadays) independently of the signedness of char, and usually
one doesn't do arithmetic with characters in strings.

> the only alternative would be to provide wrapper functions, which
> might not be a good idea for performance and readability reasons.

Yeah, GNU always_inline or C99 inline functions could be better than macros for
performance, but those are not always available for systems supported by nginx,
so macros are the way to go here.

> On the other hand, macro definitions with casts are easily
> readable, have no impact on performance, and provide limited to
> no effect on the code quality as long as proper coding, testing
> and review process is implemented.

Yes, they are the least evil, or even the only way, so that makes sense.

>
> Summing the above, certainly removing the "ngx_" prefix is a bad
> idea.

It can make sense, especially considering that nginx has always had it.

> Removing casts from the particular macro might be
> considered, though in general it looks like a quest to prove these
> are not needed (or no longer needed) given the wide range of
> platforms nginx supports.

I don't know how far that support goes. Maybe Windows XP? Is that range even
documented?

However, as far as the quest, I think the rationale above might be enough to
prove it even without investigating specific systems. It could certainly be
possible that some system had a very brain-damaged <string.h> where memcmp(3)
used a 'char *' parameter and all other mem...(3) functions used 'void *', but I
don't think that's likely at all.

>
> Hope this helps.

For sure! Thank you very much :)

Cheers,

Alex

--
http://www.alejandro-colomar.es/
_______________________________________________
nginx-devel mailing list -- nginx-devel@nginx.org
To unsubscribe send an email to nginx-devel-leave@nginx.org
Subject Author Views Posted

[PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3)

Alejandro Colomar 518 November 04, 2022 11:26AM

Re: [PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3)

Alejandro Colomar 119 November 04, 2022 11:30AM

Re: [PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3)

Maxim Dounin 120 November 04, 2022 10:40PM

Re: [PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3)

Alejandro Colomar 107 November 06, 2022 05:52PM

Re: [PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3)

Maxim Dounin 105 November 08, 2022 04:52AM

Re: [PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3)

Alejandro Colomar 149 November 08, 2022 05:50AM

[PATCH v2] Removed the casts within ngx_memcmp()

Alejandro Colomar 98 November 08, 2022 05:58AM

Re: [PATCH v2] Removed the casts within ngx_memcmp()

Maxim Dounin 98 November 09, 2022 10:04AM

Re: [PATCH v2] Removed the casts within ngx_memcmp()

Maxim Dounin 116 November 29, 2022 11:24PM

Re: [PATCH v2] Removed the casts within ngx_memcmp()

Alex Colomar 86 November 30, 2022 05:50AM

Re: [PATCH v2] Removed the casts within ngx_memcmp()

Maxim Dounin 134 November 30, 2022 08:22PM

Re: [PATCH v2] Removed the casts within ngx_memcmp()

Sergey Kandaurov 114 November 30, 2022 07:18AM

Re: [PATCH v2] Removed the casts within ngx_memcmp()

Maxim Dounin 90 November 30, 2022 08:08PM

u_char vs char (was: [PATCH] Removed the unsafe ngx_memcmp() wrapper for memcmp(3))

Alejandro Colomar 101 November 08, 2022 06:18AM



Sorry, you do not have permission to post/reply in this forum.

Online Users

Guests: 262
Record Number of Users: 8 on April 13, 2023
Record Number of Guests: 421 on December 02, 2018
Powered by nginx      Powered by FreeBSD      PHP Powered      Powered by MariaDB      ipv6 ready