Hello Maxim!
On 5/21/23 15:09, Maxim Dounin wrote:
> Hello!
>
> On Sun, May 21, 2023 at 11:31:32AM +0200, Alejandro Colomar wrote:
>
>> On 5/21/23 03:42, Maxim Dounin wrote:
>>> # HG changeset patch
>>> # User Maxim Dounin <mdounin@mdounin.ru>
>>> # Date 1684633125 -10800
>>> # Sun May 21 04:38:45 2023 +0300
>>> # Node ID 68fa4b86ed46138dd1a8fcf2cfd80206de068bec
>>> # Parent 235d482ef6bc8c40a956b2413865d42c94e0fc05
>>> QUIC: better sockaddr initialization.
>>>
>>> The qsock->sockaddr field is a ngx_sockaddr_t union, and therefore can hold
>>> any sockaddr (and union members, such qsock->sockaddr.sockaddr, can be used
>>> to access appropriate variant of the sockaddr). It is better to set it via
>>> qsock->sockaddr itself though, and not qsock->sockaddr.sockaddr, so static
>>> analyzers won't complain about out-of-bounds access.
>>
>> Correct. The previous code was UB, due to memcpy(3) writing to the
>> 'struct sockaddr' member. By writing to sockaddr, you were only
>> allowed to alias via other members the sa_family_t field, but no
>> others.
>
> Well, not really. There is no UB in the previous code, it simply
> uses a valid (void *) address to fill the sockaddr (and does so
> without breaking strict aliasing rules).
While the data being written was correctly written via memcpy(3),
you wouldn't be allowed to access it later as anything that is
not 'struct sockaddr'. For example, the following is a
strict-aliasing violation:
struct s { int a; int b; };
struct t { int a; };
union u { struct s s; struct t t; };
struct s x = {42, 42};
union u y;
int z;
memcpy(&y.t, &x, sizeof(x)); // This is fine
// We created an object of type 'struct t' in the union.
// Unions allow aliasing, so we're allowed to reinterpret
// that object as a 'struct s' via the other member.
z = y.s.a; // This is fine.
// But we're not allowed to reinterpret bytes that are
// officially uninitialized (even though we know they are
// initialized).
z = y.s.b; // UB here.
The reason for the UB is that the compiler is free to assume
that since you wrote to the struct t member, the write can't
possibly write to the second member of the struct (even if
the size passed to memcpy(3) is larger than that). In other
words, the compiler may assume that anything past
sizeof(struct t) is uninitialized. Also, writing past an
object is very dubious, even via memcpy(3), even if you know
that the storage is there (thanks to the union). It's just
safer writing to the union itself, or to the field that has
the correct object type.
Although the trailing flexible array in sockaddr might be
enough to signal that any amount of bytes may have been
written, since it's defined as:
struct sockaddr {
sa_family_t sa_family; /* Address family */
char sa_data[]; /* Socket address */
};
>
> But the code might confuse static analyzers, since they have no
> way to know that the whole ngx_sockaddr_t is being set, and not
> just the (struct sockaddr) union member which is referenced.
And in a similar way, it would be worrying (and possible) to
confuse an optimizing compiler. :)
I happened to get an internal compiler error earlier this
month with GCC 13's -fanalyzer and some dubious use of
flexible array members, with code that was bordering UB.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109802
Cheers,
Alex
--
http://www.alejandro-colomar.es/
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel