[njs] Fixed RegExp.prototype.exec() with global regexp and unicode input.

Dmitry Volyntsev
October 17, 2023 08:54PM
details: https://hg.nginx.org/njs/rev/c0ff44d66ffb
branches:
changeset: 2221:c0ff44d66ffb
user: Dmitry Volyntsev <xeioex@nginx.com>
date: Tue Oct 17 17:51:39 2023 -0700
description:
Fixed RegExp.prototype.exec() with global regexp and unicode input.

Previously, when a Unicode string of exactly 32 characters was provided
and the "lastIndex" value of "this" regexp was also equal to 32,
njs_string_utf8_offset() was called with an invalid index argument (not
less than the length of the string). As a result, njs_string_utf8_offset()
returned garbage values.
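
To illustrate the index handling described above, here is a minimal
JavaScript sketch of converting a character index into a UTF-8 byte offset
with the same kind of bounds check this changeset adds. The helper name
utf8ByteOffset is hypothetical and is not part of njs; the real
njs_string_utf8_offset() is written in C and works differently internally.
The sketch assumes a runtime that provides TextEncoder.

```javascript
// Hypothetical helper (not an njs API): maps a character index to the
// corresponding byte offset in the UTF-8 encoding of the string.
function utf8ByteOffset(str, charIndex) {
    const encoder = new TextEncoder();
    const chars = Array.from(str);      // split by code point

    if (charIndex >= chars.length) {
        // Index at or past the end: answer with the total byte size
        // instead of scanning beyond the string.
        return encoder.encode(str).length;
    }

    // Otherwise count the bytes of every character before charIndex.
    return encoder.encode(chars.slice(0, charIndex).join('')).length;
}

// Input from the unit test: 30 two-byte characters followed by "aa".
const input = 'α'.repeat(30) + 'aa';    // 32 characters, 62 bytes

console.log(utf8ByteOffset(input, 30)); // 60
console.log(utf8ByteOffset(input, 32)); // 62
```

In JavaScript the slice() call would clamp an oversized index on its own;
the explicit branch is kept only to mirror the shape of the C change.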

The problem manifested in the following ways:
1) InternalError: pcre2_match() failed: bad offset value

2) Very slow replace calls with global regexps, for
example in expressions like: str.replace(/<re>/g).
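
The state that triggered the bug arises in ordinary use: replace() with a
global regexp keeps matching until no match is found, so the last attempt
starts with "lastIndex" equal to the string length. The following sketch,
assuming a spec-conforming engine, records "lastIndex" before each exec()
call on the 32-character input used in the unit test.

```javascript
// 30 non-ASCII characters followed by "aa": 32 characters in total.
const input = 'α'.repeat(30) + 'aa';
const re = /a/g;

// Record lastIndex before every exec() call until exec() returns null.
const lastIndexBeforeEachCall = [];
let match;

do {
    lastIndexBeforeEachCall.push(re.lastIndex);
    match = re.exec(input);
} while (match !== null);

// The final, failing attempt starts at lastIndex === input.length.
console.log(lastIndexBeforeEachCall);   // [ 0, 31, 32 ]

// Expected result of the replace call from the unit test.
const replaced = input.replace(/a/g, '#');
console.log(replaced === 'α'.repeat(30) + '##');   // true
```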

This fixes #677 on GitHub.

diffstat:

src/njs_regexp.c | 11 ++++++++---
src/test/njs_unit_test.c | 6 ++++++
2 files changed, 14 insertions(+), 3 deletions(-)

diffs (37 lines):

diff -r 714fae197d83 -r c0ff44d66ffb src/njs_regexp.c
--- a/src/njs_regexp.c Mon Oct 16 18:09:37 2023 -0700
+++ b/src/njs_regexp.c Tue Oct 17 17:51:39 2023 -0700
@@ -936,9 +936,14 @@ njs_regexp_builtin_exec(njs_vm_t *vm, nj
         offset = last_index;

     } else {
-        offset = njs_string_utf8_offset(string.start,
-                                        string.start + string.size, last_index)
-                 - string.start;
+        if ((size_t) last_index < string.length) {
+            offset = njs_string_utf8_offset(string.start,
+                                            string.start + string.size,
+                                            last_index)
+                     - string.start;
+        } else {
+            offset = string.size;
+        }
     }

     ret = njs_regexp_match(vm, &pattern->regex[type], string.start, offset,
diff -r 714fae197d83 -r c0ff44d66ffb src/test/njs_unit_test.c
--- a/src/test/njs_unit_test.c Mon Oct 16 18:09:37 2023 -0700
+++ b/src/test/njs_unit_test.c Tue Oct 17 17:51:39 2023 -0700
@@ -9261,6 +9261,12 @@ static njs_unit_test_t njs_test[] =
     { njs_str("'abc'.replaceAll(/^/g, '|$&|')"),
       njs_str("||abc") },

+    { njs_str("('α'.repeat(30) + 'aa').replace(/a/g, '#')"),
+      njs_str("αααααααααααααααααααααααααααααα##") },
+
+    { njs_str("('α'.repeat(30) + 'aa').replaceAll(/a/g, '#')"),
+      njs_str("αααααααααααααααααααααααααααααα##") },
+
     { njs_str("var uri ='/u/v1/Aa/bB?type=m3u8&mt=42';"
               "uri.replace(/^\\/u\\/v1\\/[^/]*\\/([^\?]*)\\?.*(mt=[^&]*).*$/, '$1|$2')"),
       njs_str("bB|mt=42") },
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel