Welcome! Log In Create A New Profile

Advanced

[PATCH] Escape strings produced by the `%V` format specification

Anonymous User
January 03, 2017 01:36AM
# HG changeset patch
# User Joshua Spence <josh@joshuaspence.com>
# Date 1483420178 -39600
# Tue Jan 03 16:09:38 2017 +1100
# Node ID 8e734d01f026c4346b46006ee9b558031439be61
# Parent a42afc225e98afe1f7c3b428c81c8af7eba362c0
Escape strings produced by the `%V` format specification

Currently nginx error logs loosely adhere to the following pattern:

```
%{TIMESTAMP} [%{SEVERITY}] %{PROCESS_ID}#%{THREAD_ID}:( *%{CONNECTION_ID})?%{MESSAGE}(, %{KEY}: "%{VALUE}")*
```

In our environment we are using Filebeat to ship nginx error logs into Logstash. Our Logstash filter for parsing nginx error logs is as follows:

```
filter {
grok {
match => {
"message" => "(?<timestamp>%{YEAR}/%{MONTHNUM2}/%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:process_id:int}#%{NONNEGINT:thread_id:int}:(?: \*%{NONNEGINT:connection_id:int})? (?<message>[^,]*)(?:, %{GREEDYDATA:nginx_keyvalue_data})?"
}
overwrite => ["message"]
}

date {
match => ["timestamp", "yyyy/MM/dd HH:mm:ss"]
remove_field => ["timestamp"]
}

kv {
field_split => ", "
remove_field => "nginx_keyvalue_data"
source => "nginx_keyvalue_data"
value_split => ": "
}
}
```

The problem, however, is that it is difficult to unambigiously parse the key-value data from the nginx error log format because of a lack of data sanitization. Specifically, consider the following entry from our nginx error logs:

```
2016/11/15 03:45:51 [error] 23770#23770: *21535000 open() "/REDACTED/+"BauGutachter"++41+2016+.ch&tbo=1" failed (2: No such file or directory), client: REDACTED, server: REDACTED, request: "GET /job/+%22BauGutachter%22++41+2016+.ch&tbo=1 HTTP/1.1", host: "REDACTED", referrer: "https://www.google.com.au/search?hl=en&q="BauGutachter" +41 2016 .ch&num=100&start=0"
```

Specifically, the problem in the above sample is the `referrer` field, the value of which contains double quotes. I think that it makes sense to escape `\` and `"` characters before they are written to the error log.

diff -r a42afc225e98 -r 8e734d01f026 src/core/ngx_string.c
--- a/src/core/ngx_string.c Tue Dec 27 17:23:08 2016 +0300
+++ b/src/core/ngx_string.c Tue Jan 03 16:09:38 2017 +1100
@@ -15,7 +15,7 @@
const u_char *basis, ngx_uint_t padding);
static ngx_int_t ngx_decode_base64_internal(ngx_str_t *dst, ngx_str_t *src,
const u_char *basis);
-
+static uintptr_t ngx_escape_string(u_char *dst, u_char *src, size_t size);

void
ngx_strlow(u_char *dst, u_char *src, size_t n)
@@ -235,7 +235,7 @@
v = va_arg(args, ngx_str_t *);

len = ngx_min(((size_t) (last - buf)), v->len);
- buf = ngx_cpymem(buf, v->data, len);
+ buf = (u_char *) ngx_escape_string(buf, v->data, len);
fmt++;

continue;
@@ -1844,6 +1844,43 @@
}


+static uintptr_t
+ngx_escape_string(u_char *dst, u_char *src, size_t size)
+{
+ u_char ch;
+ ngx_uint_t len;
+
+ if (dst == NULL) {
+ len = 0;
+
+ while (size) {
+ ch = *src++;
+
+ if (ch == '\\' || ch == '"') {
+ len++;
+ }
+
+ size--;
+ }
+
+ return (uintptr_t) len;
+ }
+
+ while (size) {
+ ch = *src++;
+
+ if (ch == '\\' || ch == '"') {
+ *dst++ = '\\';
+ }
+
+ *dst++ = ch;
+ size--;
+ }
+
+ return (uintptr_t) dst;
+}
+
+
void
ngx_str_rbtree_insert_value(ngx_rbtree_node_t *temp,
ngx_rbtree_node_t *node, ngx_rbtree_node_t *sentinel)
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel
Subject Author Views Posted

[PATCH] Escape strings produced by the `%V` format specification

Anonymous User 1233 January 03, 2017 01:36AM

Re: [PATCH] Escape strings produced by the `%V` format specification

Joshua Spence 361 January 03, 2017 01:40AM

Re: [PATCH] Escape strings produced by the `%V` format specification

Maxim Dounin 389 January 04, 2017 07:20PM



Sorry, you do not have permission to post/reply in this forum.

Online Users

Guests: 229
Record Number of Users: 8 on April 13, 2023
Record Number of Guests: 421 on December 02, 2018
Powered by nginx      Powered by FreeBSD      PHP Powered      Powered by MariaDB      ipv6 ready