nginx modules and multiple escape_uri / unescape_uri definitions

Markus Linnala
November 07, 2011 02:36PM
I ran into some problems with URI encoding. The problem is that there
are multiple, differing implementations of URI escaping and
unescaping, and that libraries in different programming languages
encode in different ways.



unescape_uri:

https://github.com/phusion/nginx/blob/master/src/core/ngx_string.c#L1336

The two module implementations linked below seem to be identical to
each other. They differ from the core one (linked above) by unescaping
'+' to ' '. I guess nginx conforms to RFC 3986, while the external
modules try to be compatible with other programs such as PHP, .NET
and Java.

https://github.com/agentzh/set-misc-nginx-module/blob/master/src/ngx_http_set_unescape_uri.c#L46

https://github.com/chaoslawful/lua-nginx-module/blob/master/src/ngx_http_lua_util.c#L1328

PHP encodes ' ' to '+' with urlencode
http://php.net/manual/en/function.urlencode.php

.NET Framework 4 encodes ' ' to '+' with HttpUtility.UrlEncode
http://msdn.microsoft.com/en-us/library/4fkewx0t.aspx

Java (at least versions 5-7) encodes ' ' to '+'
http://download.oracle.com/javase/7/docs/api/java/net/URLEncoder.html
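
To make the difference concrete, here is a minimal standalone sketch of
the two decoding behaviours. It is not the nginx or module code: the
names sketch_unescape and SKETCH_UNESCAPE_FORM_URL are made up for this
example, and copying a malformed '%' through unchanged is my own
assumption.

```c
#include <stddef.h>

/* Hypothetical flag for this sketch only (not an nginx constant). */
#define SKETCH_UNESCAPE_FORM_URL 4

/* Return the value of a hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode the NUL-terminated string src into dst and return the decoded
 * length. dst must hold at least strlen(src) + 1 bytes.
 * "%XX" is always decoded; '+' becomes ' ' only when the form flag is set.
 */
static size_t sketch_unescape(char *dst, const char *src, unsigned type)
{
    char *d = dst;

    while (*src) {
        unsigned char ch = (unsigned char) *src;

        if (ch == '%') {
            int hi = hex_value((unsigned char) src[1]);
            /* Only look at src[2] if src[1] was a valid hex digit. */
            int lo = (hi >= 0) ? hex_value((unsigned char) src[2]) : -1;

            if (hi >= 0 && lo >= 0) {
                *d++ = (char) (hi * 16 + lo);
                src += 3;
                continue;
            }
            /* Assumption: a malformed escape is copied through as-is. */
        }

        if (ch == '+' && (type & SKETCH_UNESCAPE_FORM_URL)) {
            *d++ = ' ';
            src++;
            continue;
        }

        *d++ = (char) ch;
        src++;
    }

    *d = '\0';
    return (size_t) (d - dst);
}
```

With type 0 the input "a+b%20c" decodes to "a+b c"; with
SKETCH_UNESCAPE_FORM_URL it decodes to "a b c".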

There is a way to consolidate unescape_uri: add a new type to the core
function, then add version checks to the modules so that they use the
core version with the proper type, and extend the modules to handle
the different types. A patch for nginx is attached:
0001-application-x-www-form-urlencoded-compatible-mode.patch




escape_uri:

I guess there was a need for different implementations, but it might
be possible to consolidate the external modules after this change:

http://trac.nginx.org/nginx/changeset/4193/nginx

https://github.com/phusion/nginx/blob/master/src/core/ngx_string.c#L1505

The two module implementations linked below seem to be the same as
each other. They differ somewhat from core. The core version of
uri_component is almost the same as uri in the modules (the
difference is !$*(),@`). The args type also differs slightly (;&).

https://github.com/agentzh/set-misc-nginx-module/blob/master/src/ngx_http_set_escape_uri.c#L57

https://github.com/chaoslawful/lua-nginx-module/blob/master/src/ngx_http_lua_util.c#L1179

Would it be possible for the set-misc and lua modules to use the nginx
core version of uri_component and args?


The memc module implementation linked below is almost the same as the
nginx core version of uri_component. There are a couple of differences
( *~), and the hex digits are uppercase. The commit message hints that
the new encoding was needed for Java.

https://github.com/yaoweibin/memc-nginx-module/blob/master/src/ngx_http_memc_request.c#L8

I guess this serves a special need and does not need to be considered
further.
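
The escape_uri variants above mostly differ in two things: which
characters are left unescaped, and the case of the hex digits. A rough
standalone sketch of an escaper with both as parameters follows. The
name sketch_escape, the extra_safe parameter and the baseline
"unreserved" set (RFC 3986 letters, digits and -._~) are assumptions
for illustration only and do not reproduce the actual tables in nginx
or in the modules.

```c
#include <stddef.h>
#include <string.h>

/* RFC 3986 "unreserved" characters: ALPHA / DIGIT / "-" / "." / "_" / "~". */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
           || (c >= '0' && c <= '9')
           || c == '-' || c == '.' || c == '_' || c == '~';
}

/*
 * Percent-encode the NUL-terminated string src into dst and return the
 * encoded length. dst must hold at least 3 * strlen(src) + 1 bytes.
 * extra_safe lists additional characters to pass through unescaped
 * (hypothetical stand-in for the per-type differences);
 * upper_hex selects "%3B" style over "%3b" style.
 */
static size_t sketch_escape(char *dst, const char *src,
                            const char *extra_safe, int upper_hex)
{
    static const char hex_lower[] = "0123456789abcdef";
    static const char hex_upper[] = "0123456789ABCDEF";
    const char *hex = upper_hex ? hex_upper : hex_lower;
    char *d = dst;

    for (; *src; src++) {
        unsigned char c = (unsigned char) *src;

        if (is_unreserved(c) || strchr(extra_safe, c) != NULL) {
            *d++ = (char) c;       /* safe: copy unchanged */
            continue;
        }

        *d++ = '%';                /* unsafe: emit %XX */
        *d++ = hex[c >> 4];
        *d++ = hex[c & 0x0f];
    }

    *d = '\0';
    return (size_t) (d - dst);
}
```

For example, escaping "a b&c" with an empty extra_safe gives
"a%20b%26c", while passing "&" as extra_safe gives "a%20b&c". If the
modules could express their needs as such parameters (or as core
types), the separate copies might not be needed.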

--
Markus Linnala, Chief Systems Architect
Cybercom Finland
Pakkahuoneenaukio 2 A; 33100 Tampere
Mobile +358 40 5919 735
Markus.Linnala@cybercom.com

www.cybercom.fi | www.cybercom.com

From ca8aab7ac68c0d58ab7e7ac736cf6d7d21e80b67 Mon Sep 17 00:00:00 2001
From: Markus Linnala <Markus.Linnala@cybercom.com>
Date: Mon, 7 Nov 2011 20:49:15 +0200
Subject: [PATCH] application/x-www-form-urlencoded compatible mode

Unescape '+' to ' '. Needed, if encoding is application/x-www-form-urlencoded compatible.
---
src/core/ngx_string.c | 7 +++++++
src/core/ngx_string.h | 1 +
2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/src/core/ngx_string.c b/src/core/ngx_string.c
index 29f8e0d..39add87 100644
--- a/src/core/ngx_string.c
+++ b/src/core/ngx_string.c
@@ -1536,6 +1536,13 @@ ngx_unescape_uri(u_char **dst, u_char **src, size_t size, ngx_uint_t type)
break;
}

+ if (ch == '+'
+ && (type & (NGX_UNESCAPE_FORM_URL)))
+ {
+ *d++ = ' ';
+ break;
+ }
+
*d++ = ch;
break;

diff --git a/src/core/ngx_string.h b/src/core/ngx_string.h
index 2b9c59a..6158c40 100644
--- a/src/core/ngx_string.h
+++ b/src/core/ngx_string.h
@@ -199,6 +199,7 @@ u_char *ngx_utf8_cpystrn(u_char *dst, u_char *src, size_t n, size_t len);

#define NGX_UNESCAPE_URI 1
#define NGX_UNESCAPE_REDIRECT 2
+#define NGX_UNESCAPE_FORM_URL 4

uintptr_t ngx_escape_uri(u_char *dst, u_char *src, size_t size,
ngx_uint_t type);
--
1.7.6.4


_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel