On Wed, 20 Oct 2010 21:57:32 -0400, helen <nginx-forum@nginx.us> wrote:
> On Wed, 20 Oct 2010 21:23:46 -0400, Pierre-Marie Baty wrote:
>
>> When the URL is Latin-1 encoded, the request sent is : GET
>> /%e9t%e9-2008.jpg ----> nginx resolves this to "été-2008.jpg", the
>> file is served, OK
>> When the URL is UTF-8 encoded, the request sent is : GET
>> /%C3%A9t%C3%A9-2008.jpg ----> nginx resolves this to "été-2008.jpg",
>> and the file is not served. (file not found)
>
> I only spent about 5 minutes looking for this, so I could be totally
> wrong:
>
> In 0.8.53, src/http/ngx_http_parse.c:1220 appears to be the start of the
> relevant code. On a quick scan, it looks like the percent-decoding is
> hardcoded. (case sw_quoted, followed by case sw_quoted_second, inside a
> switch loop)

Sorry to reply to my own post, but it looks like I was wrong: that
code appears to be where the %xx escapes are decoded and nothing more
(duh). I am still following the chain to where the decoded path is
passed to the OS, and I don't have time to look further right now.
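
To make the %xx step concrete, here is a minimal sketch of a percent-decoder in C. This is not nginx's code: the names hex_value and percent_decode are made up for illustration, and the state names only borrow from the sw_quoted / sw_quoted_second cases mentioned above. It assumes the caller supplies an output buffer at least as large as the input plus one byte.

```c
/* Return the numeric value of a hex digit, or -1 if c is not one. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    c |= 0x20;  /* fold ASCII upper-case letters to lower case */
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    return -1;
}

/*
 * Hypothetical helper: decode %xx escapes from src into dst.
 * dst must have room for strlen(src) + 1 bytes.
 * Returns the number of decoded bytes, or -1 on a malformed escape.
 * The output is raw bytes; no character set is assumed or converted.
 */
long percent_decode(const char *src, unsigned char *dst)
{
    enum { sw_usual, sw_quoted, sw_quoted_second } state = sw_usual;
    unsigned char decoded = 0;
    long n = 0;
    int v;

    for (; *src != '\0'; src++) {
        unsigned char ch = (unsigned char) *src;

        switch (state) {
        case sw_usual:
            if (ch == '%') {
                state = sw_quoted;      /* expect first hex digit */
            } else {
                dst[n++] = ch;          /* ordinary byte, copy as-is */
            }
            break;

        case sw_quoted:
            v = hex_value(ch);
            if (v < 0) {
                return -1;
            }
            decoded = (unsigned char) (v << 4);   /* high nibble */
            state = sw_quoted_second;
            break;

        case sw_quoted_second:
            v = hex_value(ch);
            if (v < 0) {
                return -1;
            }
            dst[n++] = (unsigned char) (decoded | v);  /* low nibble */
            state = sw_usual;
            break;
        }
    }

    if (state != sw_usual) {
        return -1;  /* input ended in the middle of an escape */
    }
    dst[n] = '\0';
    return n;
}
```

With this sketch, "/%e9t%e9-2008.jpg" decodes to 13 bytes with a single 0xE9 for each "é", while "/%C3%A9t%C3%A9-2008.jpg" decodes to 15 bytes with the pair 0xC3 0xA9 for each "é". The decoder itself treats both the same way; the two requests simply end up as different byte sequences, so whether the file is found would depend on which byte sequence matches the name stored on disk.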
helen