Bug in ngx_utf8_decode?
August 18, 2013 04:08AM
In the source file ngx_string.c, the function ngx_utf8_decode contains the following piece of code:

} else if (u >= 0xc2) {
    // leading byte = 110xxxxx
    u &= 0x1f;
    valid = 0x7f;
    len = 1;
} else

As far as I know, the leading byte of a UTF-8 sequence must match one of the following patterns:
one-byte encoding : leading byte = 0xxxxxxx
two-byte encoding : leading byte = 110xxxxx
three-byte encoding : leading byte = 1110xxxx
four-byte encoding : leading byte = 11110xxx
So shouldn't the condition u >= 0xc2 be u >= 0xc0?

Is this a bug?
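
To make the question concrete, here is a minimal sketch of the leading-byte patterns listed above, plus the arithmetic for a two-byte sequence. Both functions (utf8_len_from_lead and utf8_decode2) are hypothetical helpers written for this post, not nginx code; the second one assumes its arguments already match 110xxxxx and 10xxxxxx and does not check them.

```c
#include <stdint.h>

/* Hypothetical helper, not part of nginx: number of bytes in a UTF-8
 * sequence as implied by the bit pattern of its leading byte, or 0 if
 * the byte matches none of the four patterns. */
int
utf8_len_from_lead(uint8_t b)
{
    if ((b & 0x80) == 0x00) return 1;   /* 0xxxxxxx */
    if ((b & 0xe0) == 0xc0) return 2;   /* 110xxxxx */
    if ((b & 0xf0) == 0xe0) return 3;   /* 1110xxxx */
    if ((b & 0xf8) == 0xf0) return 4;   /* 11110xxx */
    return 0;                           /* continuation byte or invalid */
}

/* Hypothetical helper: value carried by a two-byte sequence, assuming
 * lead matches 110xxxxx and cont matches 10xxxxxx (not checked here). */
uint32_t
utf8_decode2(uint8_t lead, uint8_t cont)
{
    /* 5 payload bits from the leading byte, 6 from the continuation byte */
    return ((uint32_t) (lead & 0x1f) << 6) | (uint32_t) (cont & 0x3f);
}
```

By the bit pattern alone, 0xc0 and 0xc1 both count as two-byte leading bytes. The largest value utf8_decode2 can produce with them is utf8_decode2(0xc1, 0xbf) = 0x7f, while utf8_decode2(0xc2, 0x80) = 0x80, so every sequence starting with 0xc0 or 0xc1 yields a value no larger than the valid = 0x7f assigned in the snippet.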



Edited 1 time(s). Last edit at 08/18/2013 04:11AM by zengkui.