Mantis - Quercus
1193 minor always 06-12-06 12:18 06-28-06 19:37
artem  
ferg  
urgent  
closed 3.0.20  
fixed  
none    
none 3.0.20  
0001193: odd behavior in UTF-8 encoding/decoding
http://localhost:8080/test.php?i=%D0%90

%D0%90 is the UTF-8 encoding of the Russian capital letter A (&#1040;, U+0410).

It shows the following text in the browser:

&#144;
2

When I check the HTML source code, it shows:

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
&#144;&#144;
2


That is 4 bytes that encode 2 characters.

test.php - source code

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<?

echo $_GET['i'];
echo "
";
echo strlen($_GET['i']);
echo "
";

?>
%D0%90 is ASCII &#144;

So PHP parses the URL as ASCII and returns it as UTF-8.

This is incorrect, because a form on a UTF-8 encoded page will post UTF-8 encoded URLs.

PHP must parse the URL and return the page in the same encoding.
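
The mismatch can be illustrated with a minimal sketch in plain PHP. It assumes the engine decodes the percent-encoded bytes as ISO-8859-1 and then writes the resulting characters out as UTF-8; that is a guess consistent with the output above, not something confirmed from the Quercus source. It also assumes the mbstring extension is available, and the variable names are only illustrative.

```php
<?php
// Sketch only: simulates the suspected double encoding.
// Assumption: query bytes are read as ISO-8859-1, then emitted as UTF-8.

// The two raw bytes sent by the browser: 0xD0 0x90 (UTF-8 for U+0410).
$raw = rawurldecode('%D0%90');

// Treat each byte as one ISO-8859-1 character and re-encode it as UTF-8.
$doubled = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

echo bin2hex($raw), "\n";                  // d090
echo bin2hex($doubled), "\n";              // c390c290
echo strlen($doubled), "\n";               // 4 (bytes)
echo mb_strlen($doubled, 'UTF-8'), "\n";   // 2 (characters)

// What the page should see instead: one character when read as UTF-8.
echo mb_strlen($raw, 'UTF-8'), "\n";       // 1
```

If this assumption holds, it would explain why the response carries 4 bytes for 2 characters, with the byte 0x90 (&#144;) appearing twice, instead of the single 2-byte character that was sent.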


This is the default behavior of the snapshot from 02/06/2006.

Notes
(0001352)
ferg   
06-28-06 19:37   
php/0802