Mantis - Quercus
Viewing Issue Advanced Details
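ID: 3296
Severity: minor
Reproducibility: always
Date Submitted: 01-24-09 16:36
Last Update: 01-26-09 12:14
Reporter: koreth
Assigned To: nam
Priority: normal
Status: closed
Product Version: 4.0.0
Resolution: fixed
Projection: none
ETA: none
Fixed in Version: 4.0.0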
0003296: preg_replace throws exception on malformed UTF-8 when "u" modifier is used
<?php
$bad_utf8 = "abc\xf0";
print strlen(preg_replace("/[^\pL]/u", "", $bad_utf8));

Regular PHP prints "0": preg_replace rejects the malformed subject and returns NULL, and strlen treats NULL as an empty string. Quercus instead throws an exception:

com.caucho.quercus.QuercusRuntimeException: bad UTF-8 sequence, saw EOF
    at com.caucho.quercus.lib.regexp.Regexp.fromUtf8(Regexp.java:267)
    at com.caucho.quercus.lib.regexp.Regexp.convertSubject(Regexp.java:182)
    at com.caucho.quercus.lib.regexp.RegexpState.<init>(RegexpState.java:79)
    at com.caucho.quercus.lib.regexp.CauchoRegexpModule.pregReplaceString(CauchoRegexpModule.java:769)
    at com.caucho.quercus.lib.regexp.CauchoRegexpModule.pregReplace(CauchoRegexpModule.java:678)
    at com.caucho.quercus.lib.regexp.CauchoRegexpModule.preg_replace(CauchoRegexpModule.java:614)
    at com.caucho.quercus.lib.regexp.RegexpModule.preg_replace(RegexpModule.java:175)

The workaround is to sanitize the input (strip or replace malformed UTF-8 sequences) before calling preg_replace.
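
One possible way to do that sanitizing is sketched below. It assumes that dropping the offending bytes is acceptable, and it has not been verified on Quercus itself. strip_invalid_utf8 is a hypothetical helper name, not part of PHP or Quercus. It uses a byte-oriented pattern (no "u" modifier), so the UTF-8 conversion that throws in the stack trace above should not be involved.

```php
<?php
// Hypothetical helper (an assumption for illustration, not a PHP or Quercus
// built-in): keeps only well-formed UTF-8 sequences and drops any other byte.
function strip_invalid_utf8($s)
{
    // Byte-level pattern, deliberately without the "u" modifier.
    // Each alternative matches exactly one well-formed UTF-8 sequence.
    $valid = '/[\x00-\x7F]'                        // ASCII
           . '|[\xC2-\xDF][\x80-\xBF]'             // 2-byte sequences
           . '|\xE0[\xA0-\xBF][\x80-\xBF]'         // 3-byte, no overlongs
           . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'  // 3-byte, general
           . '|\xED[\x80-\x9F][\x80-\xBF]'         // 3-byte, no surrogates
           . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'      // 4-byte, no overlongs
           . '|[\xF1-\xF3][\x80-\xBF]{3}'          // 4-byte, general
           . '|\xF4[\x80-\x8F][\x80-\xBF]{2}/';    // 4-byte, up to U+10FFFF

    // Collect every valid sequence; bytes that match nothing are skipped.
    preg_match_all($valid, $s, $matches);
    return implode('', $matches[0]);
}

$bad_utf8 = "abc\xf0";
$clean = strip_invalid_utf8($bad_utf8);   // the trailing \xf0 is dropped
print strlen(preg_replace("/[^\pL]/u", "", $clean));
```

With the sanitized input this prints "3" rather than "0", because the three letters survive. Code that relies on regular PHP's result for malformed input would need to check for the bad bytes and handle that case explicitly instead.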

Notes
(0003779) nam 01-26-09 12:14
php/153k