Mantis - Quercus
ID: 5928 | Severity: block | Reproducibility: always | Date Submitted: 07-27-15 23:12 | Last Update: 07-30-15 06:43
Status: new | Product Version: 4.0.36
0005928: serialize/unserialize of a Unicode (CJK) string sometimes yields unknown characters.
I have an application running on an nginx/php5-fpm server, where it displays Chinese/Japanese characters correctly. After I switched to a resin4.0.44 server, some pages display "????" in places.

After lengthy troubleshooting, I found differences between the results generated by the php5 functions serialize/unserialize and those generated by their counterparts in resin4.

I suspect the bug is in the class StringBuilderValue, which gets the wrong byte length when it makes a copy of a string. I would appreciate your confirmation, since you are the experts.
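
To make the suspected length mismatch concrete, here is a minimal standalone sketch. It is not Resin code: the class and method names are hypothetical, and it assumes the PHP strings are UTF-8 encoded. PHP 5 writes the byte length of a string into serialize() output, while Java's String.length() counts UTF-16 chars, so the two numbers differ for CJK text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Resin/Quercus.
public class SerializedLengthDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int charLength(String s) {
        return s.length();
    }

    // Number of bytes in the UTF-8 encoding (assumed encoding of the PHP string).
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Builds a PHP-style serialized string token: s:<byte length>:"<value>";
    static String phpSerialize(String s) {
        return "s:" + utf8ByteLength(s) + ":\"" + s + "\";";
    }

    public static void main(String[] args) {
        String s = "\u66f4\u65b0"; // two CJK characters (U+66F4 U+65B0)
        System.out.println(charLength(s));     // 2 chars
        System.out.println(utf8ByteLength(s)); // 6 bytes
        System.out.println(phpSerialize(s));
    }
}
```

If a buffer is sized from the char count (2) while the serialized form expects the byte count (6), the length prefix and the payload no longer agree, which would be consistent with the "????" output described above.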

Is the following constructor correct?
file :

  public StringBuilderValue(String s) {
    int len = s.length();                // number of UTF-16 chars, not bytes

    _buffer = new byte[len];
    _length = len;

    for (int i = 0; i < len; i++) {
      _buffer[i] = (byte) s.charAt(i);   // narrowing cast keeps only the low 8 bits
    }
  }

Shouldn't it use "s.getBytes().length" instead, as in the following lines?

  public StringBuilderValue(String s) {
    byte[] bytes = s.getBytes();         // encodes using the default charset
    int len = bytes.length;              // encoded byte count

    _buffer = new byte[len];
    _length = len;

    System.arraycopy(bytes, 0, _buffer, 0, len);
  }
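
The difference between the two constructors can be checked in isolation with the sketch below. It is a standalone illustration, not Resin code, and the class and method names are hypothetical. It also swaps the no-argument getBytes() for an explicit UTF-8 charset, which is an assumption on my part: getBytes() without arguments depends on the JVM's default charset, so its result can vary between machines.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Resin/Quercus.
public class NarrowingCastDemo {
    // Mirrors the loop in the existing constructor: one byte per char,
    // keeping only the low 8 bits of each UTF-16 code unit.
    static byte[] castEachChar(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        return out;
    }

    // Encodes with an explicit charset (UTF-8 is assumed here).
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "\u66f4\u65b0"; // two CJK characters (U+66F4 U+65B0)
        System.out.println(Arrays.toString(castEachChar(s))); // [-12, -80]
        System.out.println(Arrays.toString(encodeUtf8(s)));   // [-26, -101, -76, -26, -106, -80]
    }
}
```

The cast version drops the high byte of every char above U+00FF, so the original characters cannot be recovered from the buffer. Whether UTF-8 is the right target encoding inside Quercus depends on its configured script/runtime encoding, which this sketch does not know.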

Attached files:
 api.php (41 bytes) 07-30-15 06:29
 test.php (1,247 bytes) 07-30-15 06:41

07-30-15 06:43   
Steps to reproduce:
Open test.php in a browser and click "更新". You will see the serialized string at the end of the page.