Mantis - Quercus
Viewing Issue Advanced Details
5928 block always 07-27-15 23:12 07-30-15 06:43
weich  
 
normal  
new 4.0.36  
open  
none    
none  
0005928: serialize/unserialize of a unicode (CJK) string sometimes yields unknown characters.
I have an application running on an nginx/php5-fpm server, where it displays Chinese/Japanese characters correctly. But after I switched to a resin4.0.44 server, some pages display "????" in places.

After a long time troubleshooting, I found that the results generated by the php5 functions serialize/unserialize differ from those generated by their counterparts in resin4.
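
For reference, PHP's serialize writes a string as s:<length>:"<value>"; and the length counts bytes, not characters. The sketch below is only an illustration in Java: phpSerializeString is a hypothetical helper (not part of Resin or Quercus), and it assumes the string is encoded as UTF-8, which is how the length prefix would come out under php5 for UTF-8 input.

```java
import java.nio.charset.StandardCharsets;

public class PhpStringSerializeSketch {
    // Hypothetical helper: builds the PHP serialized form of one string,
    // assuming the value is stored as UTF-8 bytes.
    static String phpSerializeString(String value) {
        // The length prefix is the number of encoded bytes, not s.length().
        int byteLength = value.getBytes(StandardCharsets.UTF_8).length;
        return "s:" + byteLength + ":\"" + value + "\";";
    }

    public static void main(String[] args) {
        // ASCII: one byte per character.
        System.out.println(phpSerializeString("abc"));
        // Two CJK characters (U+66F4 U+65B0): three UTF-8 bytes each.
        System.out.println(phpSerializeString("\u66f4\u65b0"));
    }
}
```

If the length prefix is computed from the character count instead, the prefix and the payload no longer agree for CJK text, and unserialize cannot read the string back correctly.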

I guess the bug is in the class StringBuilderValue, which gets the wrong byte length when it makes a copy of a string. But I need your confirmation, since you are the experts.

Is this correct?
File: StringBuilderValue.java

  public StringBuilderValue(String s)
  {
    int len = s.length();

    _buffer = new byte[len];
    _length = len;

    for (int i = 0; i < len; i++) {
      _buffer[i] = (byte) s.charAt(i);
    }
  }
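
To make the suspicion concrete, here is a minimal standalone sketch (not Quercus code; the class and method names are made up for illustration) that mirrors the loop in the constructor quoted above. Casting a char to a byte keeps only the low 8 bits, so any character above U+00FF loses information, and the buffer gets one byte per char instead of the encoded byte count. UTF-8 is assumed for the comparison.

```java
import java.nio.charset.StandardCharsets;

public class NarrowingCastDemo {
    // Same logic as the quoted constructor: one byte per UTF-16 char.
    static byte[] copyByCast(String s) {
        int len = s.length();
        byte[] buffer = new byte[len];
        for (int i = 0; i < len; i++) {
            buffer[i] = (byte) s.charAt(i); // keeps only the low 8 bits
        }
        return buffer;
    }

    public static void main(String[] args) {
        String s = "\u66f4\u65b0"; // two CJK characters, U+66F4 and U+65B0
        byte[] cast = copyByCast(s);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);

        System.out.println("chars:       " + s.length());
        System.out.println("cast bytes:  " + cast.length);
        System.out.println("UTF-8 bytes: " + utf8.length);
        // U+66F4 is reduced to its low byte 0xF4; the high byte 0x66 is lost.
        System.out.printf("first cast byte: 0x%02X%n", cast[0] & 0xFF);
    }
}
```

For pure ASCII input the two copies are identical, which would explain why the problem only shows up "sometimes".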


Shouldn't it use "s.getBytes().length", as in the following lines:

  public StringBuilderValue(String s) {
    byte[] bytes = s.getBytes();
    int len = bytes.length;

    _buffer = new byte[len];
    _length = len;

    System.arraycopy(bytes, 0, _buffer, 0, len);
  }
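
One caveat with the proposal above: s.getBytes() without an argument uses the JVM's default charset, so the result can differ from machine to machine. Below is a minimal sketch of the same idea with the charset named explicitly. It assumes UTF-8 is the encoding the application wants (that is an assumption, the encoding Quercus is configured for may be different), and Utf8BufferSketch is a hypothetical stand-in class, not the real StringBuilderValue.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8BufferSketch {
    private final byte[] _buffer;
    private final int _length;

    // Hypothetical constructor: sizes the buffer by encoded bytes, not chars.
    public Utf8BufferSketch(String s) {
        // Assumption: UTF-8 is the desired encoding.
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        _buffer = bytes; // getBytes returns a fresh array, so no extra copy is needed
        _length = bytes.length;
    }

    public int length() {
        return _length;
    }

    public byte[] toBytes() {
        return Arrays.copyOf(_buffer, _length);
    }

    public static void main(String[] args) {
        Utf8BufferSketch v = new Utf8BufferSketch("\u66f4\u65b0");
        System.out.println(v.length()); // byte count, not char count
        // Decoding with the same charset gives back the original string.
        System.out.println(new String(v.toBytes(), StandardCharsets.UTF_8).equals("\u66f4\u65b0"));
    }
}
```
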
 api.php (41 bytes) 07-30-15 06:29
 test.php (1,247 bytes) 07-30-15 06:41

Notes
(0006645)
weich   
07-30-15 06:43   
Steps to reproduce:
Open test.php in a browser and click "更新" ("Update"), and you will see the serialized string at the end of the page.