Mantis Bugtracker
  

Viewing Issue Simple Details
ID       Category   Severity  Reproducibility  Date Submitted  Last Update
0005308  [Quercus]  minor     always           12-09-12 18:17  01-11-13 14:51
Reporter     nam     View Status  public
Assigned To  nam
Priority     normal  Resolution   fixed
Status       closed  Product Version
Summary 0005308: QuercusScriptEngine needs to output unicode correctly
Description (rep by woodle)

http://forum.caucho.com/showthread.php?t=29234
Additional Information

import java.io.StringWriter;

import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import com.caucho.quercus.QuercusEngine;
import com.caucho.quercus.script.QuercusScriptEngine;
import com.caucho.quercus.script.QuercusScriptEngineFactory;


public class TestUtf8 {

    public static void main(String[] args) throws Exception {

        QuercusScriptEngineFactory factory = new QuercusScriptEngineFactory();
        ScriptEngine phpEngine = factory.getScriptEngine();
        ((QuercusScriptEngine) phpEngine).getQuercus().setIni("unicode.semantics", "on");

        StringWriter writer = new StringWriter();
        ScriptContext context = phpEngine.getContext();
        context.setWriter(writer);

        String code = "<?php print 'Umläut'; return 'Umläut'; ?>";

        Object o = phpEngine.eval(code);

        System.out.println("\n******\n");
        System.out.println("code=[" + code + "]");
        System.out.println("o=[" + o + "]");

        String output = writer.getBuffer().toString();
        System.out.println("output=[" + output + "]");

    }

}
Attached Files

- Relationships

- Notes
(0006109)
nam
12-10-12 09:47

php/2127

Fixed for 4.0.33. Also, you need to call QuercusContext.setUnicodeSemantics() instead of setIni("unicode.semantics", "on"):

((QuercusScriptEngine) phpEngine).getQuercus().setUnicodeSemantics(true);
 
(0006152)
nam
01-10-13 10:35

The issue still exists if the test case is a standalone Java class (not within a JSP inside the test harness).
 
(0006153)
nam
01-10-13 10:40

For 4.0.34, QuercusScriptEngine will use "utf-8" script encoding and unicode.semantics=on by default. So you won't need to do the following anymore:

<code>
Quercus quercus = new Quercus();
quercus.setUnicodeSemantics(true);
quercus.setIni("unicode.semantics", "on");
quercus.init();
quercus.start();

QuercusScriptEngine phpEngine = new QuercusScriptEngine(new QuercusScriptEngineFactory(), quercus);
</code>
 
(0006154)
nam
01-10-13 11:05
edited on: 01-11-13 14:50

I stand corrected. unicode.semantics will still be off for 4.0.34 (utf-8 will be the default everywhere). unicode.semantics=on makes Quercus behave like PHP6, but PHP6 will likely cause compatibility problems with old PHP code. You don't need to use PHP6 for UTF-8.

Edited: unicode.semantics will be ON for 4.0.34.
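
The difference between "UTF-8 as the encoding" and unicode semantics can be illustrated in plain Java, without Quercus. This is only a minimal sketch: it assumes that a byte-oriented PHP string holds the UTF-8 encoded bytes, while a unicode-semantics string holds characters. The class and method names below are hypothetical and not part of Quercus.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Quercus: contrasts byte length with character length.
public class Utf8Lengths {

    // Number of bytes the text occupies when encoded as UTF-8.
    public static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points in the text.
    public static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'ä' written as an escape so the result does not depend on the source file encoding.
        String umlaut = "\u00e4";
        System.out.println("bytes=" + utf8ByteLength(umlaut));       // prints bytes=2
        System.out.println("codePoints=" + codePointLength(umlaut)); // prints codePoints=1
    }
}
```

Under that assumption, a length function that counts bytes would report 2 for this string and one that counts characters would report 1, which is the kind of behavioral difference that can affect old PHP code.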

 
(0006157)
nam
01-11-13 14:51

php/2127
php/2128

Fixed for 4.0.34. To verify, please use subversion to check out our sources.

The following are now set by default: unicode.semantics=on and scriptEncoding=utf8. And QuercusScriptEngine now returns Quercus value types (e.g. return type of ScriptEngine.eval() is Value).

import java.io.*;
import javax.script.*;

import com.caucho.quercus.env.*;
import com.caucho.quercus.script.*;

public class Test
{
  public static void main(String[] args)
    throws Exception
  {
    boolean isUnicodeSemantics = true;

    QuercusScriptEngine phpEngine
      = new QuercusScriptEngine(isUnicodeSemantics);

    StringWriter writer = new StringWriter();
    ScriptContext context = phpEngine.getContext();
    context.setWriter(writer);

    String a0 = "ä";
    String a1 = "\u00e4";
    String code = "<?php print '" + a1 + "'; return '" + a1 + "'; ?>";

    Object obj = phpEngine.eval(code);
    String returnValue = obj.toString();
      
    System.out.println("a0_umlaut : " + a0 + ",length=" + a0.length() + ",hex(0)=" + Integer.toHexString(a0.charAt(0)));
    System.out.println("a1_umlaut : " + a1 + ",length=" + a1.length() + ",hex(0)=" + Integer.toHexString(a1.charAt(0)));

    System.out.println("code      : " + code);
    System.out.println("return    : " + returnValue + ",length=" + returnValue.length() + ",hex(0)=" + Integer.toHexString(returnValue.charAt(0)));

    String output = writer.getBuffer().toString();
    System.out.println("output    : " + output + ",length=" + output.length() + ",hex(0)=" + Integer.toHexString(output.charAt(0)));
  }
}
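
The length and hex(0) values printed by the test above make a mis-decoded character easy to spot. The following Quercus-free sketch shows what a correct 'ä' looks like next to one common failure mode, where UTF-8 bytes are decoded with the wrong charset. Whether that was the actual cause of this issue is not stated in the report, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Quercus: shows how a mis-decoded umlaut appears.
public class MojibakeCheck {

    // Simulates the failure mode: encode as UTF-8, then decode as ISO-8859-1.
    public static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Same length/hex(0) format as the test case above; expects a non-empty string.
    public static String describe(String s) {
        return "length=" + s.length() + ",hex(0)=" + Integer.toHexString(s.charAt(0));
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4";
        System.out.println("correct   : " + describe(umlaut));            // length=1,hex(0)=e4
        System.out.println("misdecoded: " + describe(misdecode(umlaut))); // length=2,hex(0)=c3
    }
}
```

If the return and output lines of the test show length=1 and hex(0)=e4, the character survived the round trip through the engine. A length of 2 starting with c3 would point to this kind of charset mismatch.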
 

- Issue History
Date Modified Username Field Change
12-09-12 18:17 nam New Issue
12-10-12 09:46 nam Status new => assigned
12-10-12 09:46 nam Assigned To  => nam
12-10-12 09:47 nam Status assigned => closed
12-10-12 09:47 nam Note Added: 0006109
12-10-12 09:47 nam Resolution open => fixed
01-10-13 02:55 ngoc Issue Monitored: ngoc
01-10-13 10:35 nam Note Added: 0006152
01-10-13 10:35 nam Status closed => assigned
01-10-13 10:40 nam Note Added: 0006153
01-10-13 11:05 nam Note Added: 0006154
01-11-13 14:50 nam Note Edited: 0006154
01-11-13 14:51 nam Status assigned => closed
01-11-13 14:51 nam Note Added: 0006157
01-11-13 14:51 nam Fixed in Version  => 4.0.34


Mantis 1.0.0rc3
Copyright © 2000 - 2005 Mantis Group