Mantis - Quercus
Viewing Issue Advanced Details
3618 major always 08-03-09 06:40 03-11-13 18:22
tlandmann  
nam  
normal  
closed 4.0.2  
fixed  
none    
none 4.0.36  
0003618: Character Encoding incompatibility with standard PHP
The following incompatibility is also present in version 4.0.0:

Attached are three PHP files, all of which are UTF-8 encoded:

- test4_db.php: Encapsulates database connection credentials (please edit first in order to reflect your correct database connection settings).
- test4_prep.php: Creates a new database and table in preparation to run the actual test. Should be run only once and only from standard PHP to see the results described below.
- test4.php: Runs the actual test. Should be run separately from standard PHP and from Quercus. The results differ between the two engines, which is the problem.


Standard PHP output:


Quercus output:
&65533;&65533;&65533;&65533;&65533;&65533;


Effect: Drupal doesn't run properly if you use special characters in (for instance) node titles.

-----------------

What seems to be happening:
At some stage Quercus seems to assume that it is handling ISO-8859-1 strings, although:
- The sources are UTF-8.
- The database tables are UTF-8.
- In test4_db.php we explicitly instructed the DB connection to use UTF-8.

Nobody ever gives any sign there could be ISO strings, yet Quercus somewhere seems to handle them as such.
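
The suspected mix-up can be reproduced outside Quercus. The following is a minimal Python sketch, not Quercus code: it assumes that somewhere in the pipeline the text exists as ISO-8859-1 bytes and is then decoded as UTF-8. The names `misdecode` and `sample` are made up for illustration. Bytes that are not valid UTF-8 decode to the replacement character U+FFFD, whose decimal code point is 65533, the number seen in the Quercus output.

```python
# Illustration only: the helper names are hypothetical and nothing here
# is taken from the Quercus or Resin sources.

REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER, decimal 65533


def misdecode(text: str) -> str:
    """Encode text as ISO-8859-1, then wrongly decode those bytes as UTF-8.

    Invalid byte sequences are replaced with U+FFFD instead of raising.
    """
    return text.encode("iso-8859-1").decode("utf-8", errors="replace")


# Assumed sample: three German umlauts (a-umlaut, o-umlaut, u-umlaut).
sample = "\u00e4\u00f6\u00fc"

# Matching encode/decode charsets round-trip the text unchanged.
round_trip = sample.encode("utf-8").decode("utf-8")

# Mismatched charsets turn every character into U+FFFD.
garbled = misdecode(sample)

print([ord(ch) for ch in round_trip])
print([ord(ch) for ch in garbled])
```

If this is what happens inside Quercus, the fix belongs wherever the two charsets disagree, for example between the database connection and the string conversion, rather than in the PHP scripts themselves.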

In Resin 3.2 I would've tried to change the database driver settings now. However, in Resin 4.0.0 Quercus is pre-configured (which is a great thing), and I would like to keep it this way. I think the default behaviour should be that Quercus is 100% compatible with PHP just "as is" without any special configuration. (You even claim that on your website.)

I hope I didn't miss anything on this issue. I checked the wikis and manuals for setting up Drupal as well as the database configuration, previous error reports, etc.

(I only found this:
http://maillist.caucho.com/pipermail/resin-interest/2008-April/002405.html
It doesn't help though, since there is no manual Quercus Servlet configuration any more.)
 quercus_charencoding_incompatibility.zip (1,049 bytes) 08-03-09 06:40

Notes
(0004111)
tlandmann   
08-03-09 06:49   
BTW: There is a function called 'mysql_set_charset' since PHP 5.2.3. (See http://de2.php.net/manual/en/function.mysql-set-charset.php.)
However, Quercus doesn't implement it yet.
And even if it worked, I'd only consider it a workaround, because Drupal is still using the old "SET NAMES ..." strategy.
(0004116)
dl   
08-06-09 06:58   
This seems to be related to http://bugs.caucho.com/view.php?id=3413
(0005241)
dicr   
05-11-11 15:37   
"Quercus, as a java program support UTF-8 nativelly", but ...
after so many years, Quercus still can't work with UTF-8 data the way PHP does :))))
(0006215)
nam   
03-11-13 18:22   
The MySQL encoding problem is fixed:

http://forum.caucho.com/showthread.php?p=36255#post36255