Mantis Bugtracker
  

Viewing Issue Advanced Details
ID: 0003618
Category: [Quercus]
Severity: major
Reproducibility: always
Date Submitted: 08-03-09 06:40
Last Update: 03-11-13 18:22
Reporter: tlandmann
View Status: public
Assigned To: nam
Priority: normal
Resolution: fixed
Status: closed
Projection: none
ETA: none
Fixed in Version: 4.0.36
Product Version: 4.0.2
Platform:
OS:
OS Version:
Product Build:
Summary: 0003618: Character Encoding incompatibility with standard PHP
Description: The following incompatibility is also present in version 4.0.0:

Attached you find three PHP files, all of which are UTF-8 encoded:

- test4_db.php: Encapsulates database connection credentials (please edit first in order to reflect your correct database connection settings).
- test4_prep.php: Creates a new database and table in preparation to run the actual test. Should be run only once and only from standard PHP to see the results described below.
- test4.php: Runs the actual test. Should be run separately from standard PHP and from Quercus. The results differ between the two engines, which is the problem.


Standard PHP output:


Quercus output:
&65533;&65533;&65533;&65533;&65533;&65533;


Effect: Drupal doesn't run properly if you use special characters in (for instance) node titles.

-----------------

What seems to be happening:
At some stage Quercus seems to assume that it is handling ISO-8859-1 strings, although:
- The sources are UTF-8.
- The database tables are UTF-8.
- In test4_db.php we explicitly instructed the DB connection to use UTF-8.

Nothing ever indicates that there could be ISO-8859-1 strings, yet Quercus seems to handle them as such somewhere.
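
To illustrate the kind of mismatch suspected here, the following is a minimal Python sketch, not Quercus code. The sample string and all function names are assumptions made for illustration; the actual test string from test4.php is not shown in this report. It shows that UTF-8 bytes read as ISO-8859-1 turn into two-character garbage per letter, that ISO-8859-1 bytes read as UTF-8 turn into U+FFFD replacement characters, and that 65533 is the decimal code point of U+FFFD, which would match the number in the Quercus output if that output is a run of replacement characters.

```python
# Hypothetical sample text ("a", "o", "u" with umlauts); not taken from the report.
SAMPLE = "\u00e4\u00f6\u00fc"


def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as if they were ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def misread_latin1_as_utf8(text: str) -> str:
    """Encode as ISO-8859-1, then decode as UTF-8, replacing invalid bytes."""
    return text.encode("iso-8859-1").decode("utf-8", errors="replace")


def as_decimal_entities(text: str) -> str:
    """Render every character as a decimal HTML entity such as &#65533;."""
    return "".join(f"&#{ord(ch)};" for ch in text)


if __name__ == "__main__":
    # ascii() keeps the output printable regardless of the console encoding.
    print(ascii(misread_utf8_as_latin1(SAMPLE)))
    print(ascii(misread_latin1_as_utf8(SAMPLE)))
    print(as_decimal_entities(misread_latin1_as_utf8(SAMPLE)))
```

Which of the two misreadings (if either) happens inside Quercus cannot be determined from this sketch; it only shows that a charset disagreement somewhere between source, connection, and output is enough to produce replacement characters.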

In Resin 3.2 I would've tried to change the database driver settings now. However, in Resin 4.0.0 Quercus is pre-configured (which is a great thing), and I would like to keep it this way. I think the default behaviour should be that Quercus is 100% compatible with PHP just "as is" without any special configuration. (You even claim that on your website.)

I hope I didn't miss anything on that issue. I checked the wikis and manuals for setting up Drupal as well as database configuration, previous error reports etc.

(I only found this:
http://maillist.caucho.com/pipermail/resin-interest/2008-April/002405.html
It doesn't help though, since there is no manual Quercus Servlet configuration any more.)
Steps To Reproduce
Additional Information
Attached Files: quercus_charencoding_incompatibility.zip (1,049 bytes) 08-03-09 06:40

- Relationships

- Notes
(0004111)
tlandmann
08-03-09 06:49

BTW: There is a function called 'mysql_set_charset' since PHP 5.2.3. (See http://de2.php.net/manual/en/function.mysql-set-charset.php)
However, Quercus doesn't implement it yet.
And even if it worked I'd only consider it a workaround because Drupal is still using the old "SET NAMES ..." strategy.
 
(0004116)
dl
08-06-09 06:58

This seems to be related to http://bugs.caucho.com/view.php?id=3413
 
(0005241)
dicr
05-11-11 15:37

"Quercus, as a Java program, supports UTF-8 natively", but ...
after so many years Quercus still can't work with UTF-8 data the way PHP does :))))
 
(0006215)
nam
03-11-13 18:22

The MySQL encoding problem is fixed:

http://forum.caucho.com/showthread.php?p=36255#post36255
 

- Issue History
Date Modified Username Field Change
08-03-09 06:40 tlandmann New Issue
08-03-09 06:40 tlandmann File Added: quercus_charencoding_incompatibility.zip
08-03-09 06:49 tlandmann Note Added: 0004111
08-06-09 06:58 dl Note Added: 0004116
05-11-11 15:36 dicr Issue Monitored: dicr
05-11-11 15:37 dicr Note Added: 0005241
03-11-13 16:51 nam Status new => assigned
03-11-13 16:51 nam Assigned To  => nam
03-11-13 18:22 nam Status assigned => closed
03-11-13 18:22 nam Note Added: 0006215
03-11-13 18:22 nam Resolution open => fixed
03-11-13 18:22 nam Fixed in Version  => 4.0.36

