Mantis Bugtracker
  

Viewing Issue Simple Details
ID: 0003618
Category: [Quercus]
Severity: major
Reproducibility: always
Date Submitted: 08-03-09 06:40
Last Update: 03-11-13 18:22
Reporter: tlandmann
View Status: public
Assigned To: nam
Priority: normal
Resolution: fixed
Status: closed
Product Version: 4.0.2
Summary: 0003618: Character Encoding incompatibility with standard PHP
Description: The following incompatibility is also present in version 4.0.0:

Attached are three PHP files, all of which are UTF-8 encoded:

- test4_db.php: Encapsulates the database connection credentials (please edit it first to match your own database connection settings).
- test4_prep.php: Creates a new database and table in preparation for the actual test. Run it only once, and only from standard PHP, to get the results described below.
- test4.php: Runs the actual test. Run it separately from standard PHP and from Quercus. -> The results differ between the two engines, which is the problem.


Standard PHP output:


Quercus output:
&#65533;&#65533;&#65533;&#65533;&#65533;&#65533;


Effect: Drupal doesn't run properly if you use special characters in (for instance) node titles.

-----------------

What seems to be happening:
At some stage Quercus seems to assume it is handling ISO-8859-1 strings, although:
- The sources are UTF-8.
- The database tables are UTF-8.
- In test4_db.php we explicitly instructed the DB connection to use UTF-8.

Nothing indicates that ISO-8859-1 strings could be involved, yet Quercus seems to treat them as such somewhere.
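
To illustrate the kind of mismatch suspected here, the following is a minimal Python sketch, not the attached test case. The sample string is hypothetical (the report does not show the original test data), and which direction the mix-up takes inside Quercus is not established by this report. The sketch shows both directions: UTF-8 bytes misread as ISO-8859-1 turn into pairs of garbage characters, while ISO-8859-1 bytes misread as UTF-8 turn into the replacement character U+FFFD (decimal 65533).

```python
# Hypothetical sample text; the strings used by test4.php are not shown in the report.
text = "äöü"

# Bytes as a UTF-8 source file or UTF-8 table would hold them.
utf8_bytes = text.encode("utf-8")

# Direction 1: UTF-8 bytes interpreted as ISO-8859-1.
# Every byte maps to some character, so no error occurs, only garbage.
misread = utf8_bytes.decode("iso-8859-1")

# Direction 2: ISO-8859-1 bytes interpreted as UTF-8.
# The bytes are not valid UTF-8, so each one becomes U+FFFD.
latin1_bytes = text.encode("iso-8859-1")
replaced = latin1_bytes.decode("utf-8", errors="replace")

# Render the result as HTML numeric character references.
entities = "".join(f"&#{ord(ch)};" for ch in replaced)

print(repr(misread))
print(entities)
```

If a layer between the database and the PHP script converts with the wrong charset in either direction, the text is damaged before the script ever sees it, which would be consistent with the difference between the two engines described above.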

In Resin 3.2 I would've tried to change the database driver settings now. However, in Resin 4.0.0 Quercus is pre-configured (which is a great thing), and I would like to keep it this way. I think the default behaviour should be that Quercus is 100% compatible with PHP just "as is" without any special configuration. (You even claim that on your website.)

I hope I didn't miss anything on this issue. I checked the wikis and manuals for setting up Drupal as well as database configuration, previous error reports, etc.

(I only found this:
http://maillist.caucho.com/pipermail/resin-interest/2008-April/002405.html
It doesn't help though, since there is no manual Quercus Servlet configuration any more.)
Additional Information
Attached Files: quercus_charencoding_incompatibility.zip (1,049 bytes) 08-03-09 06:40

- Relationships

- Notes
(0004111)
tlandmann
08-03-09 06:49

BTW: There has been a function called 'mysql_set_charset' since PHP 5.2.3. (See http://de2.php.net/manual/en/function.mysql-set-charset.php)
However, Quercus doesn't implement it yet.
And even if it worked, I'd only consider it a workaround, because Drupal is still using the old "SET NAMES ..." strategy.
 
(0004116)
dl
08-06-09 06:58

This seems to be related to http://bugs.caucho.com/view.php?id=3413
 
(0005241)
dicr
05-11-11 15:37

"Quercus, as a Java program, supports UTF-8 natively", but ...
after so many years Quercus still can't work with UTF-8 data the way PHP does :))))
 
(0006215)
nam
03-11-13 18:22

The MySQL encoding problem is fixed:

http://forum.caucho.com/showthread.php?p=36255#post36255
 

- Issue History
Date Modified Username Field Change
08-03-09 06:40 tlandmann New Issue
08-03-09 06:40 tlandmann File Added: quercus_charencoding_incompatibility.zip
08-03-09 06:49 tlandmann Note Added: 0004111
08-06-09 06:58 dl Note Added: 0004116
05-11-11 15:36 dicr Issue Monitored: dicr
05-11-11 15:37 dicr Note Added: 0005241
03-11-13 16:51 nam Status new => assigned
03-11-13 16:51 nam Assigned To  => nam
03-11-13 18:22 nam Status assigned => closed
03-11-13 18:22 nam Note Added: 0006215
03-11-13 18:22 nam Resolution open => fixed
03-11-13 18:22 nam Fixed in Version  => 4.0.36

