0003618: Character Encoding incompatibility with standard PHP
The following incompatibility is also present in version 4.0.0:

Attached are three PHP files, all of which are UTF-8 encoded:

- test4_db.php: Encapsulates the database connection credentials (edit this first to reflect your own database connection settings).
- test4_prep.php: Creates a new database and table in preparation for the actual test. Should be run only once, and only from standard PHP, to see the results described below.
- test4.php: Runs the actual test. Should be run separately from standard PHP and from Quercus. The results differ between the two engines, which is the problem.

Standard PHP output:

Quercus output:

Effect: Drupal doesn't run properly if you use special characters in (for instance) node titles.


What seems to be happening:
At some stage Quercus seems to assume it is handling ISO-8859-1 strings, although:
- The sources are UTF-8.
- The database tables are UTF-8.
- In test4_db.php we explicitly instructed the DB connection to use UTF-8.

Nothing ever indicates that ISO-8859-1 strings could be involved, yet Quercus seems to treat the data as such somewhere.
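
To illustrate the kind of failure suspected here, the following is a minimal Java sketch of what happens when UTF-8 bytes are decoded as ISO-8859-1. It is not Quercus code and does not claim to show where Quercus goes wrong; the class and method names (MojibakeDemo, misdecode, repair) are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows the effect of reading UTF-8 bytes as ISO-8859-1.
final class MojibakeDemo {

    // Encode the text as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    // Every multi-byte UTF-8 sequence turns into several Latin-1 characters.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverse the mistake: recover the original bytes via ISO-8859-1,
    // then decode them correctly as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String title = "Caf\u00e9";          // "Café", e.g. a node title with a special character
        String garbled = misdecode(title);   // "Caf" followed by U+00C3 and U+00A9

        System.out.println("original length: " + title.length());
        System.out.println("garbled length:  " + garbled.length());
        System.out.println("round trip ok:   " + repair(garbled).equals(title));
    }
}
```

In this sketch the four-character title becomes five characters, because the two UTF-8 bytes of "é" are each read as a separate Latin-1 character. That is the typical symptom when one layer assumes ISO-8859-1 while everything else uses UTF-8.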

In Resin 3.2 I would've tried to change the database driver settings now. However, in Resin 4.0.0 Quercus is pre-configured (which is a great thing), and I would like to keep it this way. I think the default behaviour should be that Quercus is 100% compatible with PHP just "as is" without any special configuration. (You even claim that on your website.)

I hope I didn't miss anything on this issue. I checked the wikis and manuals for setting up Drupal as well as the database configuration, previous error reports, etc.

(I only found this: [^]
It doesn't help, though, since there is no manual Quercus servlet configuration any more.)

[^] (1,049 bytes) 08-03-09 06:40

08-03-09 06:49   
BTW: There has been a function called 'mysql_set_charset' since PHP 5.2.3. (See [^])
However, Quercus doesn't implement it yet.
And even if it worked, I'd only consider it a workaround, because Drupal is still using the old "SET NAMES ..." strategy.
08-06-09 06:58   
This seems to be related to [^]
05-11-11 15:37   
"Quercus, as a java program support UTF-8 nativelly", but ...
after so many years, Quercus still can't work with UTF-8 data the way PHP does :))))
03-11-13 18:22   
The MySQL encoding problem is fixed: [^]