Mantis Bugtracker

Viewing Issue Simple Details
ID: 0002606
Category: [Quercus]
Severity: major
Reproducibility: always
Date Submitted: 04-15-08 07:32
Last Update: 05-29-08 15:55
Reporter: sgraf
View Status: public
Assigned To: nam
Priority: normal
Resolution: fixed
Status: closed
Product Version: 3.1.6
Summary: 0002606: ResultSet columns of type LONGVARCHAR do not handle unicode characters correctly
Description: Problem found with the latest SVN source build using Tomcat 6.0.16 and MySQL 5.

JDBC's LONGVARCHAR corresponds to MySQL's text types (such as TEXT and MEDIUMTEXT).

com.caucho.quercus.lib.db.JdbcResultResource.getColumnValue() doesn't handle LONGVARCHAR columns correctly. At the moment it reads the column through a binary input stream and appends the raw bytes with StringBuilder.append(byte[], int, int), so multibyte characters are not decoded correctly.

A possible solution is to use rs.getCharacterStream() to obtain a multibyte compatible reader and use StringBuilder.append(Reader).

A patch implementing that solution is attached.
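
The attached patch is not reproduced here. As a rough illustration of the difference between the two read paths, here is a minimal sketch. It is not the Quercus code: it uses plain java.lang.StringBuilder instead of the Quercus builder, and in-memory streams stand in for what rs.getBinaryStream() and rs.getCharacterStream() would return. The class and method names (LongVarcharSketch, readBytesAsChars, readChars) are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Quercus implementation.
class LongVarcharSketch {

    // Byte-oriented path: every byte is appended as one char, so a
    // multibyte UTF-8 sequence ends up as several unrelated chars.
    static String readBytesAsChars(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[1024];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            for (int i = 0; i < n; i++) {
                sb.append((char) (buf[i] & 0xff));
            }
        }
        return sb.toString();
    }

    // Character-oriented path: the Reader has already decoded the data,
    // so chars can be appended as they arrive.
    static String readChars(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "Gr\u00fc\u00dfe"; // contains two 2-byte UTF-8 characters
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        // Stand-in for rs.getBinaryStream(column)
        String viaBytes = readBytesAsChars(new ByteArrayInputStream(utf8));

        // Stand-in for rs.getCharacterStream(column)
        String viaReader = readChars(
            new InputStreamReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8));

        System.out.println("byte path matches:   " + viaBytes.equals(text));
        System.out.println("reader path matches: " + viaReader.equals(text));
    }
}
```

With a real ResultSet, the driver is responsible for decoding the column according to its character set before handing back the Reader, which is why the character-oriented path avoids the problem.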

Additional Information: While testing the aforementioned fix I discovered an additional issue in com.caucho.quercus.env.StringBuilderValue, which I will report in a separate bug report.
Attached Files: (1,320 bytes) 04-15-08 07:32

- Relationships

- Notes
04-15-08 07:44 sgraf

The related StringBuilder bug report is here: 0002607
04-24-08 21:14 nam

I don't think MySQL has the LONGVARCHAR type :). But good catch anyways because this would certainly affect other databases.

PHP5 has byte strings, so interpretation of strings is up to the user application. So the current code is fine. However, there is a new unicode string type where we would want to read LONGVARCHAR as Java characters instead of bytes. Our Env.createUnicodeBuilder() detects when we are in PHP5 or PHP6 mode and returns the appropriate builder. So your patch is the correct thing to do for when we are in PHP6 mode.

To do: make a test case for Postgres/Oracle
04-25-08 11:04 sgraf
edited on: 04-25-08 11:05

Yes, I discovered this while using the unicode mode (or PHP6 mode, as you call it).

From what I have experienced and seen by debugging Quercus in Eclipse, the MySQL JDBC driver considers the MySQL TEXT and LONGTEXT types to be LONGVARCHAR.

If you want to reproduce the bug I encountered, create a MySQL table containing a UTF-8 encoded VARCHAR column and a UTF-8 encoded TEXT column. You will see that the text retrieved from the VARCHAR column is encoded correctly, but the text coming from the TEXT column isn't.

05-29-08 15:55 ferg


- Issue History
Date Modified Username Field Change
04-15-08 07:32 sgraf New Issue
04-15-08 07:32 sgraf File Added:
04-15-08 07:44 sgraf Note Added: 0002974
04-24-08 21:14 nam Note Added: 0003013
04-25-08 11:04 sgraf Note Added: 0003015
04-25-08 11:05 sgraf Note Edited: 0003015
05-29-08 15:55 ferg Note Added: 0003115
05-29-08 15:55 ferg Assigned To  => nam
05-29-08 15:55 ferg Status new => closed
05-29-08 15:55 ferg Resolution open => fixed
05-29-08 15:55 ferg Fixed in Version  => 3.2.0
