Mantis Bugtracker
  

Viewing Issue Advanced Details
ID: 0004818
Category: [Resin]
Severity: major
Reproducibility: always
Date Submitted: 10-20-11 05:19
Last Update: 10-20-11 14:10
Reporter: ihristov
View Status: public
Assigned To: ferg
Priority: normal
Resolution: fixed
Status: closed
Projection: none
ETA: none
Fixed in Version: 4.0.24
Product Version: 4.0.23
Platform:
OS:
OS Version:
Product Build:
Summary: 0004818: WebSocket character encoding problem
Description: Resin's default character encoding is ISO-8859-1, but the encoding for a text data frame in the WebSocket protocol is UTF-8. I've tried setting <character-encoding>utf-8</character-encoding> under both the <resin> and <web-app> tags, as prescribed by the documentation (http://www.caucho.com/resin-4.0/admin/config-el-ref.xtp#characterencoding). Unfortunately, neither of these has the desired effect. I am doing this in a test environment using embedded Resin (WebAppEmbed); nevertheless, I would expect <character-encoding>utf-8</character-encoding> to work. Another interesting question: why doesn't Resin detect the Content-Type header that I provide in the handshake request and automatically adapt to UTF-8?
Steps To Reproduce:
Additional Information:
Attached Files:

- Relationships

- Notes
(0005567)
ferg
10-20-11 09:59

The encoding for WebSocket is always utf-8. It has nothing to do with the character-encoding.

If you have a bit of sample code where Resin's text isn't producing utf-8, that would be helpful.
 
(0005568)
ihristov
10-20-11 13:15

I don't want to be picky or impolite, but "<character-encoding> specifies the default character encoding for the environment." sounds quite important to me and seems to have a lot to do with servlet character encoding. Anyway, the problem can easily be reproduced with the following recipe:

1.) Implement the WebSocketListener interface to provide the server-side WebSocket logic. Something like:
public void onReadText(WebSocketContext context, Reader is)
    throws IOException
{
    // Drain the Reader into a String and print the decoded message.
    System.out.println("Msg received: " + org.apache.commons.io.IOUtils.toString(is));
}

2.) Develop a small client to connect to the WebSocket servlet (the Unitt framework can be useful here: http://code.google.com/p/unitt/).

3.) Send some text messages and see what happens. Also verify that the Reader is using ISO-8859-1.
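
The symptom this recipe is meant to expose can be sketched in isolation. The following is a minimal illustration, assuming only the JDK: it encodes a string as UTF-8 (as a WebSocket text frame would carry it) and decodes the bytes once as UTF-8 and once as ISO-8859-1. The class and method names are hypothetical and are not part of Resin.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not use or reproduce any Resin code.
final class EncodingMismatchDemo {
    // Encodes the text as UTF-8, the encoding of a WebSocket text frame,
    // then decodes the resulting bytes with the charset under test.
    static String decodeUtf8Payload(String text, Charset decodeWith) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        return new String(payload, decodeWith);
    }

    public static void main(String[] args) {
        String sent = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        String right = decodeUtf8Payload(sent, StandardCharsets.UTF_8);
        String wrong = decodeUtf8Payload(sent, StandardCharsets.ISO_8859_1);
        // Print lengths rather than the characters, so the console encoding does not matter.
        System.out.println("UTF-8 decode length: " + right.length());
        System.out.println("ISO-8859-1 decode length: " + wrong.length());
    }
}
```

With a wrong decoder, each non-ASCII character turns into two or more unrelated characters, so comparing the received string (or its length) against what the client sent is a simple way to confirm the mismatch.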

Suspiciously enough, I can see in the ReaderStream class a short comment above the read method which goes like this:
// XXX: encoding issues

I hope that this is helpful to you and puts you on the right track.
I still have to check whether Resin produces text using ISO-8859-1. So far I have had no trouble with server -> client communication, so I have not verified it.

Cheers,
Ivan
 
(0005569)
ferg
10-20-11 14:10

server/1o36

The issue was with the read(char[]) call in WebSocketReader.

WebSocket encoding is required to be utf-8 by the WebSocket spec. It cannot be changed by any implementation. Because the WebSocket spec has a specific encoding, default character encoding does not apply.
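
For illustration, reading a UTF-8 text payload through a bulk read(char[]) loop can be sketched with the JDK alone. This is not Resin's WebSocketReader and does not show the actual fix. It is a hypothetical helper, assuming the payload bytes are already available, that pins the decoder to UTF-8 regardless of any default encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are assumptions for this sketch.
final class Utf8PayloadReader {
    // Decodes the payload as UTF-8 using bulk reads into a char[] buffer.
    // bufferSize must be at least 1.
    static String readAll(byte[] payload, int bufferSize) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(payload), StandardCharsets.UTF_8)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[bufferSize];
            int count;
            // read(char[], int, int) returns the number of chars decoded, or -1 at end of stream.
            while ((count = reader.read(buffer, 0, buffer.length)) != -1) {
                text.append(buffer, 0, count);
            }
            return text.toString();
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the charset is passed explicitly to InputStreamReader, the result does not depend on the platform default or on any configured default encoding, which mirrors the point that the WebSocket encoding is fixed by the spec.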
 

- Issue History
Date Modified Username Field Change
10-20-11 05:19 ihristov New Issue
10-20-11 05:51 ihristov Issue Monitored: ihristov
10-20-11 09:59 ferg Note Added: 0005567
10-20-11 13:15 ihristov Note Added: 0005568
10-20-11 14:10 ferg Note Added: 0005569
10-20-11 14:10 ferg Assigned To  => ferg
10-20-11 14:10 ferg Status new => closed
10-20-11 14:10 ferg Resolution open => fixed
10-20-11 14:10 ferg Fixed in Version  => 4.0.24

