Mantis Bugtracker
  

Viewing Issue Simple Details
ID: 0004818
Category: [Resin]
Severity: major
Reproducibility: always
Date Submitted: 10-20-11 05:19
Last Update: 10-20-11 14:10
Reporter: ihristov
View Status: public
Assigned To: ferg
Priority: normal
Resolution: fixed
Status: closed
Product Version: 4.0.23
Summary: 0004818: WebSocket character encoding problem
Description: Resin's default character encoding is ISO-8859-1, but the WebSocket protocol requires text data frames to be encoded as UTF-8. I have tried setting <character-encoding>utf-8</character-encoding> under both the <resin> and <web-app> tags (as prescribed by the documentation, http://www.caucho.com/resin-4.0/admin/config-el-ref.xtp#characterencoding); unfortunately, neither has the desired effect. I am doing this in a test environment using embedded Resin (WebAppEmbed). Nevertheless, I would expect <character-encoding>utf-8</character-encoding> to work. Another open question is why Resin does not detect the Content-Type header that I provide in the handshake request and automatically switch to UTF-8.
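The mismatch described above can be illustrated with the standard library alone. This is a minimal sketch, not Resin code: the class and method names (EncodingMismatchDemo, toUtf8, decodeLatin1, decodeUtf8) are hypothetical and exist only for illustration. It shows what happens when a UTF-8 payload is decoded as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    /** Encodes text as UTF-8, which is how a WebSocket text frame carries it. */
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes with ISO-8859-1: every byte becomes exactly one char. */
    static String decodeLatin1(byte[] payload) {
        return new String(payload, StandardCharsets.ISO_8859_1);
    }

    /** Decodes with UTF-8: multi-byte sequences become single characters. */
    static String decodeUtf8(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
        byte[] payload = toUtf8("\u00e9");
        System.out.println(payload.length);                       // 2
        System.out.println(decodeLatin1(payload).length());       // 2 (two wrong chars)
        System.out.println(decodeUtf8(payload).equals("\u00e9")); // true
    }
}
```

Pure ASCII messages decode identically under both charsets, so the problem only becomes visible once a message contains non-ASCII characters.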

- Relationships

- Notes
(0005567)
ferg
10-20-11 09:59

The encoding for WebSocket is always utf-8. It has nothing to do with the character-encoding.

If you have a bit of sample code where Resin's text isn't producing utf-8, that would be helpful.
 
(0005568)
ihristov
10-20-11 13:15

I don't want to be picky or impolite, but "<character-encoding> specifies the default character encoding for the environment." sounds quite important to me and closely related to servlet character encoding. Anyway, the problem can easily be reproduced with the following recipe:

1.) Implement the WebSocketListener interface to provide the server-side WebSocket logic, for example:
    public void onReadText(WebSocketContext context, Reader is) throws IOException {
        // IOUtils.toString(Reader) drains the reader and may throw IOException
        System.out.println("Msg received: " + org.apache.commons.io.IOUtils.toString(is));
    }

2.) Develop a small client to connect to the WebSocket servlet (the Unitt framework can be useful here, http://code.google.com/p/unitt/).

3.) Send some text messages and see what happens. Also verify that the Reader is using ISO-8859-1.
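One way to carry out the check in step 3 is to print the code points of the received string instead of the string itself, because System.out applies the platform encoding and can hide or distort the result. The following is only a sketch: CodePointDump and describe are hypothetical names, not part of Resin, and the example assumes the client sends a known non-ASCII string such as U+00E9.

```java
class CodePointDump {
    /** Formats each code point as U+XXXX, independent of the console encoding. */
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // What a correctly decoded message containing U+00E9 looks like:
        System.out.println(describe("\u00e9"));       // U+00E9
        // What the same message looks like if its UTF-8 bytes were read as ISO-8859-1:
        System.out.println(describe("\u00c3\u00a9")); // U+00C3 U+00A9
    }
}
```

Calling describe on the text read in onReadText would then show directly whether one sent character arrived as one code point or as two.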

Suspiciously enough, the ReaderStream class has a comment above its read method that reads:
// XXX: encoding issues

I hope this is helpful and puts you on the right track.
I still have to check whether Resin produces outgoing text in ISO-8859-1. So far I have had no trouble with server -> client communication, so I have not verified it.

Cheers,
Ivan
 
(0005569)
ferg
10-20-11 14:10

server/1o36

The issue was with the read(char[]) call in WebSocketReader.

WebSocket encoding is required to be utf-8 by the WebSocket spec. It cannot be changed by any implementation. Because the WebSocket spec has a specific encoding, default character encoding does not apply.
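The note does not say what exactly was wrong in read(char[]), so the following is not Resin's WebSocketReader. It is only a sketch of what a UTF-8 decoding read(char[]) loop looks like with the standard InputStreamReader, using hypothetical names (Utf8ReadLoop, readAll). A deliberately small buffer shows that multi-byte characters still decode correctly when the reads are short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8ReadLoop {
    /** Decodes a UTF-8 payload through repeated read(char[], int, int) calls. */
    static String readAll(byte[] payload, int bufferSize) {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(payload), StandardCharsets.UTF_8);
        char[] buffer = new char[bufferSize];
        StringBuilder sb = new StringBuilder();
        try {
            int n;
            // read returns the number of chars decoded, or -1 at end of stream
            while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String message = "h\u00e9llo \u20ac"; // contains 2-byte and 3-byte UTF-8 sequences
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(payload, 1).equals(message)); // true
        System.out.println(readAll(payload, 3).equals(message)); // true
    }
}
```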
 

- Issue History
Date Modified Username Field Change
10-20-11 05:19 ihristov New Issue
10-20-11 05:51 ihristov Issue Monitored: ihristov
10-20-11 09:59 ferg Note Added: 0005567
10-20-11 13:15 ihristov Note Added: 0005568
10-20-11 14:10 ferg Note Added: 0005569
10-20-11 14:10 ferg Assigned To  => ferg
10-20-11 14:10 ferg Status new => closed
10-20-11 14:10 ferg Resolution open => fixed
10-20-11 14:10 ferg Fixed in Version  => 4.0.24

