Mantis - Resin
Viewing Issue Advanced Details
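ID: 4818
Severity: major
Reproducibility: always
Date Submitted: 10-20-11 05:19
Last Update: 10-20-11 14:10
Reporter: ihristov
Assigned To: ferg
Priority: normal
Status: closed
Product Version: 4.0.23
Resolution: fixed
Projection: none
ETA: none
Fixed in Version: 4.0.24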
0004818: WebSocket character encoding problem
Resin's default character encoding is ISO-8859-1, but the WebSocket protocol requires UTF-8 for text data frames. I tried setting <character-encoding>utf-8</character-encoding> under both the <resin> and <web-app> tags, as prescribed by the documentation (http://www.caucho.com/resin-4.0/admin/config-el-ref.xtp#characterencoding), but neither setting has the desired effect. I am testing with embedded Resin (WebAppEmbed); nevertheless, I would expect <character-encoding>utf-8</character-encoding> to work there as well. A related question: why doesn't Resin detect the Content-Type header that I send in the handshake request and switch to UTF-8 automatically?

Notes
(0005567)
ferg   
10-20-11 09:59   
The encoding for WebSocket is always utf-8. It has nothing to do with the character-encoding.

If you have a bit of sample code where Resin's text isn't producing utf-8, that would be helpful.
(0005568)
ihristov   
10-20-11 13:15   
I don't mean to be picky or impolite, but "<character-encoding> specifies the default character encoding for the environment." sounds quite important to me and closely related to servlet character encoding. Anyway, the problem can be easily reproduced with the following recipe:

1.) Implement the WebSocketListener interface to provide the server-side WebSocket logic. Something like:
// Print each incoming text message as decoded by the Reader that Resin supplies.
public void onReadText(WebSocketContext context, Reader is) throws IOException {
    System.out.println("Msg received: " + org.apache.commons.io.IOUtils.toString(is));
}

2.) Develop a small client to connect to the WebServlet (the Unitt framework can be useful here: http://code.google.com/p/unitt/)

3.) Send some text messages and see what happens. Also verify that the Reader is using ISO-8859-1.
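
One way to recognize the symptom in step 3 is to compare what the same UTF-8 payload looks like under each decoding. The following is a minimal standalone sketch, not Resin code; the class name EncodingCheck and its methods are made up for illustration. It assumes the client sent the text "café" encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Resin: shows how a UTF-8 payload
// looks when it is decoded with the wrong charset.
public class EncodingCheck {
    // Decode the way a reader stuck on ISO-8859-1 would: one char per byte.
    public static String decodeLatin1(byte[] payload) {
        return new String(payload, StandardCharsets.ISO_8859_1);
    }

    // Decode the way the WebSocket protocol expects text frames to be read.
    public static String decodeUtf8(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "caf\u00e9"; // "café"; the last letter takes two bytes in UTF-8
        byte[] payload = sent.getBytes(StandardCharsets.UTF_8);

        System.out.println(decodeUtf8(payload).equals(sent));   // prints true
        System.out.println(decodeLatin1(payload).equals(sent)); // prints false
        System.out.println(decodeLatin1(payload).length());     // prints 5
    }
}
```

If the message printed by the listener has more characters than the message that was sent, and every non-ASCII character has turned into two characters, the Reader is most likely decoding the frame as ISO-8859-1.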

Suspiciously enough, I can see in the ReaderStream class a small note over the read method which goes like this:
// XXX: encoding issues

I hope that this is helpful to you and puts you on the right track.
I still have to check whether Resin produces text using ISO-8859-1. So far I have had no trouble with server -> client communication, so I have not verified it.

Cheers,
Ivan
(0005569)
ferg   
10-20-11 14:10   
server/1o36

The issue was with the read(char[]) call in WebSocketReader.

WebSocket encoding is required to be utf-8 by the WebSocket spec. It cannot be changed by any implementation. Because the WebSocket spec has a specific encoding, default character encoding does not apply.
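
For illustration only, here is a sketch of what a read(char[]) loop looks like when the decoding is fixed to UTF-8 regardless of any default encoding. This is not Resin's WebSocketReader; the class Utf8FrameReader and its readText method are hypothetical, and the frame payload is assumed to be available as a plain InputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Resin's implementation: reads a text payload
// as UTF-8 through read(char[]), independent of the platform default.
public class Utf8FrameReader {
    public static String readText(InputStream frame) throws IOException {
        // The charset is fixed to UTF-8 and is not taken from configuration.
        Reader reader = new InputStreamReader(frame, StandardCharsets.UTF_8);
        StringBuilder text = new StringBuilder();
        // A deliberately tiny buffer, so the text is read across several calls.
        char[] buffer = new char[2];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            text.append(buffer, 0, n);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String sent = "Gr\u00fc\u00dfe \u20ac"; // "Grüße €"
        byte[] payload = sent.getBytes(StandardCharsets.UTF_8);
        System.out.println(readText(new ByteArrayInputStream(payload)).equals(sent)); // prints true
    }
}
```

Because the charset is passed explicitly to the InputStreamReader, a setting such as <character-encoding> would have no influence on the result, which matches the behavior the spec requires.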