Mantis - Resin
Viewing Issue Advanced Details
3525 minor always 05-20-09 09:51 05-20-09 10:06
ferg  
ferg  
normal  
closed  
fixed  
none    
none 4.0.1  
0003525: UTF8Reader bom
(rep by Fiaz Hossain)

This time it seems to be related to UTF8Reader.java. The problem appears when two BOMs (http://en.wikipedia.org/wiki/Byte-order_mark) are read in a row. When the back-to-back sequence is present in the data stream, the Form class gets an IOException and abandons further processing. Here is a fix that seems to work. Can you let me know whether the change is acceptable and, if so, merge it into the tip of the source?

Fix:
==== //tools/Linux/resin/resin-pro-3.1.6/lib-src/patches/com/caucho/vfs/i18n/UTF8Reader.java#1 - /home/fiaz/dev/tools/Linux/resin/resin-pro-3.1.6/lib-src/patches/com/caucho/vfs/i18n/UTF8Reader.java ====
121c121
< return is.read();
---
> return read();
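
To illustrate why the one-line change matters, here is a minimal, self-contained sketch. It is not Resin's UTF8Reader: the class BomSkipSketch, the nested SketchReader, and the decodeAfterBom flag are hypothetical names, and the decoder is simplified to 1- to 3-byte sequences. It assumes the failure mode is that, after skipping a BOM, the old code hands back the next raw byte from the underlying stream, so the lead byte of a second BOM is never decoded and its continuation bytes are then rejected.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BomSkipSketch {
    /** Simplified UTF-8 decoder (hypothetical; not Resin's implementation). */
    static final class SketchReader {
        private final InputStream is;
        private final boolean decodeAfterBom;

        SketchReader(InputStream is, boolean decodeAfterBom) {
            this.is = is;
            this.decodeAfterBom = decodeAfterBom;
        }

        int read() throws IOException {
            int ch1 = is.read();
            if (ch1 < 0x80) {
                return ch1; // ASCII, or -1 at end of stream
            }
            if ((ch1 & 0xe0) == 0xc0) { // two-byte sequence
                int ch2 = is.read();
                if ((ch2 & 0xc0) != 0x80) {
                    throw new IOException("bad continuation byte");
                }
                return ((ch1 & 0x1f) << 6) | (ch2 & 0x3f);
            }
            if ((ch1 & 0xf0) == 0xe0) { // three-byte sequence
                int ch2 = is.read();
                int ch3 = is.read();
                if ((ch2 & 0xc0) != 0x80 || (ch3 & 0xc0) != 0x80) {
                    throw new IOException("bad continuation byte");
                }
                int ch = ((ch1 & 0x0f) << 12) | ((ch2 & 0x3f) << 6) | (ch3 & 0x3f);
                if (ch == 0xfeff) {
                    // BOM: skip it. Returning is.read() yields a raw, undecoded
                    // byte; calling read() decodes the next character and so
                    // also skips any further BOMs.
                    return decodeAfterBom ? read() : is.read();
                }
                return ch;
            }
            throw new IOException("unexpected lead byte 0x" + Integer.toHexString(ch1));
        }
    }

    /** Decodes all characters into a comma-separated list; appends "error" on IOException. */
    public static String decode(byte[] data, boolean decodeAfterBom) {
        SketchReader r = new SketchReader(new ByteArrayInputStream(data), decodeAfterBom);
        StringBuilder sb = new StringBuilder();
        try {
            int ch;
            while ((ch = r.read()) != -1) {
                if (sb.length() > 0) sb.append(',');
                sb.append(ch);
            }
        } catch (IOException e) {
            if (sb.length() > 0) sb.append(',');
            sb.append("error");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bomBom = {36, (byte) 0xef, (byte) 0xbb, (byte) 0xbf,
                         (byte) 0xef, (byte) 0xbb, (byte) 0xbf, 36};
        System.out.println(decode(bomBom, false)); // prints 36,239,error
        System.out.println(decode(bomBom, true));  // prints 36,36
    }
}
```

In this sketch a single BOM followed by an ASCII byte decodes the same way with either setting, which would explain why only the back-to-back case exposes the problem.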


Test case:

package shared.util;

import java.io.*;

import com.caucho.vfs.i18n.UTF8Reader;

public class BOMTest {

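    /**
     * Feeds byte sequences with the BOM in different positions to UTF8Reader.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        // The UTF-8 BOM is the bytes ef, bb, bf (239, 187, 191); 36 is '$'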
        byte[] BOM1 = {(byte)0xef, (byte)0xbb, (byte)0xbf, 36};
        byte[] BOM2 = {36, (byte)0xef, (byte)0xbb, (byte)0xbf};
        byte[] BOM3 = {36, (byte)0xef, (byte)0xbb, (byte)0xbf, 36};
        byte[] BOMBOM = {36, (byte)0xef, (byte)0xbb, (byte)0xbf, (byte)0xef, (byte)0xbb, (byte)0xbf, 36};
        
        readData(BOM1);
        readData(BOM2);
        readData(BOM3);
        readData(BOMBOM);
    }

    private static void readData(byte[] bom) {
        UTF8Reader reader = new UTF8Reader();
        Reader r = reader.create(new ByteArrayInputStream(bom), "UTF-8");
        int ch;
        try {
            // read() returns -1 at end of stream; 0 is a valid character
            while ((ch = r.read()) != -1) {
                System.out.println("Got char:" + ch);
            }
        } catch (IOException e) {
            System.out.println("Caught exception:" + e.getMessage());
        }
    }
}


Notes
(0004027)
ferg   
05-20-09 10:06   
server/1m00