ID | Category | Severity | Reproducibility | Date Submitted | Last Update
0004430 | [Hessian] | major | always | 03-09-11 03:45 | 04-05-11 07:28
Reporter | matthias-meier | View Status | public
Assigned To |
Priority | normal | Resolution | open
Status | new | Product Version | 4.0.7
Summary | 0004430: IdentityIntMap.resize(int) does not take replaced objects into account
Description |
We stumbled upon a bug in IdentityIntMap when serializing and then deserializing an object stream containing an unmodifiable Set (created with java.util.Collections.unmodifiableSet(...)) using the Hessian 2 protocol. What we got upon deserializing was: "com.caucho.hessian.client.HessianRuntimeException: com.caucho.hessian.io.HessianProtocolException: '&65535;' is an unknown code".

However, some debugging showed that the problem was actually on the serializing side. The serializer uses the helper class IdentityIntMap (mapping objects to ints) to manage references to already serialized objects. One feature is apparently that objects in the map can be "replaced" by some other object while they get serialized. (This apparently happens, for example, with unmodifiable Sets.) When an object gets replaced in the IdentityIntMap, the entry for that object is not really removed; its value is set to -1 instead. However, the size of the map is reduced by 1, so the replaced object does not "count" anymore when determining the size. (Note that the size of the map is quite important, because it is used to determine the reference values for objects which get newly inserted into the map.)

When the IdentityIntMap is getting "full", it will resize itself. The problem is that the resize algorithm does not treat map entries with a value of -1 specially. It simply rehashes each and every entry, no matter what its value is, and then uses the number of rehashed entries (including all those with value -1) as the "new" size.

Therefore, if the map contains (for example) two entries which have the value -1 and 8 other entries, its size should be 8. If I now add a ninth entry and this triggers resizing, from then on the map will claim its size to be 11 instead of the expected 9. Objects added subsequently will get reference values starting from 11.
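The size arithmetic of this example can be reproduced with a minimal sketch. It is not the Caucho source: the class `ResizeSketch`, its list-based storage, and the `skipReplaced` flag are hypothetical and only model the bookkeeping described above (value -1 for replaced entries, size used as the next reference value). Skipping replaced entries while recounting is just one conceivable way to keep the size correct.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the bookkeeping described in this report; NOT the real IdentityIntMap. */
class ResizeSketch {
    private final List<Object> keys = new ArrayList<>();
    private final List<Integer> values = new ArrayList<>();
    private int size; // number of live entries; doubles as the next reference value

    /** Adds a new object and returns the reference value assigned to it. */
    int add(Object key) {
        int ref = size;
        put(key, ref);
        return ref;
    }

    private void put(Object key, int value) {
        keys.add(key);
        values.add(value);
        size++;
    }

    /** Marks oldKey as replaced (value -1, size reduced) and gives its reference to newKey. */
    void replace(Object oldKey, Object newKey) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i) == oldKey && values.get(i) != -1) {
                int ref = values.get(i);
                values.set(i, -1);
                size--;
                put(newKey, ref);
                return;
            }
        }
    }

    /** Models a resize: the size is recomputed from the number of rehashed entries. */
    void resize(boolean skipReplaced) {
        int rehashed = 0;
        for (int value : values) {
            if (skipReplaced && value == -1) {
                continue; // assumed fix: replaced entries are not counted
            }
            rehashed++;
        }
        size = rehashed;
    }

    int size() {
        return size;
    }

    /** Eight objects, two of them replaced, then a ninth object followed by a resize. */
    static int sizeAfterResize(boolean skipReplaced) {
        ResizeSketch map = new ResizeSketch();
        Object[] objects = new Object[8];
        for (int i = 0; i < objects.length; i++) {
            objects[i] = new Object();
            map.add(objects[i]);
        }
        map.replace(objects[3], new Object());
        map.replace(objects[5], new Object());
        map.add(new Object()); // ninth live entry, size is now 9
        map.resize(skipReplaced);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("counting replaced entries: size = " + sizeAfterResize(false)); // 11
        System.out.println("skipping replaced entries: size = " + sizeAfterResize(true));  // 9
    }
}
```

With the replaced entries counted, the next object would receive reference 11; with them skipped, it would receive reference 9, which is what a sequentially numbering receiver expects.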
In other words, if I add the tenth object (which should actually get reference value 9), it will be mapped to reference value 11, and so on. This leads to "holes" in the reference values used by the serializer.

The deserializer, on the other hand, does not know anything about IdentityIntMap, entries with replaced objects, or resizing the map. It simply numbers the objects it receives sequentially, starting from zero. The result is that for all references to objects added after the resize, the deserializer will reference the wrong objects. In the example above, if the deserializer receives a reference of 11 from the serializer, that reference is actually meant to point to the tenth element. But the deserializer does not know that, and it will therefore not find the tenth element (which has reference 9 as seen by the deserializer) but rather the 12th (i.e. reference 11).

I wrote a small sample program to demonstrate the problem (see uploaded file "HessianIdentityIntMapBug.java"). The program "simulates" the process of serialization and prints out what is going on (including the resulting map at the end). This example uses an initial map capacity of 8, so the resizing is triggered quickly. The output of the program looks like this:

-----------
==== Serializing Objects ... ====
[Object-00]: has been written to the output stream
[Object-01]: has been written to the output stream
[Object-02]: has been written to the output stream
[Object-03]: has been replaced with: Replacement for [Object-03]
[Object-04]: has been written to the output stream
[Object-05]: has been replaced with: Replacement for [Object-05]
[Object-06]: has been written to the output stream
[Object-07]: has been written to the output stream
[Object-08]: has been written to the output stream
[Object-09]: has been written to the output stream
[Object-10]: has been written to the output stream
[Object-11]: has been written to the output stream
[Object-06]: has already been serialized; writing reference to 0000006 instead of serializing again
[Object-09]: has already been serialized; writing reference to 0000011 instead of serializing again
[Object-02]: has already been serialized; writing reference to 0000002 instead of serializing again
[Object-08]: has already been serialized; writing reference to 0000010 instead of serializing again
==== IdentityIntMap contents: ====
-1: Object-03
-1: Object-05
00: Object-00
01: Object-01
02: Object-02
03: Replacement for [Object-03]
04: Object-04
05: Replacement for [Object-05]
06: Object-06
07: Object-07
10: Object-08
11: Object-09
12: Object-10
13: Object-11
-----------

The "hole" in the references is easy to spot in the final IdentityIntMap contents: there is no object which is mapped to reference 8 or 9. "Object-08" has reference 10 instead of 8. (This is the point where the map got resized.) It's quite easy now to imagine what the deserializer will do.
It will receive the serialized objects and build a map of references to objects which will look like the following:

00: Object-00
01: Object-01
02: Object-02
03: Replacement for [Object-03]
04: Object-04
05: Replacement for [Object-05]
06: Object-06
07: Object-07
08: Object-08
09: Object-09
10: Object-10
11: Object-11

Note the "shifted" references for "Object-08" through "Object-11"! If the deserializer now receives the references 0000006, 0000011, 0000002 and 0000010, it will resolve these to "Object-06", "Object-11", "Object-02" and "Object-10". However, the serializer actually meant to send "Object-06", "Object-09", "Object-02" and "Object-08"! So this is where our application crashed with the mentioned exception.

As a workaround, we were able to change our code so that the unmodifiable Set is no longer unmodifiable but a simple, plain HashSet. It seems that no "replacement" takes place for these, and therefore (de)serialization now works for us in that specific case. But we use Hessian a lot in our application, and it would be a real pain in the ass to make sure that we never transfer any unmodifiable collections. :-(

I think the bug should be quite easy to fix, so it would be very nice if some future version of the Hessian 2 implementation were able to correctly resize its IdentityIntMap, so that we are able to transfer unmodifiable Sets in the future. ;-)

BTW: I see in "Product Version" that there are newer versions than 4.0.7. So maybe the bug is already fixed in one of these? However, I could not find an existing bug report which matches this issue, and neither could I find these newer versions for download. So I could not test whether the bug has already been fixed in one of them. (We simply used the newest version from <http://hessian.caucho.com/index.xtp#Java>, which seems to be 4.0.7 at the moment.)
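The mismatch between the two sides can also be shown with a short sketch. It assumes only what is stated in this report: the receiver numbers objects in arrival order, and the sender emitted the references 6, 11, 2 and 10 from the sample output. The class `ReferenceShiftSketch` and its methods are hypothetical and not part of Hessian.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of how sequential numbering resolves the shifted references. */
class ReferenceShiftSketch {
    /** The twelve objects in the order they arrive; index == reference as seen by the receiver. */
    static List<String> receivedObjects() {
        List<String> received = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            String name = String.format("Object-%02d", i);
            // Objects 3 and 5 were replaced on the sending side in the sample run.
            received.add(i == 3 || i == 5 ? "Replacement for [" + name + "]" : name);
        }
        return received;
    }

    /** Resolves a back-reference purely by arrival order. */
    static String resolve(List<String> received, int ref) {
        return received.get(ref);
    }

    public static void main(String[] args) {
        List<String> received = receivedObjects();
        int[] sentRefs = {6, 11, 2, 10}; // references from the sample output
        String[] intended = {"Object-06", "Object-09", "Object-02", "Object-08"};
        for (int i = 0; i < sentRefs.length; i++) {
            System.out.printf("ref %d: intended %s, resolved %s%n",
                    sentRefs[i], intended[i], resolve(received, sentRefs[i]));
        }
    }
}
```

References 6 and 2 resolve as intended because those objects were added before the resize; references 11 and 10 resolve to "Object-11" and "Object-10" instead of "Object-09" and "Object-08".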
Additional Information |
Attached Files | HessianIdentityIntMapBug.java (5,155 bytes) 03-09-11 03:45