Mantis - Resin
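ID: 2368
Severity: crash
Reproducibility: always
Date Submitted: 01-25-08 04:44
Last Update: 06-11-08 18:53
Reporter: resinossi
Assigned To: ferg
Priority: normal
Status: closed
Product Version: 3.0.24
Resolution: fixed
Projection: none
ETA: none
Fixed in Version: 3.2.0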
0002368: Whenever heapdump is taken, resin cluster breaks down
Whenever we take a heap dump with the command
jdk1.5.0_14/bin/jmap -heap:format=b PID

the whole Resin cluster (2-6 instances) becomes unresponsive and unusable.
Quite often, all Resin instances in the cluster need to be restarted before the site serves requests again.

If the Apache servers running mod_caucho are restarted, the site works for a few minutes but then breaks again.

The number of active threads in each Resin instance also increases sharply while the heap dump is being taken.

Notes
(0002924)
ferg   
03-27-08 10:17   
This is a difficult issue.

A heap dump freezes the JVM for the time that it takes to take the heap dump. If the heap is large, this can be a significant period of time.

For the cluster issue, we may need to change the messaging/threading behavior for writing backup copies. Currently, Resin uses the request thread to write the session backups, which can cause problems when a backup server goes down. Any updates to that backup model will need to wait for the 3.2.x branch (after 3.1.6).
(0003177)
ferg   
06-11-08 18:53   
In 3.2, the sessions are now backed up using a separate thread, so even if a backend server is unresponsive, the primary servers will not need to wait.
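
To illustrate the difference between the two backup models discussed in these notes, here is a minimal sketch in plain Java. It is not Resin's actual code: `BackupStore`, `handleRequestSync`, and `handleRequestAsync` are hypothetical names, and a slow `Thread.sleep` stands in for a backup server that is frozen by a heap dump. The sketch only shows why handing the write to a separate thread keeps the request thread from blocking.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SessionBackupSketch {
    /** Hypothetical stand-in for a backup server; not a Resin API. */
    interface BackupStore {
        void write(String sessionId, byte[] data) throws InterruptedException;
    }

    /** Old model: the request thread writes the backup itself and blocks until it completes. */
    static long handleRequestSync(BackupStore store, String id, byte[] data)
            throws InterruptedException {
        long start = System.nanoTime();
        store.write(id, data);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** New model: the request thread hands the write to a separate thread and returns at once. */
    static long handleRequestAsync(ExecutorService backupExecutor, BackupStore store,
                                   String id, byte[] data) {
        long start = System.nanoTime();
        backupExecutor.execute(() -> {
            try {
                store.write(id, data);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so the executor can shut down cleanly.
                Thread.currentThread().interrupt();
            }
        });
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        // Simulates a backup server that takes half a second to respond.
        BackupStore slowStore = (id, data) -> Thread.sleep(500);
        ExecutorService backupExecutor = Executors.newSingleThreadExecutor();

        long syncMs = handleRequestSync(slowStore, "session-1", new byte[0]);
        long asyncMs = handleRequestAsync(backupExecutor, slowStore, "session-1", new byte[0]);
        System.out.println("request thread blocked (sync):  " + syncMs + " ms");
        System.out.println("request thread blocked (async): " + asyncMs + " ms");

        backupExecutor.shutdown();
        backupExecutor.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

In the synchronous case every request thread waits on the unresponsive server, which is consistent with the growing count of active threads reported above. Note that `Executors.newSingleThreadExecutor()` uses an unbounded queue, so a real implementation would presumably also need to bound or coalesce pending backups.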