Mantis - Resin
Viewing Issue Advanced Details
1766 minor always 05-29-07 09:43 06-21-07 13:03
ferg  
ferg  
normal  
closed 3.1.1  
fixed  
none    
none 3.1.2  
0001766: Load-Balancer detection of frozen server
(rep by Kevin MacClay)

We are experiencing a problem where the Resin front-end load balancer is failing to detect that one of the back-end Resin instances is not processing requests. The problem is that the back-end Resin instance is accepting connections but does not actually process the request. Since the front-end LB does not mark the back-end instance down, it continues directing traffic there. Users who are directed to the "bad" instance are frozen up.
 
We are currently using Resin 3.0.14 for this application. I noticed Resin 3.0.20 has a "client-connect-timeout" enhancement to the LB, but I'm not sure that this would help either, since a connection can still be established to the back-end instance; it just hangs after that.
 
In addition, we would also like to create a Nagios monitoring script that would be able to notify us if a back-end instance enters this state. Do you have any recommendations on how to test whether a back-end "srun" instance is accepting _and handling_ requests?
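
One possible shape for such a check, offered as a minimal sketch rather than a recommendation from the Resin developers: send a real request and require a response within a deadline, so that an instance which accepts the connection but never answers is reported as down. The sketch assumes the back-end instance also exposes a plain HTTP port (the srun port itself is not assumed to speak HTTP), and the function names, the default path "/" and the 5-second timeout are illustrative choices, not Resin or Nagios settings. The only Nagios convention relied on is the plugin exit status: 0 for OK, 2 for CRITICAL, 3 for UNKNOWN.

```python
import socket
import sys

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_backend(host, port, path="/", timeout=5.0):
    """Probe one back-end over HTTP and return (exit_code, message)."""
    request = (
        f"GET {path} HTTP/1.0\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    ).encode("ascii")
    try:
        # The timeout applies to the connect and to each later send/recv,
        # so a server that accepts but never replies fails on recv().
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            status_line = sock.recv(1024).split(b"\r\n", 1)[0]
    except socket.timeout:
        return CRITICAL, (
            f"CRITICAL - {host}:{port} did not connect or respond "
            f"within {timeout}s"
        )
    except OSError as exc:
        return CRITICAL, f"CRITICAL - cannot reach {host}:{port}: {exc}"

    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/") or not parts[1].isdigit():
        return CRITICAL, f"CRITICAL - malformed response from {host}:{port}"
    status = int(parts[1])
    if status >= 400:
        return CRITICAL, f"CRITICAL - {host}:{port} returned HTTP {status}"
    return OK, f"OK - {host}:{port} returned HTTP {status}"


def main(argv):
    """Command-line wrapper: prints one status line, returns the exit code."""
    if len(argv) < 3:
        print(f"UNKNOWN - usage: {argv[0]} HOST PORT [PATH]")
        return UNKNOWN
    path = argv[3] if len(argv) > 3 else "/"
    code, message = check_backend(argv[1], int(argv[2]), path)
    print(message)
    return code

# To run this as a plugin, end the file with: sys.exit(main(sys.argv))
```

Pointing the check at a page that exercises the application (rather than a static file) would make it more likely to catch an instance whose request threads are stuck. The key design point is the read deadline: a connect-only test would pass against an instance in the state described above.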
 

There are no notes attached to this issue.