Mantis - Resin
2410 minor always 02-07-08 07:24 02-13-08 09:35
closed 3.1.4  
none 3.1.5  
0002410: watchdog issues
(rep by Daniel Lopez)

Ever since I migrated from Resin 3.0 to 3.1 and its watchdog process, I
have been having various issues. They might be caused by my setup, but
they are not consistent, hence my question here.

I have 6 different Resin instances on my app server, which I use to
distribute the different applications so that maintenance on one instance
does not "bother" the applications in another instance. Up to 3.0, that
basically meant having different resin.conf files and pointing the
startup script to the appropriate file. So far so good.

However, since the upgrade to 3.1, it seems that when you start up the
first instance, one watchdog process is created and then "handles" all
the different instances. If you start a different instance, even if you
use a different port for the watchdog process, it "connects" to the
already created process and is now being handled by THE ONE.
It would not be a real problem if it worked fine, but sometimes, when
trying to stop/start an instance independently, I get messages that the
instance is "already running" even though it is not: the port is not
responding and I can confirm from the console that the process has been
killed. Sometimes stopping an instance means killing its process, as
instances sometimes do not want to die on their own, but this worked
fine up to 3.0.
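
One way to confirm that an instance's port is really not responding is a plain TCP connect with a timeout. The following is a minimal sketch only; the class name PortProbe and the method isListening are made up for illustration and are not part of Resin.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper for illustration; not part of Resin. */
public class PortProbe {
    /**
     * Returns true if a TCP connection to host:port succeeds within
     * timeoutMillis, false if it is refused, times out, or otherwise fails.
     */
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unreachable host, etc.
            return false;
        }
    }
}
```

A false result only shows that nothing accepts connections on that port; whether the O.S. process is gone still has to be checked separately.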

The only solution in those cases seems to be killing all instances AND
the watchdog process and starting them all over again, which defeats the
whole purpose of having separate instances.

02-12-08 08:04   
The situation in which it happens goes like this: a database has gone
wild and causes some web apps in a JVM to get stuck. I try to stop the
Resin instance but it does not stop, as I can see the O.S. process still
lying there. I kill it hard with kill -9 and the O.S. process is gone.
After that, I try to start the instance and the watchdog process refuses,
saying that 'server X is already running'. From then on I am unable to
start the server unless I kill all the instances and the watchdog and
start them all over again.
02-13-08 09:35   
Refactored the watchdog so that "kill" avoids the stuck watchdog instance state.
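
The stuck state described in this issue can be pictured as a bookkeeping entry that outlives the process it refers to. The sketch below is not Resin's watchdog code; it only illustrates the general idea of re-checking liveness before answering "already running". The class InstanceRegistry and its methods are hypothetical, and ProcessHandle requires Java 9 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongPredicate;

/** Hypothetical sketch of stale-entry handling; not Resin's implementation. */
public class InstanceRegistry {
    // server id -> pid of the child process believed to be running
    private final Map<String, Long> running = new HashMap<>();
    // liveness check, injectable so the logic can be exercised without real processes
    private final LongPredicate isAlive;

    public InstanceRegistry(LongPredicate isAlive) {
        this.isAlive = isAlive;
    }

    /** Liveness check backed by the operating system (Java 9+). */
    public static boolean osProcessAlive(long pid) {
        return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
    }

    /**
     * Records newPid for serverId unless a live process is already recorded.
     * An entry whose process is gone (for example after kill -9) is treated
     * as stale and overwritten instead of blocking the start.
     */
    public synchronized boolean tryRegister(String serverId, long newPid) {
        Long oldPid = running.get(serverId);
        if (oldPid != null && isAlive.test(oldPid)) {
            return false; // genuinely already running
        }
        running.put(serverId, newPid);
        return true;
    }
}
```

In real use the registry would be built as new InstanceRegistry(InstanceRegistry::osProcessAlive); a fake predicate makes the stale-entry path easy to test.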