Mantis Bugtracker

ID: 0003528
Category: [Resin]
Severity: minor
Reproducibility: always
Date Submitted: 05-21-09 18:23
Last Update: 06-08-09 19:09
Reporter: ferg
View Status: public
Assigned To: ferg
Priority: normal
Resolution: fixed
Status: closed
Product Version: 3.1.9
Summary: 0003528: watchdog/httpd issues
Description (reported by Rob Lockstone)

This looks like another manifestation of the same issue: the watchdog process does not properly shut things down before a new httpd.exe process starts up. I've been experimenting with this, watching the Windows Task Manager on the machine running Resin while I remotely (although it could be done locally as well) stop/query/start the Resin Windows service using the service controller (sc) command.

The problem appears to be that the Windows service only cares about the httpd.exe process, but the watchdog process is embedded within Java itself and runs within the javaw process. There doesn't seem to be any communication between httpd and Java during shutdowns. Interestingly, a java.exe process is spawned whenever httpd.exe is started or stopped, but that process is very short-lived.

Here's what I see happening:

1. sc \\machine stop resin

The httpd.exe process exits pretty quickly, within a second or two.

2. sc \\machine query resin

Once the httpd.exe process exits, the query returns "STOPPED" because, as far as the service controller is concerned, the service is stopped: it is tied only to the httpd.exe process.

3. Meanwhile, depending on what Java is doing, the javaw.exe process (actually two of them, since one is the resin-admin process) continues to run. In the case of the Thread.sleep(300000); JSP page, the javaw.exe processes continue to run for up to 30 seconds before they finally disappear.

4. If the sc \\machine start resin command is issued while the javaw.exe processes are still running, the new httpd.exe process will start but won't be able to actually start java due to the watchdog IllegalStateException, as noted here in the watchdog-manager.log:

[2009/05/21 16:15:49.182] java.lang.IllegalStateException: Can't start new task because of old task 'WatchdogTask[Watchdog[]]'
[2009/05/21 16:15:49.182] at com.caucho.boot.Watchdog.start(
[2009/05/21 16:15:49.182] at com.caucho.boot.WatchdogManager.startServer(
[2009/05/21 16:15:49.182] at com.caucho.boot.WatchdogServlet.start(
[2009/05/21 16:15:49.182] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2009/05/21 16:15:49.182] at sun.reflect.NativeMethodAccessorImpl.invoke(
[2009/05/21 16:15:49.182] at sun.reflect.DelegatingMethodAccessorImpl.invoke(
[2009/05/21 16:15:49.182] at java.lang.reflect.Method.invoke(
[2009/05/21 16:15:49.182] at com.caucho.hessian.server.HessianSkeleton.invoke(
[2009/05/21 16:15:49.182] at com.caucho.hessian.server.HessianSkeleton.invoke(
[2009/05/21 16:15:49.182] at com.caucho.hessian.server.HessianServlet.service(
[2009/05/21 16:15:49.182] at com.caucho.server.dispatch.ServletFilterChain.doFilter(
[2009/05/21 16:15:49.182] at com.caucho.server.webapp.WebAppFilterChain.doFilter(
[2009/05/21 16:15:49.182] at com.caucho.server.dispatch.ServletInvocation.service(
[2009/05/21 16:15:49.182] at com.caucho.server.hmux.HmuxRequest.handleRequest(
[2009/05/21 16:15:49.182] at
[2009/05/21 16:15:49.182] at com.caucho.util.ThreadPool$Item.runTasks(
[2009/05/21 16:15:49.182] at com.caucho.util.ThreadPool$
[2009/05/21 16:15:49.182] at
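
The sequence in steps 1 to 4 suggests a guard for a restart script: treat the service as restartable only when the service controller reports STOPPED and no javaw.exe is left over. The sketch below only parses command output; the function names are hypothetical, and it assumes the text comes from `sc \\machine query resin` and from `tasklist /S machine /FI "IMAGENAME eq javaw.exe" /FO CSV /NH`. It also assumes every javaw.exe on the machine belongs to Resin, which may not hold if other Java applications run there.

```python
import csv
import io
import re


def service_state(sc_query_output):
    """Return the state name (e.g. 'STOPPED') from `sc query` output, or None."""
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_query_output, re.MULTILINE)
    return match.group(1) if match else None


def running_pids(tasklist_csv, image_name):
    """Return PIDs of processes named image_name from `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Blank lines and the "INFO: No tasks are running..." message
        # parse as rows with fewer than two fields, so they are skipped.
        if len(row) >= 2 and row[0].lower() == image_name.lower():
            pids.append(int(row[1]))
    return pids


def safe_to_start(sc_query_output, tasklist_csv):
    """True only if the service is STOPPED and no javaw.exe is left over."""
    return (service_state(sc_query_output) == "STOPPED"
            and not running_pids(tasklist_csv, "javaw.exe"))
```

With this check, the situation in step 4 (service STOPPED, javaw.exe still listed) yields False, so the script would hold off on `sc \\machine start resin`.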


If I set the shutdown-max-wait timeout to a very low value, say 3s, then my experiments show that the javaw.exe processes exit within 5 seconds of the httpd.exe process going away. So I'm going to build a 10s delay into my shutdown/restart routine. Can you tell me if there are any inherent problems in setting the shutdown time to such a low value?
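
As an alternative to a fixed delay, a restart routine could poll for the condition it actually cares about and give up after a deadline. This is a minimal sketch, not part of Resin; `wait_until` and its parameters are hypothetical names, and the predicate is assumed to be some check such as "no javaw.exe process is listed any more".

```python
import time


def wait_until(predicate, timeout, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or timeout seconds elapse.

    Returns True if the predicate succeeded, False on timeout.
    clock and sleep are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Compared with a fixed 10s sleep, this returns as soon as the processes are gone and reports a timeout explicitly when they are not, so the script can decide whether to start the service anyway.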

I think the real solution has to involve setting up some kind of communication channel between httpd.exe and the watchdog so that httpd.exe doesn't exit until the javaw process(es) really exit. As it stands now, httpd.exe seems to be completely disconnected from the running javaw processes. Yes, the watchdog does eventually stop java from running, but httpd is allowed to exit independent of the watchdog.
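
The behaviour asked for here, a launcher that does not exit until its Java child has really exited, can be sketched in a few lines. This is only an illustration of the idea with Python's `subprocess` module, not how httpd.exe is implemented; `stop_and_wait` and `grace_seconds` are hypothetical names.

```python
import subprocess
import sys


def stop_and_wait(child, grace_seconds):
    """Ask child to terminate and block until it has really exited.

    Returns the child's exit code. If the child is still alive after
    grace_seconds, it is killed so the caller never returns early.
    """
    child.terminate()
    try:
        return child.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        child.kill()
        return child.wait()
```

A launcher built this way would call `stop_and_wait` on a `subprocess.Popen` object from its own stop handler and only then report itself as stopped, so a service query could not return STOPPED while the child is still running.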

Additional Information
Attached Files

- Relationships

- Notes
06-08-09 19:09

Changed Resin/watchdog communication to use a BAM/HMTP service. The watchdog now sends a stop message to Resin and waits for the result.
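
The "send a stop message and wait for the result" pattern can be illustrated with an in-process stand-in. This sketch does not use BAM/HMTP and does not reflect Resin's actual classes; it uses Python queues and a thread, and the names `server_loop` and `request_stop` are hypothetical.

```python
import queue
import threading


def server_loop(inbox, shutdown_work):
    """Stand-in for the Resin side: handle messages until told to stop."""
    while True:
        command, reply_to = inbox.get()
        if command == "stop":
            shutdown_work()          # e.g. let in-flight requests finish
            reply_to.put("stopped")  # acknowledge only after shutdown is done
            return


def request_stop(inbox, timeout):
    """Stand-in for the watchdog side: send stop and wait for the result."""
    reply_to = queue.Queue()
    inbox.put(("stop", reply_to))
    try:
        return reply_to.get(timeout=timeout)
    except queue.Empty:
        return "timeout"
```

The point of the acknowledgement is that the sender learns when shutdown has completed, or that it has not completed within the timeout, instead of assuming the other side is gone.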

- Issue History
Date Modified Username Field Change
05-21-09 18:23 ferg New Issue
06-08-09 19:09 ferg Note Added: 0004067
06-08-09 19:09 ferg Assigned To  => ferg
06-08-09 19:09 ferg Status new => closed
06-08-09 19:09 ferg Resolution open => fixed
06-08-09 19:09 ferg Fixed in Version  => 4.0.1
