Mantis Bugtracker
  

ID: 0000309
Category: [Resin]
Severity: minor
Reproducibility: always
Date Submitted: 07-13-05 00:00
Last Update: 11-30-05 14:43
Reporter: stefanp
View Status: public
Assigned To:
Priority: immediate
Resolution: fixed
Platform:
Status: closed
OS:
Projection: none
OS Version:
ETA: none
Fixed in Version: 2.1.x
Product Version: 2.1.x
Product Build: 2.1.x
Summary 0000309: ServletServer.restart(): bad synchronization leads to broken server
Description RSN-352
Hi,

This bug causes Resin to end up in an unconfigured state if a class change is made while the server is under load, even light load.

The visible effect is that some vhosts cannot be found: Resin starts serving the contents of its RESIN_HOME and returns 404 errors for everything else.

To reproduce, start two simple shell scripts that request a page in a loop, and print ok or error depending on status (200/404). Cause a restart (in my case I change the .class file of a global resource). Prior to 2.1.14 (and in 2.1.14 without the problematic ServletServer patch), resin will lock for a bit, eventually serve a few 500 errors, but will end up serving the good files once the restart is complete. In 2.1.14, resin will start serving 404 errors.
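The request loop described above could also be written as a small Java probe instead of a shell script. This is only a sketch: the class name StatusProbe, the default URL, the timeouts and the 100 ms pause are assumptions chosen for illustration, not part of the original report.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class StatusProbe {
    // Maps an HTTP status to the "ok"/"error" output used in the report.
    static String classify(int status) {
        return status == 200 ? "ok" : "error (" + status + ")";
    }

    // Issues one GET request and returns the HTTP status code.
    static int fetchStatus(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(60000); // the server may block for a while during a restart
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical default; pass the URL of a page on an affected vhost.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/";
        while (true) {
            try {
                System.out.println(classify(fetchStatus(url)));
            } catch (IOException e) {
                System.out.println("error (" + e.getMessage() + ")");
            }
            Thread.sleep(100);
        }
    }
}
```

Running two copies against the same server while triggering a restart approximates the two-script setup above.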

(Some Vhost configuration seems to disappear in the process:
  [13:11:27.799] java.lang.NullPointerException
        at com.caucho.server.http.ServletServer.getHost(ServletServer.java:1362)
        at com.caucho.server.http.ServletServer.getHost(ServletServer.java:1339)
        at com.caucho.server.http.ServletServer.getInvocation(ServletServer.java:1244)
        at com.caucho.server.http.HttpRequest.handleRequest(HttpRequest.java:250)
        at com.caucho.server.http.HttpRequest.handleConnection(HttpRequest.java:170)
        at com.caucho.server.TcpConnection.run(TcpConnection.java:139)
        at java.lang.Thread.run(Thread.java:595)
)

Note that in our case the problem is easy to reproduce because restarting takes a long time (approx. 30 s): 15 vhosts with 4 web apps each, many resources in each vhost that need DB interaction to initialize, and so on. The chance of two requests entering restart() in parallel is therefore higher than with a setup that restarts more quickly.


This bug appeared in 2.1.14, and is due to the following change:

diff -r resin-2.1.13/src/com/caucho/server/http/ServletServer.java resin-2.1.14/src/com/caucho/server/http/ServletServer.java
1888c1888
< synchronized void restart()
---
> void restart()
1890,1891c1890,1892
< if (! isModifiedFull() || ! _isInitComplete || _isInitStarted)
< return;
---
> synchronized (this) {
> if (! isModifiedFull() || ! _isInitComplete || _isInitStarted)
> return;
1893,1895c1894,1897
< _isInitStarted = true;
< _isInitComplete = false;
< _configException = null;
---
> _isInitStarted = true;
> _isInitComplete = false;
> _configException = null;
> }

If the restart method is made synchronized again, the bug disappears.

Stefan
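One plausible reading of the diff above is that, once only the guard and the flag updates hold the lock, a second request thread returns from restart() immediately and goes on to resolve its vhost while the first thread is still rebuilding the configuration. The sketch below illustrates that window under stated assumptions: the class, its fields, the host map and the latches are hypothetical stand-ins, not Resin's actual code, and the sketch shows only the transient window, not the permanently broken state reported above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

public class RestartRaceSketch {
    // Hypothetical stand-ins for ServletServer state; not Resin's real fields.
    private volatile Map<String, String> hosts = loadHosts();
    private boolean modified = true;
    private boolean initStarted = false;
    private boolean initComplete = true;

    // Demo hooks that freeze the restart halfway through.
    final CountDownLatch tornDown = new CountDownLatch(1);
    final CountDownLatch mayFinish = new CountDownLatch(1);

    static Map<String, String> loadHosts() {
        Map<String, String> m = new HashMap<>();
        m.put("example.test", "/var/www/example");
        return m;
    }

    static void awaitLatch(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same shape as the 2.1.14 change: only the guard and flags hold the lock.
    void restart() {
        synchronized (this) {
            if (!modified || !initComplete || initStarted)
                return;
            initStarted = true;
            initComplete = false;
            modified = false;
        }
        hosts = null;            // old configuration is torn down
        tornDown.countDown();
        awaitLatch(mayFinish);   // stands in for the slow re-initialization
        hosts = loadHosts();
        synchronized (this) {
            initStarted = false;
            initComplete = true;
        }
    }

    // A request thread: checks for a restart, then resolves the vhost.
    String handleRequest(String hostName) {
        restart();               // returns at once if another thread is restarting
        Map<String, String> current = hosts;
        return current == null ? "404" : "200 " + current.get(hostName);
    }

    static String[] demo() {
        RestartRaceSketch server = new RestartRaceSketch();
        Thread restarter = new Thread(server::restart);
        restarter.start();
        awaitLatch(server.tornDown);                 // restart is now mid-flight
        String during = server.handleRequest("example.test");
        server.mayFinish.countDown();
        try {
            restarter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        String after = server.handleRequest("example.test");
        return new String[] { during, after };
    }

    public static void main(String[] args) {
        for (String line : demo())
            System.out.println(line);
    }
}
```

In this sketch the request made during the restart sees no host configuration, while the one made after it succeeds. Declaring restart() as synchronized would make the second thread wait instead, which matches the pre-2.1.14 behaviour of locking for a bit and then serving the correct files.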
Steps To Reproduce
Additional Information Linux. Pretty complex configuration, lots of vhosts and webapps.
Attached Files

- Relationships

- Notes
(0000350)
ferg
07-13-05 00:00

Fixed in the snapshot. The proposed solution of setting the method as synchronized doesn't work because that reintroduces a deadlock.
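The note does not say how the snapshot fixes the problem. Purely as an illustration of one general way to avoid both holding a lock for the whole restart and exposing half-built state, the sketch below builds the new configuration off to the side and publishes it with a single atomic write. Every name here (SwapOnRestartSketch, loadHosts, the host map) is hypothetical, and this is not claimed to be the change made in Resin.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class SwapOnRestartSketch {
    // Readers always see a complete, immutable host map.
    private final AtomicReference<Map<String, String>> hosts =
        new AtomicReference<>(loadHosts(1));
    private final AtomicBoolean restarting = new AtomicBoolean(false);

    // Hypothetical loader; the generation number just makes versions visible.
    static Map<String, String> loadHosts(int generation) {
        Map<String, String> m = new HashMap<>();
        m.put("example.test", "/var/www/example-v" + generation);
        return Collections.unmodifiableMap(m);
    }

    // Returns false if another restart is already in progress.
    boolean restart(Supplier<Map<String, String>> loader) {
        if (!restarting.compareAndSet(false, true))
            return false;
        try {
            Map<String, String> fresh = loader.get(); // slow part, no lock held
            hosts.set(fresh);                         // single atomic publish
            return true;
        } finally {
            restarting.set(false);
        }
    }

    String lookup(String hostName) {
        return hosts.get().get(hostName);
    }

    static String[] demo() {
        SwapOnRestartSketch server = new SwapOnRestartSketch();
        String[] seen = new String[3];
        server.restart(() -> {
            // While the new configuration is being built, the old one is still served
            seen[0] = server.lookup("example.test");
            // and a second restart attempt is turned away without blocking.
            seen[1] = String.valueOf(server.restart(() -> loadHosts(99)));
            return loadHosts(2);
        });
        seen[2] = server.lookup("example.test");
        return seen;
    }

    public static void main(String[] args) {
        for (String line : demo())
            System.out.println(line);
    }
}
```

The trade-off in this approach is that requests arriving during a restart are served from the old configuration rather than waiting for the new one.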
 

- Issue History
Date Modified Username Field Change
07-13-05 00:00 stefanp New Issue
11-30-05 00:00 administrator Fixed in Version  => 2.1.x
11-30-05 14:43 ferg Status resolved => closed

