Mantis Bugtracker
  

Viewing Issue Advanced Details
ID: 0005286
Category: [Resin]
Severity: block
Reproducibility: random
Date Submitted: 11-21-12 02:33
Last Update: 01-09-13 12:10
Reporter: paul
View Status: public
Assigned To: ferg
Priority: normal
Resolution: fixed
Status: closed
Projection: none
ETA: none
Fixed in Version: 4.0.32
Product Version: 3.1.9
Platform, OS, OS Version, Product Build: (blank)
Summary: 0005286: Threads can become 'stuck' forever
Description: On several occasions the thread count on one server (in a three-server cluster) has become 'stuck'. The resin.log shows this error message:

[2012-11-09 04:56:41.419] java.io.IOException: failed to add EPOLL for fd=512
[2012-11-09 04:56:41.419]
[2012-11-09 04:56:41.419] at com.caucho.server.port.JniSelectManager.removeNative(Native Method)
[2012-11-09 04:56:41.419] at com.caucho.server.port.JniSelectManager.remove(JniSelectManager.java:476)
[2012-11-09 04:56:41.419] at com.caucho.server.port.JniSelectManager.run(JniSelectManager.java:376)
[2012-11-09 04:56:41.419] at java.lang.Thread.run(Thread.java:662)


A thread dump taken at that point shows many threads (sometimes up to 200), all with a stack trace similar to:

"hmux-192.168.0.3:6800-20399$1859226681" - Thread t@40
   java.lang.Thread.State: RUNNABLE
    at com.caucho.server.port.JniSelectManager.addNative(Native Method)
    at com.caucho.server.port.JniSelectManager.keepalive(JniSelectManager.java:229)
    at com.caucho.server.port.TcpConnection.keepalive(TcpConnection.java:448)
    at com.caucho.server.port.TcpConnection.run(TcpConnection.java:739)
    at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:743)
    at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:662)
    at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
    - None

Looking at the native code, my guess is that the failure causes the method to exit without releasing the mutex it acquired, which leaves every subsequent caller stuck waiting for that same mutex.
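To make the suspected failure mode concrete, here is a minimal C sketch of the pattern. It is not the Resin JNI source: register_fd, add_fd_leaky, add_fd_safe and mutex_is_held are hypothetical names invented for illustration, and a negative fd merely simulates the failed EPOLL registration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the native epoll registration.
 * A negative fd simulates the "failed to add EPOLL" error. */
static int register_fd(int fd)
{
    return fd < 0 ? -1 : 0;
}

/* Suspected pattern: the error path returns while still holding the mutex. */
int add_fd_leaky(pthread_mutex_t *lock, int fd)
{
    pthread_mutex_lock(lock);
    if (register_fd(fd) < 0)
        return -1;              /* early exit, mutex is never released */
    pthread_mutex_unlock(lock);
    return 0;
}

/* Same logic with a single exit point, so every path unlocks. */
int add_fd_safe(pthread_mutex_t *lock, int fd)
{
    int result;

    pthread_mutex_lock(lock);
    result = register_fd(fd) < 0 ? -1 : 0;
    pthread_mutex_unlock(lock);
    return result;
}

/* Returns 1 if the (non-recursive) mutex is currently held, 0 if it is free. */
int mutex_is_held(pthread_mutex_t *lock)
{
    int rc;

    if (pthread_mutex_trylock(lock) != 0)
        return 1;               /* EBUSY: the mutex is still locked */
    rc = pthread_mutex_unlock(lock);
    assert(rc == 0);
    (void)rc;
    return 0;
}
```

If the native code follows the leaky shape, a single failed registration leaves the mutex locked, and every later caller of pthread_mutex_lock on it blocks indefinitely. That would be consistent with the thread dump above: the JVM reports a thread blocked inside a native method as RUNNABLE, so threads waiting on a native mutex would appear RUNNABLE in addNative.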
Steps To Reproduce
Additional Information: I checked the JNI source for 3.1.9Pro, 3.1.12Pro and 3.1.13Pro, and all three appear to be the same.

There are no notes attached to this issue.

Issue History
Date Modified    Username  Field             Change
11-21-12 02:33   paul      New Issue
01-09-13 12:10   ferg      Assigned To        => ferg
01-09-13 12:10   ferg      Status            new => closed
01-09-13 12:10   ferg      Resolution        open => fixed
01-09-13 12:10   ferg      Fixed in Version   => 4.0.32

