|ID||Category||Severity||Reproducibility||Date Submitted||Last Update|
|0004708||[Resin]||major||random||08-11-11 05:00||02-27-13 16:15|
|Summary||0004708: sticky session|
We are using mod_caucho on Debian lenny (amd64) with Apache 2.2. Lately we have noticed that our sessions no longer appear to be sticky to one of our Resin instances. We tested first with 4.0.18 and now with 4.0.20, but the behaviour didn't change: from time to time the session is still switched to the other Resin instance for no obvious reason.
How do I get more debugging info here?
Is there anything else we can provide? As a long-time paying customer, we are quite frustrated with our production system being extremely fragile due to random server switches.
Session replication as a workaround is impractical due to different physical locations.
When filing a bug as a paying customer, please also send a mail to the support address so we can increase the priority. Otherwise the bug report looks like an open-source report.
mod_caucho should only fail over if it cannot connect to the backend Resin instance. (The mod_caucho code hasn't changed in a long while.)
Can you send the server and load-balance timeout parameters? mod_caucho reads those from the backend server to see what values to use for a timeout.
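The failover rule described in this note (stay on the session's own backend, and move to another one only when a connection cannot be made) can be sketched roughly as follows. This is an illustration in Python, not mod_caucho's actual code. The function names, the `can_connect` callback, and the idea that the first character of the session id encodes the owning server's index are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence


def owner_index(session_id: str, server_count: int) -> int:
    """Map a session id to the index of its owning backend.

    Assumption for this sketch: the first character of the session id
    encodes the server index ('a' -> 0, 'b' -> 1, ...).
    """
    if not session_id:
        raise ValueError("empty session id")
    return (ord(session_id[0]) - ord("a")) % server_count


def pick_backend(
    session_id: str,
    backends: Sequence[str],
    can_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the backend for a request, failing over only on connect failure."""
    owner = owner_index(session_id, len(backends))
    # Try the owning server first, then the remaining servers in ring order.
    for offset in range(len(backends)):
        candidate = backends[(owner + offset) % len(backends)]
        if can_connect(candidate):
            return candidate
    # No backend accepted a connection.
    return None
```

Under this model, a session that hops between instances while its owner is healthy would mean that `can_connect` is intermittently failing for the owner, which is why the timeout parameters are the first thing to check.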
Checking the code, the key parameters are
mod_caucho's view of the values should be displayed in /caucho-status
Also, the /resin-admin graphs (in the "meters" tab of the summary) might show unusual netstat behavior or other glitches like a memory spike.
As a test, you might try lowering load-balance-idle-time to 30s instead of the default 60s. (And check the netstat history). That would test the possibility of a timing issue without affecting performance much.
thanks for the hints, scott.
config everywhere is
connect timeout : 5
idle time: 60
recover : 15
socket timeout : 600
frontend apache2 has
no keepalives defined in resin.
We can reproduce the behavior internally, so it does not appear to be related to high load.
If you can reproduce it in the lab, can you set the logging to "fine" or "finer" on both backend Resin instances, and mail the jvm-default logs?
BTW, it's the Resin keepalive that matters (because this is the mod_caucho to Resin link). The Apache one doesn't matter.
The http://caucho.com/resin-4.0/admin/clustering-overview.xtp page has a diagram showing the load balancing timings.
Also, the JMX for the resin:type=Port,name=XXX-6800 will show the SocketTimeout and KeepaliveTimeout.
Our load testing wasn't able to show any problems, though.
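The timing concern raised in the notes above can be expressed as a small check. The sketch below assumes (it is not confirmed in this thread) that a race becomes possible when the load balancer keeps an idle socket for at least as long as the backend's keepalive timeout, because the backend may then close a connection the balancer still considers reusable. The class and function names are hypothetical, and the backend keepalive values in the usage lines are placeholders, since the reporter did not state one.

```python
from dataclasses import dataclass


@dataclass
class BalancerTimeouts:
    """Load-balancer timeouts in seconds, as listed by the reporter."""
    connect_timeout: int
    idle_time: int
    recover_time: int
    socket_timeout: int


def idle_race_possible(lb: BalancerTimeouts, backend_keepalive_timeout: int) -> bool:
    """Return True if the balancer may hold an idle socket longer than the backend.

    Assumption: the backend closes a kept-alive connection after
    backend_keepalive_timeout seconds of inactivity.
    """
    return lb.idle_time >= backend_keepalive_timeout


# Values reported in this issue: connect 5, idle 60, recover 15, socket 600.
reported = BalancerTimeouts(connect_timeout=5, idle_time=60,
                            recover_time=15, socket_timeout=600)

# Hypothetical backend keepalive timeouts, for illustration only.
for keepalive in (30, 60, 120):
    print(keepalive, idle_race_possible(reported, keepalive))
```

Lowering the idle time to 30s, as suggested earlier in the thread, would make the check return False for any backend keepalive timeout above 30s. The actual KeepaliveTimeout should be read from the Port JMX bean mentioned above.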
|08-11-11 05:00||georgbuschbeck||New Issue|
|08-18-11 13:34||uweschaefer_||Issue Monitored: uweschaefer_|
|08-18-11 13:37||uweschaefer_||Note Added: 0005449|
|08-18-11 16:14||ferg||Note Added: 0005450|
|08-18-11 16:36||ferg||Note Added: 0005451|
|08-18-11 17:25||ferg||Note Added: 0005452|
|08-18-11 18:37||ferg||Note Added: 0005453|
|08-19-11 04:39||uweschaefer_||Note Added: 0005454|
|08-19-11 09:04||ferg||Note Added: 0005455|
|09-23-11 01:08||amukas||Issue Monitored: amukas|
|02-27-13 16:15||alex||Status||new => assigned|
|02-27-13 16:15||alex||Assigned To||=> alex|
|02-27-13 16:15||alex||Status||assigned => closed|
|02-27-13 16:15||alex||Note Added: 0006204|
|02-27-13 16:15||alex||Resolution||open => fixed|
|02-27-13 16:15||alex||Fixed in Version||=> 4.0.36|