Mantis - Resin
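ID: 5091
Severity: minor
Reproducibility: always
Date Submitted: 05-24-12 12:17
Last Update: 06-15-12 12:06
Reporter: rickHigh
Assigned To: ferg
Priority: normal
Status: closed
Resolution: fixed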
none    
none 4.0.29  
0005091: Dynamic join for app server fails for 4.0.28 snapshot
This could be a config problem.

I have a working, clustered triad (tested via remote cluster deploys and resin-admin connectivity).

I attempt to add a fourth server, which is a spoke server.

The working triad uses this user-data:

home_server : app-0
elastic_cloud_enable : true
home_cluster : app
https : 8443
admin_user : admin
admin_password : {SSHA}WITH_HELD
app_servers : ext:23.21.106.227 ext:23.21.121.216 ext:23.21.195.83
cluster_system_key : changeme890
web_admin_enable : true
remote_cli_enable : true
web_admin_external : true

Each triad member has the same config, except that home_server changes.

The spoke server has this config.

elastic_cloud_enable : true
home_cluster : app
https : 8443
admin_user : admin
admin_password : {SSHA}WITH_HELD
app_servers : ext:23.21.106.227 ext:23.21.121.216 ext:23.21.195.83
cluster_system_key : changeme890
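
The two user-data listings above differ in only a few keys. As a minimal sketch (not part of Resin; parse_user_data is a hypothetical helper, and the "key : value" parsing rule is an assumption based on the listings), the difference can be computed like this:

```python
def parse_user_data(text):
    """Parse 'key : value' lines into a dict, splitting on the first ' : '."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition(" : ")
        if sep:  # skip lines that do not look like 'key : value'
            props[key.strip()] = value.strip()
    return props


# Values copied from the listings in this report.
TRIAD = """\
home_server : app-0
elastic_cloud_enable : true
home_cluster : app
https : 8443
admin_user : admin
admin_password : {SSHA}WITH_HELD
app_servers : ext:23.21.106.227 ext:23.21.121.216 ext:23.21.195.83
cluster_system_key : changeme890
web_admin_enable : true
remote_cli_enable : true
web_admin_external : true
"""

SPOKE = """\
elastic_cloud_enable : true
home_cluster : app
https : 8443
admin_user : admin
admin_password : {SSHA}WITH_HELD
app_servers : ext:23.21.106.227 ext:23.21.121.216 ext:23.21.195.83
cluster_system_key : changeme890
"""

triad = parse_user_data(TRIAD)
spoke = parse_user_data(SPOKE)

# Keys the triad sets that the spoke omits.
missing_on_spoke = sorted(set(triad) - set(spoke))
# Keys present on both sides but with different values.
differing = sorted(k for k in set(triad) & set(spoke) if triad[k] != spoke[k])

print("missing on spoke:", missing_on_spoke)
print("differing values:", differing)
```

This only shows which settings the spoke omits; it does not establish that any of them causes the failure.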

The dynamic server does not fully join the cloud.
The triad members know about it, but can't communicate with it.
You can see the dynamic server in the drop-down list in the resin-admin tool, but if you try to open it, you get a message that it cannot connect.
(The triad members share deployments and can reach each other via resin-admin.)

Judging from the log file of the dynamic spoke server, it is not getting the internal IP addresses of the triad members; instead it gets bogus 127.0.0.2 loopback addresses.
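
To illustrate why 127.0.0.2 cannot be a reachable peer address, here is a small sketch using Python's standard ipaddress module. The classify helper is hypothetical; the addresses are the ones that appear in this report.

```python
import ipaddress


def classify(host):
    """Return 'loopback', 'private', or 'other' for an IP literal."""
    addr = ipaddress.ip_address(host)
    if addr.is_loopback:   # all of 127.0.0.0/8 for IPv4
        return "loopback"
    if addr.is_private:    # e.g. 10.0.0.0/8, the range of EC2 internal addresses here
        return "private"
    return "other"


for host in ("127.0.0.2", "10.84.197.111", "23.21.106.227"):
    print(host, "->", classify(host))
```

A loopback address always refers to the local machine, so heartbeats addressed to 127.0.0.2:6800 from the spoke never leave the spoke itself.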

Log snippet:

[12-05-24 18:40:19.270] {main} server listening to ip-10-84-197-111.ec2.internal:6830
[12-05-24 18:40:19.282] {main}
[12-05-24 18:40:24.543] {main} Repository cannot set RepositoryRoot[, 0]
[12-05-24 18:40:24.823] {main}
[12-05-24 18:40:24.823] {main} resin.home = /usr/local/share/resin-4.0.s120520/
[12-05-24 18:40:24.824] {main} resin.root = /var/www
[12-05-24 18:40:24.824] {main} resin.conf = /etc/resin/resin.xml
[12-05-24 18:40:24.825] {main}
[12-05-24 18:40:24.825] {main} server = 10.84.197.111:6830 (app:dyn-10.84.197.111:6830)
[12-05-24 18:40:24.827] {main} stage = production
[12-05-24 18:40:25.879] {main} WebApp[production/webapp/default/resin-admin] active
[12-05-24 18:40:25.926] {main} WebApp[production/webapp/default/ROOT] active
[12-05-24 18:40:26.775] {main} WebApp[production/webapp/default/resin-doc] active
[12-05-24 18:40:26.776] {main} Host[production/host/default] active
[12-05-24 18:40:26.777] {main} ProServer[id=dyn-10.84.197.111:6830,cluster=app] active
[12-05-24 18:40:26.778] {main} JNI: file, async keepalive (max=130816), socket
[12-05-24 18:40:26.786] {main}
[12-05-24 18:40:26.786] {main}
[12-05-24 18:40:26.787] {main} http listening to *:8080
[12-05-24 18:40:27.349] {main} https listening to *:8443
[12-05-24 18:40:27.363] {main}
[12-05-24 18:40:27.391] {main} HeartbeatHealthCheck[WARNING:no active heartbeat from ClusterServer[id=app-0,127.0.0.2:6800], no active heartbeat from ClusterServer[id=app-1,127.0.0.2:6800], no active heartbeat from ClusterServer[id=app-2,127.0.0.2:6800]]
[12-05-24 18:40:27.409] {main} ProResin[id=dyn-10.84.197.111:6830] started in 12586ms
[12-05-24 18:40:27.424] {resin-40} DumpJmx[] OK: JMX dump scheduled with a 60000 ms delay
[12-05-24 18:45:27.405] {resin-37} HeartbeatHealthCheck[WARNING:no active heartbeat from ClusterServer[id=app-0,127.0.0.2:6800], no active heartbeat from ClusterServer[id=app-1,127.0.0.2:6800], no active heartbeat from ClusterServer[id=app-2,127.0.0.2:6800]]


Note that the dynamic server expects heartbeats from:

app-0 127.0.0.2:6800
app-1 127.0.0.2:6800
app-2 127.0.0.2:6800

HeartbeatHealthCheck[WARNING:no active heartbeat from ClusterServer[id=app-0,127.0.0.2:6800], no active heartbeat from ClusterServer[id=app-1,127.0.0.2:6800], no active heartbeat from ClusterServer[id=app-2,127.0.0.2:6800]]

It should instead expect heartbeats from the internal IP addresses of the triad members.
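
As a rough sketch of how this symptom could be detected automatically, the following Python pulls the peer entries out of a HeartbeatHealthCheck warning and flags those with loopback addresses. The regular expression is an assumption based only on the log lines quoted in this report, and loopback_peers is a hypothetical helper, not a Resin tool.

```python
import ipaddress
import re

# Warning line copied from the log excerpt in this report.
LOG_LINE = (
    "[12-05-24 18:40:27.391] {main} HeartbeatHealthCheck[WARNING:"
    "no active heartbeat from ClusterServer[id=app-0,127.0.0.2:6800], "
    "no active heartbeat from ClusterServer[id=app-1,127.0.0.2:6800], "
    "no active heartbeat from ClusterServer[id=app-2,127.0.0.2:6800]]"
)

# Matches 'ClusterServer[id=<id>,<host>:<port>]' as it appears above.
PEER_RE = re.compile(r"ClusterServer\[id=([^,\]]+),([^:\]]+):(\d+)\]")


def loopback_peers(line):
    """Return (id, host, port) for each peer whose address is loopback."""
    result = []
    for server_id, host, port in PEER_RE.findall(line):
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            continue  # host is a name rather than an IP literal
        if addr.is_loopback:
            result.append((server_id, host, int(port)))
    return result


for server_id, host, port in loopback_peers(LOG_LINE):
    print(f"{server_id}: expected heartbeat from loopback address {host}:{port}")
```

For the quoted warning, all three triad members (app-0, app-1, app-2) are flagged.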


Notes
(0005856)
ferg   
06-15-12 12:06   
See 0005015