Mantis Bugtracker
  

Viewing Issue Simple Details Jump to Notes ] View Advanced ] Issue History ] Print ]
ID Category Severity Reproducibility Date Submitted Last Update
0006034 [Resin] minor always 01-31-17 14:30 03-03-17 13:07
Reporter ferg View Status public  
Assigned To ferg
Priority normal Resolution fixed  
Status closed   Product Version 4.0.49
Summary 0006034: issue with database corruption
Description (rep by Nikola Sepentulevski)
We have noticed resin randomly crashing and sessions dropping.

 

We run resin in a docker container which is clusters between 3 hosts.

 

Attached (console-host-3.private.net.log.13012017.tar) is a stack trace in our logs that we could see, unfortunately we miss out on raw dumps because when resin crashes the docker container is rebuilt from an image. We have syslog configured which sends the logs to a different server. At the time of the issue, memory on that server was at 75%. Attached is a graph which shows the memory usage around the time of the crash (capture.JPG)

 

A couple of hours later we noticed the following stack trace on one of our different hosts console-host-3:

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: CONFIG: [17-01-13 13:14:48.958] {resin-58013} java.lang.IllegalStateException: Table[data:3,/resin/resin-data/console-host-2.private.net/distcache/data.db]:

inode block 0x3827e000 len=8192 has invalid block code (0) expected (2)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.blob.Inode.corrupted(Inode.java:1270)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.blob.Inode.validateBlockAddr(Inode.java:946)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.blob.Inode.remove(Inode.java:865)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.table.BlobColumn.deleteData(BlobColumn.java:221)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.table.Table.delete(Table.java:1163)

Jan 13 13:14:48 console-host-2.private.et dockerd[1559]: at com.caucho.db.table.TableIterator.delete(TableIterator.java:512)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.sql.DeleteQuery.execute(DeleteQuery.java:87)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.jdbc.PreparedStatementImpl.execute(PreparedStatementImpl.java:367)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.jdbc.PreparedStatementImpl.executeUpdate(PreparedStatementImpl.java:337)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.server.distcache.DataStore$DeleteAlarm.deleteOrphans(DataStore.java:731)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.server.distcache.DataStore$DeleteAlarm.handleAlarm(DataStore.java:685)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.util.Alarm.handleAlarm(Alarm.java:523)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.util.Alarm.run(Alarm.java:495)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.env.thread2.ResinThread2.runTasks(ResinThread2.java:173)

Jan 13 13:14:48 console-host-2.private.t dockerd[1559]: at com.caucho.env.thread2.ResinThread2.run(ResinThread2.java:118)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: java.lang.IllegalStateException: Table[data:3,/resin/resin-data/console-host-2.private.netregistry.net/distcache/data.db]: inode block 0x3827e000 len=8192 has invalid block code (0) expected (2)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.blob.Inode.corrupted(Inode.java:1270)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.blob.Inode.validateBlockAddr(Inode.java:946)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.blob.Inode.remove(Inode.java:865)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.table.BlobColumn.deleteData(BlobColumn.java:221)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.table.Table.delete(Table.java:1163)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.table.TableIterator.delete(TableIterator.java:512)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.sql.DeleteQuery.execute(DeleteQuery.java:87)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.jdbc.PreparedStatementImpl.execute(PreparedStatementImpl.java:367)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.db.jdbc.PreparedStatementImpl.executeUpdate(PreparedStatementImpl.java:337)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.server.distcache.DataStore$DeleteAlarm.deleteOrphans(DataStore.java:731)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.server.distcache.DataStore$DeleteAlarm.handleAlarm(DataStore.java:685)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.util.Alarm.handleAlarm(Alarm.java:523)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.util.Alarm.run(Alarm.java:495)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.env.thread2.ResinThread2.runTasks(ResinThread2.java:173)

Jan 13 13:14:48 console-host-2.private.net dockerd[1559]: at com.caucho.env.thread2.ResinThread2.run(ResinThread2.java:118)

 

Stack trace attached as “console-host-2.private.netregistry.net.log.13012017.tar.gz” and screen shot of the memory usage at the time attached as “console-host-2.JPG”

 

We can see this happen on a regular basis. We are running the latest release of Resin.

 

Please let me know if there are further details required.

Additional Information
Attached Files

- Relationships

- Notes
(0006739)
ferg
02-06-17 12:06

[Details]
        Below is the error output in "jvm-app-0.log" starts from around 20:55 on 1st Feb. 2017.

        ---
        Table[log_data:7,c:\app\resin-pro-4.0.48\resin-data\app-0\log\log_data.db]: inode block len=34000 has non-zero offset 0x726573696e3a7479
        java.lang.IllegalStateException: Table[log_data:7,c:\app\resin-pro-4.0.48\resin-data\app-0\log\log_data.db]: inode block len=34000 has non-zero offset 0x726573696e3a7479
        [17-02-01 20:55:09.777] {resin-3619} WarningService: Shutdown: caucho.database.corrupt
        ---

        After this log output, resin restarted itself.

        We presume it restarted since the corruption of "resin-data" occured.

        Also "resin-data" keeps corrupting on next day and the day after, too.

        ---
        [17-02-02 20:01:09.557] {resin-7573} Table[log_data:7,c:\app\resin-pro-4.0.48\resin-data\app-0\log\log_data.db]: inode block len=48000 has non-zero offset 0x202d205468652064
        [17-02-03 22:07:09.646] {resin-12769} Table[log_data:7,c:\app\resin-pro-4.0.48\resin-data\app-0\log\log_data.db]: inode block len=52000 has non-zero offset 0x66696c65222c0a20
 

- Issue History
Date Modified Username Field Change
01-31-17 14:30 ferg New Issue
02-06-17 12:06 ferg Note Added: 0006739
03-03-17 13:07 ferg Assigned To  => ferg
03-03-17 13:07 ferg Status new => closed
03-03-17 13:07 ferg Resolution open => fixed
03-03-17 13:07 ferg Fixed in Version  => 4.0.51


Mantis 1.0.0rc3[^]
Copyright © 2000 - 2005 Mantis Group
28 total queries executed.
25 unique queries executed.
Powered by Mantis Bugtracker