0002707: mysql/mediawiki transaction timeout

ID: 0002707
Category: [Quercus]
Severity: minor
Reproducibility: always
Date Submitted: 05-29-08 09:08
Last Update: 05-29-08 09:22
Reporter: ferg
View Status: public
Assigned To:
Priority: normal
Resolution: open
Status: new
Product Version: 3.1.6

Summary

0002707: mysql/mediawiki transaction timeout

Description

(rep by Paul Fischer)

We keep running into sporadic database errors on certain MediaWiki pages. Most are MySQL 1213 errors (a deadlock); the rest are database timeouts. The errors seem to be related to MediaWiki's object caching and appear to occur only while data related to objcache is being deleted.

I have a few theories on what might cause this:

1. Clustering: two MediaWiki instances deadlock when both try to delete the same content.
2. A database configuration issue.
3. Logging synchronization. When the issue occurs, resin-admin often shows a lot of threads in a BLOCKED state waiting on logging code. Some synchronization in the logging path may be causing threads to block, and that waiting may be having a cascade effect on the database connection pool, since blocked threads cannot return their db connections.

Even if this issue continued to occur, we could keep the errors from being seen by always returning a 500 error. The problem is that we have a controller that delegates to Quercus/PHP, and even if PHP returns a 500 error, the resulting response is simply included in the model and then displayed in a section of a page. In other words, each page is composed of multiple requests to PHP (via a controller), which makes it hard to detect an error condition. If we were able to detect a 500 error on any of these "embedded" requests, we could ensure that a 500 is sent for the actual browser request. Since all requests come through Akamai, these bad responses would then never be seen: Akamai will never cache or display 500 errors. It will just serve the last cached, non-error page.

If you have any suggestions on how to detect an error condition on one of these responses, it would be very helpful. Of course, addressing the underlying issue would be ideal. The problem happens sporadically, which makes it very difficult to debug. But because these pages get cached, the error is compounded, and it looks quite bad on the site.
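
The detection described above might be sketched as follows. This is a language-neutral illustration in Python, not Resin or Quercus code: `FragmentResult` and `render_page` are hypothetical names, and the sketch assumes the controller can read the status code of each embedded response before the page is assembled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class FragmentResult:
    """Outcome of one embedded request (hypothetical type, not a Resin/Quercus API)."""
    status: int
    body: str


def render_page(fragments: List[Callable[[], FragmentResult]]) -> Tuple[int, str]:
    """Render every fragment; return (500, "") if any embedded request failed."""
    bodies = []
    for fetch in fragments:
        result = fetch()
        if result.status >= 500:
            # Propagate the failure so the outer response is a 500 and the
            # partially broken page is never assembled or cached.
            return 500, ""
        bodies.append(result.body)
    return 200, "\n".join(bodies)
```

The key design choice is to decide the outer status before any fragment body is written to the browser response, so that one failed fragment can still turn the whole page into a 500.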

Notes
(0003109) ferg 05-29-08 09:22

Here is an example of the error we are seeing:

A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was: (SQL query hidden) from within function "MediaWikiBagOStuff::_doquery". MySQL returned error "1213: Deadlock found when trying to get lock; try restarting transaction (foo51-03)".
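
The error text itself suggests restarting the transaction. A minimal sketch of that retry pattern is shown below, again in Python for illustration only. `DatabaseError` and `run_with_retry` are hypothetical names standing in for whatever exception and wrapper the real driver or application would use. The sketch assumes the callable wraps one complete transaction that is safe to run again, which holds when the database has rolled back the deadlocked transaction.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

MYSQL_DEADLOCK = 1213  # "Deadlock found when trying to get lock"


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a MySQL error code."""

    def __init__(self, errno: int, message: str = ""):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction: Callable[[], T], max_attempts: int = 3,
                   base_delay: float = 0.05,
                   sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `transaction`, retrying only when it fails with error 1213."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Any other error, or a deadlock on the final attempt, is re-raised.
            if exc.errno != MYSQL_DEADLOCK or attempt == max_attempts:
                raise
            # Back off with jitter so competing instances do not retry in lockstep.
            sleep(base_delay * attempt * (1 + random.random()))
    raise ValueError("max_attempts must be at least 1")
```

Retrying would only mask the symptom. If two instances routinely delete the same cache rows, the deadlocks would still occur and would simply cost extra round trips.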

Issue History
Date Modified	Username	Field	Change
05-29-08 09:08	ferg	New Issue
05-29-08 09:22	ferg	Note Added: 0003109
