Sync Your Timeouts: When Load Balancers Cause Database Deadlocks
By: Andreas Grabner
Apr. 14, 2014 11:00 AM
Have you seen this error message before "java.sql.Exception: ORA-00060: deadlock detected while waiting for resource"?
This error is raised when parallel transactions need locks on the same rows or tables in your database and end up waiting on each other. I recently ran into this exception on an instance of an IBM eCommerce Server. My first thought was that there were simply too many people hitting the same functionality, which updates Sales Tax Summary information and which showed up in the call stack of the exception:
Exception stack trace showing that createOrderTaxes ran into the deadlock issue on the database
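To make the deadlock condition behind ORA-00060 concrete, here is a minimal conceptual sketch in Python. It is not how Oracle implements detection. It only illustrates the idea that a deadlock is a cycle of sessions waiting on each other. The session names and the `find_deadlock` helper are hypothetical.

```python
from typing import Dict, List, Optional


def find_deadlock(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the sessions forming a wait cycle, or None if there is no cycle.

    waits_for maps a blocked session to the session that holds the lock it needs.
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        current = start
        while current in waits_for:
            current = waits_for[current]
            if current == start:
                return path  # we walked back to where we started: a cycle
            if current in seen:
                break  # a cycle that does not include `start`; skip it here
            seen.add(current)
            path.append(current)
    return None


# Hypothetical situation: each session holds one row lock and waits for the other's.
waits = {"session_a": "session_b", "session_b": "session_a"}
print(find_deadlock(waits))

# A plain lock wait (no cycle) is not a deadlock; the waiter simply blocks.
print(find_deadlock({"session_a": "session_b"}))
```

A plain lock wait eventually resolves when the holder commits or rolls back. A cycle never resolves on its own, which is why the database has to abort one of the statements with an error.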
The logical conclusion would be to blame this on too many users accessing this functionality, or on outdated table statistics causing update statements to run so long that others ran into the lock. It turned out to be caused by something less obvious that would not have shown up in any exception stack trace or log file: a misconfigured timeout setting on the load balancer caused the original incoming web request to be re-executed. While the first app server was still updating the table and holding the lock (it had a longer timeout configured than the load balancer), the second app server tried to do the same thing, which caused the exception.
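The timing problem described above can be sketched in a few lines of Python. This is a simplified model, not the actual configuration from this incident: the tier names and all timeout values are hypothetical, and it assumes the load balancer re-sends the request to a second app server as soon as its own timeout expires.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Tier:
    name: str
    timeout_s: float


def overlapping_execution(lb_timeout_s: float, app_timeout_s: float, work_s: float) -> float:
    """Seconds during which the original request and its retry run at the same time."""
    # The first app server keeps working until it finishes or hits its own timeout.
    first_ends = min(work_s, app_timeout_s)
    if first_ends <= lb_timeout_s:
        return 0.0  # finished before the load balancer gave up, so no retry
    return first_ends - lb_timeout_s


def misordered_timeouts(tiers: List[Tier]) -> List[Tuple[str, str]]:
    """Given tiers ordered from client-facing to backend, list the pairs where
    the upstream tier gives up earlier than the tier behind it."""
    return [
        (up.name, down.name)
        for up, down in zip(tiers, tiers[1:])
        if up.timeout_s < down.timeout_s
    ]


# Hypothetical values: the load balancer gives up after 60s, the app server after 120s,
# and the slow update takes 90s.
print(overlapping_execution(lb_timeout_s=60.0, app_timeout_s=120.0, work_s=90.0))
print(misordered_timeouts([Tier("load_balancer", 60.0), Tier("app_server", 120.0)]))
```

In this model, any non-zero overlap means two app servers are updating the same data for the same user request at the same time. One way to rule that out is to keep each upstream timeout at least as long as the timeout of the tier behind it.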
In this article I'll show you the steps necessary to analyze the symptoms (timeouts and client errors) and to identify and fix the root cause of the problem.
Step #1: Identifying Who and What Is Impacted
Linking the errors to the User Action reveals that the problem happens when adding items to the shopping cart
This impacts our business.
Now we know that this problem impacts a critical feature in our app: Users can't add items to their cart.
Step #2: Understanding the Transaction Flow
Transaction Flow highlights several hotspots such as 33k SQL Executions in Total and Load Balancer (IHS) splitting up a request
This Is an Architectural Problem
Steps 3 & 4 and a list of key takeaways are covered in the full article.