Digital Edition

SYS-CON.TV
Software Quality Metrics for Your Continuous Delivery Pipeline | Part 2
The metrics around database access

No matter how often you deploy your application or how sophisticated your delivery pipeline is, you always need to know the quality status of the software you are building. That can only be done if you measure it; but measure what exactly? In Part 1 we introduced the Concept of Quality Metrics in CD (Continuous Delivery) by looking at the metric # of Requests per End User Action. In Part 2 we will focus on metrics around database access.

You need to be aware of bad database access patterns right when they get introduced in your code. Whether the reason is incorrectly configured O/R (Object Relational) Mappers such as Hibernate, TopLink or JDO, or because of bad coding. Finding these problems immediately by looking at the right metrics will make it is easier for developers to fix the problem, which will reduce test cycles and give operations more confidence that a new deployment will not blow their current database server.

Examples of Bad Database Access Patterns
The following example is taken from a web application that displays an account report with 25 items per page that contains the names and high-level statuses of these accounts. The developers decided to go with Hibernate in order to access these account objects stored in the database. A closer look at the generated SQL queries when generating this report reveals that Hibernate not only loads the account objects that match the search query, but actually loads all account objects including all referenced objects in a related history table. These history objects are not needed at all to generate the report. The way Hibernate is used by the application (premature loading of all objects and referenced objects) results in more than 4000 SQL executions contributing 6s to the total page load time:

The way Hibernate is used by the application results in 4k+ individual SQL Statements, returning much more data than is actually needed for the report

If you want to learn more about database access problems check out load balancers cause database locks, when it is really the database to blame or the "Understanding Hibernate" Series: Part I - Session Cache, Part II - Query Cache and Part III - Second Level Cache.

Metric: Total Number of SQL Statements per Transaction
The first metric you want to take a closer look at is the total number of SQL executions per transaction. If you want to go a step further you can even monitor SELECT, INSERT, DELETE and UPDATE statements separately.

If you're always aware how many database statements are executed for your individual transactions (Login, Search, Checkout) and you monitor this along the delivery pipeline for every build, you will immediately see how the newly added functionality impacts the load on your database. The following screenshot shows a way to track this number across builds and across your different deployment stages. In this scenario, Developers extended the search feature in Build #3 by making an additional call to a 3rd party recommendation service. Build #3 suddenly shows a huge spike in SQL queries in the Load Stage and Production. Why is that?

A new call to an external third-party service introduced with Build 3 has major impacts on the load (capacity stage) and production environment when this new feature has to deal with real production data

What can we learn from these metrics above?

  • Commit stage: The executed Unit Tests in that stage didn't catch the problem because the call to the third-party service was mocked and therefore no actual DB calls were executed by that service.
  • Acceptance stage: A change in behavior was detected due to the additional call to the recommendation service which executed 2 additional SQLs. This could already be a warning sign but will probably still stay unnoticed.
  • Capacity stage: a 200x increase in DB calls must stop this deployment so that it never reaches Live. The reason why we have 200x and not only two should also trigger the integration tests to be executed against more than sample database content.
  • Production: 4k SQLs instead of just two is a huge impact on production. This again can be explained because the production database has "real life" data and the database access pattern of this third-party service queries every piece of data. If this really makes it into production and we monitor this data down to the transaction level, it is easy to pull this change back and engineering can immediately start working on the problematic area. Most important, however, is that this build never makes it into production because the problem was found in testing already!

How to Measure on Dev Workstations
Developers can look at this data by either turning on certain SQL Logging options of the frameworks that they are using, e.g., Hibernate (see stackoverflow discussion). They can profile their code using the profiler that comes with their IDE or use tools such as the Development Edition of dynaTrace (or the 15 Days Free Trial) to see all database calls made by their own code or the code that they are calling. The following shows a screenshot taken from Database Access Patterns Gone Wild and shows which data can be analyzed on a local machine:

Developers can analyze which SQL statements are executed by their own code or third-party frameworks they use. In this case it was code executed by Telerik to populate .NET control data.

For more measurement tips, and for further insight, click here for the full article

About Andreas Grabner
Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi

In order to post a comment you need to be registered and logged in.

Register | Sign-in

Reader Feedback: Page 1 of 1



ADS BY GOOGLE
Subscribe to the World's Most Powerful Newsletters

ADS BY GOOGLE

With the proliferation of both SQL and NoSQL databases, organizations can now target specific fit-fo...
Organizations planning enterprise data center consolidation and modernization projects are faced wit...
Let’s face it, embracing new storage technologies, capabilities and upgrading to new hardware often ...
Fact: storage performance problems have only gotten more complicated, as applications not only have ...
Containers, microservices and DevOps are all the rage lately. You can read about how great they are ...
Traditional IT, great for stable systems of record, is struggling to cope with newer, agile systems ...
Cloud computing delivers on-demand resources that provide businesses with flexibility and cost-savin...
Disruption, Innovation, Artificial Intelligence and Machine Learning, Leadership and Management hear...
While some developers care passionately about how data centers and clouds are architected, for most,...
Chris Matthieu is the President & CEO of Computes, inc. He brings 30 years of experience in developm...
Your job is mostly boring. Many of the IT operations tasks you perform on a day-to-day basis are rep...
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming ou...
The best way to leverage your Cloud Expo presence as a sponsor and exhibitor is to plan your news an...
DevOpsSummit New York 2018, colocated with CloudEXPO | DXWorldEXPO New York 2018 will be held Novemb...
Bill Schmarzo, author of "Big Data: Understanding How Data Powers Big Business" and "Big Data MBA: D...
With 10 simultaneous tracks, keynotes, general sessions and targeted breakout classes, @CloudEXPO an...
Containers and Kubernetes allow for code portability across on-premise VMs, bare metal, or multiple ...
CloudEXPO New York 2018, colocated with DXWorldEXPO New York 2018 will be held November 11-13, 2018,...
@DevOpsSummit at Cloud Expo, taking place November 12-13 in New York City, NY, is co-located with 22...
Digital Transformation: Preparing Cloud & IoT Security for the Age of Artificial Intelligence. As au...