Four Ways Cloud Has Influenced Application Troubleshooting By @Stackify | @CloudExpo [#Cloud]
As with any new, disruptive technology, new challenges are also par for the course
By: Stackify Blog
Nov. 23, 2014 11:45 PM
The rise of cloud computing has ushered in an era of unprecedented productivity for developers over the past several years. For those who have embraced this new world order, gone are the days of long lead times for hardware procurement and installation, architecture defined by slow-moving hardware upgrades, hardware-constrained scalability and flexibility, and a world where only sys admins have access to the infrastructure. But, as the barriers between development and delivery disappear, new challenges have emerged that can disrupt the lives of developers and slow down delivery of new products and features, giving back some of the efficiency gains that the Software-Defined Data Center (SDDC) created.
Whether you're new to the cloud or you've been around since before cloud was cool, you are likely to see four common challenges emerge that can make troubleshooting your applications in the cloud more difficult. Let's take a closer look at these common pain points first to help build awareness around the challenges, and then I'll offer some suggestions for how to prevent these hurdles from tripping you and your team up when it comes time to unravel an application troubleshooting mystery.
True, the ability to roll your own architecture without the burden of dealing with physical devices is liberating and far more efficient. But, as developer tools, deployment tools, and cloud operations tools become inextricably linked to one another, the old boundaries between who is dev and who is ops become blurred or even get removed altogether. This means the dev team is suddenly an integral part of operations, whether by design or by default, adding yet another responsibility for developers whose chief mandate is often to go faster. The more time you spend in the operations realm, especially in troubleshooting your app or the cloud resources it depends on, the less time you are able to devote to adding new value through code.
Lack of Transparency and Burden of Proof
App returning a 500 error or performing poorly? If you're using something delivered as-a-Service, such as a database, queue, or cache, you won't really have any visibility into its health other than the cloud provider's status page and whatever you can directly observe. It's either working correctly and is speedy, or it isn't; if it isn't, life gets a lot murkier. Likewise, servers can be monitored, but you can't really tell why your virtual resource's performance has trailed off if you are the victim of something environmental that's out of your control.
No matter how good the support team is at your favorite cloud provider, it's rare that they will be as responsive to your requests for more information on an issue as your own in-house ops team could be, and they won't be as well versed on your architecture. To varying degrees, you're at the mercy of the cloud provider for consistent, reliable services, and it's also up to them to offer timely insight when issues arise with the services you depend on. Your mileage may vary, of course, as to whether your cloud provider offers this level of communication and transparency. But, if they don't, then the burden of proof rests squarely with you to show that the issue isn't in your app. Quite a reversal of fortunes, isn't it?
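If the burden of proof does land on you, it helps to have your own record of how each external service behaved. Here is a minimal sketch of one way to collect that evidence, assuming a simple in-memory log; the names timed_call, call_log, and failure_rate are hypothetical and not part of any provider's tooling.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory record of calls to external services.
# A real application would more likely ship these entries to its logging system.
call_log = []


@contextmanager
def timed_call(service, clock=time.perf_counter):
    """Record the duration and outcome of one call to an external dependency."""
    start = clock()
    entry = {"service": service, "ok": True, "error": None}
    try:
        yield entry
    except Exception as exc:
        # Note the failure type, then let the exception propagate unchanged.
        entry["ok"] = False
        entry["error"] = type(exc).__name__
        raise
    finally:
        entry["seconds"] = clock() - start
        call_log.append(entry)


def failure_rate(service):
    """Share of recorded calls to `service` that raised an exception."""
    calls = [c for c in call_log if c["service"] == service]
    if not calls:
        return 0.0
    return sum(not c["ok"] for c in calls) / len(calls)
```

Wrapping each call to a hosted queue or database in `with timed_call("queue"):` gives you per-service timings and failure counts that you can put in front of the provider's support team.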
The incredible thing about an SDDC is that you can create nearly any kind of architecture required to support your application stack's needs, all relatively easily - if you can dream it, you can build it. Want to cobble together .NET, Java, PHP, Node.js, Ruby, Database-as-a-Service for SQL and NoSQL, Message-Queues-as-a-Service, and Search-as-a-Service? From a cloud deployment perspective, it's been made devilishly easy to deploy and get started. But with that ultra-polyglot approach and a heavy reliance on software-defined services comes a new set of challenges:
More Frequent Change
A big part of the movement toward Agile and Lean is also the notion of always moving forward - rather than rolling back a release in the event of an issue, detect problems early and patch them quickly. Fulfilling this mandate, however, requires two things that are often missing if you are coming from a slower-moving environment or from a more traditional hosting model:
Without this, it's hard to know if you've made gains or losses with your release - your users are often your only real barometer.
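One way to judge gains or losses without waiting for users to complain is to compare a simple metric, such as error rate, before and after a release. The sketch below assumes you already collect request and error counts; the function names and the one-percentage-point tolerance are illustrative choices, not recommendations.

```python
def error_rate(requests_total, errors_total):
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return errors_total / requests_total if requests_total else 0.0


def compare_release(before, after, tolerance=0.01):
    """Classify a release by the change in error rate.

    `before` and `after` are dicts with 'requests' and 'errors' counts
    gathered over comparable time windows (an assumption of this sketch).
    """
    delta = (error_rate(after["requests"], after["errors"])
             - error_rate(before["requests"], before["errors"]))
    if delta > tolerance:
        return "regressed"
    if delta < -tolerance:
        return "improved"
    return "unchanged"
```

The same comparison works for average request time or any other metric you track per release.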
So... How Do I Code More and Support Less?
What can development teams do to adapt to and overcome these challenges?
There are three basic steps that every development team should take to make supporting cloud-based applications easier.
1. Establish Access, Process, and Protocol: The first order of business for helping developers support their cloud-based apps more effectively is giving them safe access to the information and resources they need. Unfortunately, all too often in cloud environments this is an all-or-nothing proposition - full login rights to servers and even potentially full rights to the management portal, or no access at all. Make sure to establish the correct access methods for your developers so that they have the visibility and access they need, without handing over so much control that it increases the likelihood of accidents.
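The middle ground between all and nothing can be pictured as a map from roles to the specific actions each role may take. This is only a sketch: the role names and action names below are made up for illustration, and real cloud providers each have their own access-control mechanisms.

```python
# Hypothetical roles and actions, for illustration only.
ROLE_PERMISSIONS = {
    "developer": {"read_logs", "view_metrics", "restart_app"},
    "operator": {"read_logs", "view_metrics", "restart_app",
                 "login_server", "manage_portal"},
}


def is_allowed(role, action):
    """Return True only if the role has been explicitly granted the action."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

Here a developer can read logs and view metrics but cannot log in to a server or manage the portal, and an unknown role is denied everything by default.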
2. Design Supportability Into the Application: Once your application is in production, there are several common questions that you will need to be able to answer at a moment's notice about your application: Is it (and everything it depends on) running? Are users satisfied with the performance? Is anything silently failing and frustrating users without setting off alarms? If something failed, who was impacted, and what caused the issue?
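The first of these questions, whether the app and everything it depends on is running, could be answered by a health check built into the application. A minimal sketch follows, assuming each dependency can be probed by a small function that raises an exception on failure; check_health and the probe names are hypothetical.

```python
def check_health(checks):
    """Run each named dependency check and summarize the results.

    `checks` maps a dependency name to a zero-argument callable that
    returns normally when the dependency is healthy and raises otherwise.
    """
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            # Keep the error text so the report says what went wrong.
            results[name] = f"failed: {exc}"
    healthy = all(status == "ok" for status in results.values())
    return {"healthy": healthy, "dependencies": results}
```

Exposing the returned summary on an internal endpoint or in a log line gives you a quick answer when someone asks what is down.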
There are also some things that simply cannot be measured and monitored from outside the application, but which speak directly to the health and well being of your application. To enable you to quickly answer the inevitable questions, consider incorporating the following:
3. Identify Health Baselines Early: Key information like message queue length, average request time, app pool resource utilization, custom metric values, and log and error rates can all be charted for your application these days - monitoring and charting are no longer just the domain of ops tools. Understand what your app looks like both when healthy and unhealthy, preferably starting in pre-production environments, so that you can see how your application changes from release to release, under different loads, and as your architecture evolves. By baselining as far back as dev and QA, you can often catch problems well before they impact customers and send you and your team scrambling.
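A baseline can be as simple as the mean and spread of a metric observed while the app was known to be healthy. The sketch below assumes that approach and flags a new reading that strays too far from it; the function names and the three-standard-deviation threshold are illustrative choices, and other baselining methods would work as well.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize healthy-period samples of one metric (needs at least two)."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def is_unhealthy(value, baseline, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean."""
    if baseline["stdev"] == 0:
        # No variation was observed, so any different value is a deviation.
        return value != baseline["mean"]
    return abs(value - baseline["mean"]) / baseline["stdev"] > threshold
```

For example, build a baseline from average request times recorded in QA, then test each production reading against it with `is_unhealthy`.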
At Stackify we believe we offer a solution to the issues presented in this article. Learn more at www.stackify.com.