Preserving Storage I/O for Critical Applications
Is hypervisor side I/O throttling enough?

While server virtualization brings unprecedented advantages, it has also introduced some challenges. Prior to server virtualization, each application ran on a dedicated server with dedicated storage connected through dedicated switch ports. Needless to say, this approach had obvious limitations, such as inflexibility and underutilization, but it did provide each application with guaranteed hardware resources: CPU, network, memory, and storage bandwidth.

With server virtualization, the underlying hardware resources are shared. Hypervisors do a good job of providing guaranteed CPU power, network bandwidth, and memory to each application sharing these resources. When it comes to storage, however, things get complicated. Hypervisors have no visibility into the resources inside the storage array and can control access to storage only through I/O throttling. That is not enough to preserve storage I/O for critical applications. As a result, critical applications stay on the sidelines and the old-school dedicated-hardware approach prevails.
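To make that limitation concrete, here is a minimal token-bucket sketch in Python of the kind of per-VM IOPS cap a hypervisor can enforce. The class name and numbers are illustrative assumptions, not any particular hypervisor's implementation; the point is that the cap only counts I/Os leaving the hypervisor and knows nothing about what each I/O costs the array.

```python
import time

class IopsThrottle:
    """Toy token-bucket limiter: caps a VM's request rate at limit_iops.
    It only controls how many I/Os leave the hypervisor per second; it has
    no idea what each I/O costs once it reaches the storage array."""

    def __init__(self, limit_iops):
        self.limit = limit_iops
        self.tokens = float(limit_iops)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at the limit.
        self.tokens = min(self.limit, self.tokens + (now - self.last) * self.limit)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # issue the I/O now
        return False      # defer the I/O

# Two VMs throttled to the same 100 IOPS can still stress the array very
# differently, which is the gap the rest of this article is about.
vm_throttle = IopsThrottle(limit_iops=100)
```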

Many factors contribute to the complexity of storage, even though the general perception is that storage is just a piece of hardware. In reality, a complex piece of software runs inside the storage array, utilizing hardware components such as CPU, cache, disk, and network. Incoming I/Os put varying amounts of pressure on these resources depending on the traffic pattern. For example, 100 IOPS from Oracle is quite different from 100 IOPS from MS Exchange as far as the storage array is concerned: they differ in read/write ratio, block size, random versus sequential access, disk placement, deduplication, compression, and so on.
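As a rough illustration, the sketch below scores the relative back-end work behind the same front-end IOPS number. The cost model and its penalty factors are made-up assumptions for illustration only, not measurements from any array, but they show why 100 Oracle-like IOPS and 100 Exchange-like IOPS are not the same thing to the array.

```python
def backend_cost(iops, block_kb, read_ratio, random_ratio,
                 seek_penalty=4.0, write_penalty=2.0):
    """Relative work the array performs for a given front-end IOPS rate.
    Penalty factors are illustrative assumptions, not measured values."""
    per_io = block_kb / 4.0                                # larger blocks move more data
    per_io *= 1 + random_ratio * (seek_penalty - 1)        # random I/O costs extra seeks
    per_io *= 1 + (1 - read_ratio) * (write_penalty - 1)   # writes add parity/flush work
    return iops * per_io

# The same 100 IOPS, very different load on the array:
oltp = backend_cost(100, block_kb=8,  read_ratio=0.7, random_ratio=0.9)   # Oracle-like
mail = backend_cost(100, block_kb=32, read_ratio=0.5, random_ratio=0.6)   # Exchange-like
print(oltp, mail)
```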

Irrespective of I/O throttling at the hypervisor level, customers often complain: "My applications were running fine till yesterday; why this sluggish performance now?" Many factors can contribute to this situation. Let us examine some of the common ones:

  • A new application replaces an old one on the same array. From the hypervisor's point of view, the new and old applications send the same number of IOPS toward the storage array. But the new application wipes out the cache for every other application: it accesses the same data again and again, keeping its own blocks hot while everyone else's get evicted (the cache sketch after this list illustrates the effect).
  • Some applications change their traffic patterns drastically. The classic example is an accounting application that generates month-end or quarter-end reports, starving the other applications of storage resources. The hypervisor doesn't differentiate between regular I/O and report-generation I/O.
  • An application creates and deletes large files. To the hypervisor that is just a few I/Os, but it strains the storage array's resources heavily.
  • One volume is configured for more frequent snapshot creation and deletion, activity that is completely invisible to the hypervisor.
  • An application that works with historical data starts digging it out of passive tiers; this is transparent to the hypervisor but adds more load on the storage array.
  • A filesystem in the storage array ages and becomes more fragmented, so pulling out the same amount of data as before costs more storage resources. The hypervisor has no clue about any of this.
  • The storage array's internal housekeeping, such as garbage collection, reduces the overall capability of the array. Simple hypervisor-side I/O throttling cannot guarantee critical applications the IOPS they require.
  • A component failure in the storage array reduces its capabilities. The hypervisor can't do much to maintain the I/O level for critical applications.
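The first bullet above is easy to reproduce with a toy model. The sketch below simulates a shared LRU read cache: each VM's IOPS never change, yet once a new application with its own heavily reused working set arrives, the overall hit rate collapses for everyone else. The working-set sizes and cache capacity are arbitrary assumptions chosen only to show the effect.

```python
from collections import OrderedDict
import random

class LruCache:
    """Toy LRU read cache; tracks hit rate so the two phases can be compared."""
    def __init__(self, capacity):
        self.capacity, self.store = capacity, OrderedDict()
        self.hits = self.misses = 0

    def access(self, block):
        if block in self.store:
            self.store.move_to_end(block)
            self.hits += 1
        else:
            self.misses += 1
            self.store[block] = True
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)      # evict least recently used

    def hit_rate(self):
        rate = self.hits / (self.hits + self.misses)
        self.hits = self.misses = 0                 # reset for the next phase
        return rate

cache = LruCache(capacity=1000)
old_apps = [("A", 300), ("B", 300), ("C", 300)]     # working sets that fit in cache

def old_io():
    app, wset = random.choice(old_apps)
    cache.access((app, random.randrange(wset)))

for _ in range(20000):                              # phase 1: steady state
    old_io()
print("hit rate, old apps alone:", cache.hit_rate())

for _ in range(20000):                              # phase 2: a new app arrives
    if random.random() < 0.5:
        cache.access(("new", random.randrange(600)))  # same IOPS, large reused set
    else:
        old_io()
print("hit rate, after new app:", cache.hit_rate())
```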

No. Hypervisor-side I/O throttling alone is just not enough. Storage needs to be intelligent enough to deliver differentiated, consistent, guaranteed IOPS to each application or virtual machine sharing the storage array. To achieve an end-to-end SLA, all three layers of the data center, namely hypervisor/server, network, and storage, need to satisfy their individual SLAs; one component cannot act on behalf of another.
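At its most basic, array-side intelligence could mean per-volume QoS: give each volume a guaranteed IOPS floor first, then share whatever capacity remains. The function below is a minimal sketch under that assumption; the volume names and numbers are hypothetical, and this is not CloudByte's or any vendor's actual algorithm.

```python
def allocate_iops(available, volumes):
    """Minimal array-side QoS sketch: each volume first receives its guaranteed
    floor (up to its demand), then any surplus is split in proportion to demand
    above the floor. `available` can shrink when the array is busy with garbage
    collection, rebuilds, and similar internal work."""
    grants = {name: min(floor, demand) for name, (floor, demand) in volumes.items()}
    surplus = available - sum(grants.values())
    extra_demand = {name: max(0, demand - grants[name])
                    for name, (floor, demand) in volumes.items()}
    total_extra = sum(extra_demand.values())
    if surplus > 0 and total_extra > 0:
        for name in grants:
            grants[name] += surplus * extra_demand[name] / total_extra
    return grants

# volume: (guaranteed IOPS floor, current demand) -- hypothetical numbers
volumes = {"oracle-prod": (5000, 7000), "exchange": (2000, 2500), "test-vm": (0, 4000)}
print(allocate_iops(10000, volumes))  # plenty of headroom: everyone gets extra
print(allocate_iops(7500, volumes))   # under pressure: floors are honored first,
                                      # the low-priority test-vm is squeezed
```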

All these years of experience in networking and storage have taught me that storage is the most complex part when it comes to delivering an SLA.

About Felix Xavier
Felix Xavier is Founder and CTO of CloudByte. He has more than 15 years of development and technology management experience. He has built many high-energy technology teams, re-architected products and developed features from scratch. Most recently, Felix helped NetApp gain leadership position in storage array-based data protection by driving innovations around its product suite. He has filed numerous patents with the US patent office around core storage technologies. Prior to this, Felix worked at Juniper, Novell and IBM, where he handled networking technologies, including LAN, WAN and security protocols and Intrusion Prevention Systems (IPS). Felix has master’s degrees in technology and business administration.
