Preserving Storage I/O for Critical Applications
Is hypervisor side I/O throttling enough?

While server virtualization brings unprecedented advantages, it has introduced new challenges as well. Before server virtualization, each application ran on a dedicated server with dedicated storage connected through dedicated switch ports. This approach had obvious limitations, such as inflexibility and underutilized hardware, but it did provide guaranteed hardware resources such as CPU, network, memory, and storage bandwidth for each application.

With server virtualization, the underlying hardware resources are shared. Hypervisors do a good job of providing guaranteed CPU power, network bandwidth, and memory to each application that shares these resources. When it comes to storage, however, things get complicated. Hypervisors have no access to the resources inside the storage array and can only control access to storage through I/O throttling. That is not enough to preserve storage I/O for critical applications. As a result, critical applications stay sidelined from virtualization, and the old-school approach of dedicated hardware prevails for them.
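
To make "I/O throttling is all the hypervisor layer can do" concrete, here is a minimal Python sketch for a Linux/KVM-style host with cgroup v2 mounted at /sys/fs/cgroup. The cgroup name and device numbers are purely illustrative; the point is that the host can only cap the IOPS a guest issues, nothing more.

```python
import pathlib

def set_iops_limit(cgroup: str, major: int, minor: int,
                   riops: int, wiops: int) -> None:
    """Cap read and write IOPS for one block device inside a cgroup v2 group.

    This is roughly all the host layer can do: put a ceiling on the I/O a
    guest issues. It cannot see or manage cache, tiering, snapshots, or
    garbage collection inside the storage array.
    """
    io_max = pathlib.Path("/sys/fs/cgroup") / cgroup / "io.max"
    # cgroup v2 io.max accepts "MAJ:MIN key=value ..." pairs per device.
    io_max.write_text(f"{major}:{minor} riops={riops} wiops={wiops}\n")

# Hypothetical cgroup name and device numbers: limit the guest behind
# "machine/vm-oracle" to 2,000 reads/s and 1,000 writes/s on device 8:16.
set_iops_limit("machine/vm-oracle", 8, 16, riops=2000, wiops=1000)
```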

Many factors contribute to the complexity of storage, even though the general perception is that storage is just a piece of hardware. In reality, a complex piece of software runs inside the storage array, juggling hardware components such as CPU, cache, disks, and network. Incoming I/Os put varying amounts of pressure on these resources depending on the traffic pattern. For example, 100 IOPS from Oracle is quite different from 100 IOPS from MS Exchange as far as the storage array is concerned: they differ in read/write ratio, block size, random versus sequential access, disk placement, deduplication, compression, and so on.
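
As a back-of-the-envelope illustration of that point, the Python sketch below compares two workloads that both report 100 IOPS at the hypervisor. The block sizes, read/write mixes, and the RAID-style write penalty are assumed numbers chosen for the example, not measurements of Oracle or Exchange.

```python
def backend_load(iops: int, block_kb: int, read_ratio: float,
                 write_penalty: int = 4) -> tuple[float, float]:
    """Rough view of what an IOPS number costs the array.

    write_penalty models extra backend disk operations per host write
    (roughly 4 for a RAID-5 read-modify-write); real arrays vary widely.
    """
    throughput_mbps = iops * block_kb / 1024           # MB/s actually moved
    reads = iops * read_ratio
    writes = iops * (1 - read_ratio)
    backend_ops = reads + writes * write_penalty       # disk-level operations
    return throughput_mbps, backend_ops

# Two workloads, both "100 IOPS" as seen by the hypervisor:
print(backend_load(100, block_kb=8,  read_ratio=0.9))   # OLTP-style: ~0.8 MB/s, ~130 backend ops
print(backend_load(100, block_kb=32, read_ratio=0.5))   # mail-style: ~3.1 MB/s, ~250 backend ops
```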

Irrespective of I/O throttling at the hypervisor level, customers often complain: "My applications were running fine until yesterday; why is performance so sluggish now?" Many factors can contribute to this situation. Let us examine some of the common ones:

  • A new application starts consuming storage from the array, replacing an old one. From the hypervisor's point of view, the new application sends the same number of IOPS toward the array as the old one did. But by accessing the same data over and over, it keeps its own cache blocks hot and evicts everyone else's, effectively wiping out the cache for all the other applications on the array (the simulation after this list illustrates the effect).
  • Some applications change their traffic pattern drastically. The classic example is an accounting application that generates month-end or quarter-end reports, starving other applications of storage resources. The hypervisor doesn't differentiate between regular I/O and report-generation I/O.
  • An application creates and deletes large files. As far as the hypervisor is concerned, that's just a few I/Os, but it strains the storage array's resources heavily.
  • One volume is configured for more frequent snapshot creation and deletion, something that is completely invisible to the hypervisor.
  • An application that accesses historical data starts digging data out of passive tiers; this is transparent to the hypervisor but adds significant load on the storage array.
  • A filesystem in the storage array ages and becomes more fragmented, so pulling out the same amount of data as before costs more storage resources. The hypervisor has no clue about any of this.
  • The array's internal housekeeping, such as garbage collection, reduces its overall capability. Simple hypervisor-side I/O throttling cannot guarantee critical applications the IOPS they require.
  • Component failures inside the storage array reduce its capability. The hypervisor can't do much to maintain the I/O level for critical applications.
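
To see why the first scenario hurts so much, here is a small, purely illustrative Python simulation of a shared LRU read cache. The cache size, working-set sizes, and access ratios are invented for the example: application A's hit rate collapses once a cache-hungry neighbour lands on the same array, even though A's IOPS toward the array never change.

```python
from collections import OrderedDict
import random

def hit_rate_for_app_a(cache_size: int, noisy_neighbour: bool) -> float:
    """Return application A's cache hit rate, with or without a neighbour
    that churns through a much larger data set on the same shared cache."""
    cache: OrderedDict[str, bool] = OrderedDict()
    hits = misses = 0
    random.seed(1)
    for i in range(20000):
        accesses = [f"A:{random.randrange(50)}"]                       # A re-reads a small hot set
        if noisy_neighbour:
            accesses += [f"B:{(4 * i + j) % 5000}" for j in range(4)]  # B cycles a large data set
        for block in accesses:
            if block in cache:
                cache.move_to_end(block)
                if block.startswith("A:"):
                    hits += 1
            else:
                if block.startswith("A:"):
                    misses += 1
                cache[block] = True
                if len(cache) > cache_size:
                    cache.popitem(last=False)                          # evict least recently used
    return hits / (hits + misses)

print(f"A alone on the array:  {hit_rate_for_app_a(200, noisy_neighbour=False):.0%} cache hits")
print(f"A next to a cache hog: {hit_rate_for_app_a(200, noisy_neighbour=True):.0%} cache hits")
```

In this toy model A goes from nearly 100% cache hits to roughly half, which in a real array translates directly into higher latency, while the hypervisor's IOPS counters for A look unchanged.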

No, hypervisor-side I/O throttling alone is not enough. The storage itself needs to be intelligent enough to deliver differentiated, consistent, guaranteed IOPS to each application or virtual machine sharing the array. To get an end-to-end SLA, all three layers of the data center (hypervisor/server, network, and storage) need to satisfy their individual SLAs; one component cannot act on behalf of another.
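
As a rough sketch of what "intelligent," array-side QoS means, the Python fragment below divides a hypothetical 10,000-IOPS array budget across tenants with guaranteed floors and priorities. The class, tenant names, and numbers are invented for illustration and are not any vendor's API; the idea is simply that the array, which sees its own internal state, is the layer that can honour per-tenant guarantees.

```python
class TenantQoS:
    """Array-side QoS sketch: each tenant gets a guaranteed IOPS floor,
    and leftover capacity is handed out in priority order."""
    def __init__(self, total_iops: int, tenants: dict[str, dict]):
        self.total_iops = total_iops
        self.tenants = tenants   # name -> {"min": floor, "demand": offered load, "priority": rank}

    def allocate(self) -> dict[str, int]:
        grants, remaining = {}, self.total_iops
        # First pass: satisfy every tenant's guaranteed minimum (or its demand, if lower).
        for name, t in self.tenants.items():
            g = min(t["min"], t["demand"], remaining)
            grants[name] = g
            remaining -= g
        # Second pass: distribute whatever is left, highest priority first.
        for name, t in sorted(self.tenants.items(), key=lambda kv: kv[1]["priority"]):
            extra = max(min(t["demand"] - grants[name], remaining), 0)
            grants[name] += extra
            remaining -= extra
        return grants

qos = TenantQoS(total_iops=10000, tenants={
    "oltp-critical": {"min": 6000, "demand": 7000, "priority": 0},
    "mail":          {"min": 2000, "demand": 5000, "priority": 1},
    "reporting":     {"min": 500,  "demand": 9000, "priority": 2},
})
print(qos.allocate())   # the critical tenant keeps its floor even while reporting spikes
```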

All these years of experience in networking and storage have taught me one thing: storage is the most complex layer when it comes to delivering SLAs.

About Felix Xavier
Felix Xavier is Founder and CTO of CloudByte. He has more than 15 years of development and technology management experience. He has built many high-energy technology teams, re-architected products, and developed features from scratch. Most recently, Felix helped NetApp gain a leadership position in storage-array-based data protection by driving innovations around its product suite. He has filed numerous patents with the US patent office around core storage technologies. Prior to this, Felix worked at Juniper, Novell and IBM, where he handled networking technologies, including LAN, WAN and security protocols and Intrusion Prevention Systems (IPS). Felix has master's degrees in technology and business administration.
