Next-Generation Backup Technologies for Linux
An entirely new media management paradigm for Linux users

Linux-based servers are fast becoming the low-cost alternative to higher-priced proprietary Unix/Windows environments. Finding a cost-effective storage management solution to support different environments can be a challenge. As enterprise IT environments grow and change, cost-conscious administrators are constantly on the lookout for more efficient ways to scale out storage configurations. So what's available to quickly and flexibly meet these pressing business needs?

A New Approach to Media Management

Enter the Virtual Disk Library (VDL, also referred to as Virtual Tape Library by many in the industry). A VDL solution allows users to emulate or "virtualize" a tape resource (tape, tape drive, tape library) on disk. When combined with the right backup application, a VDL enables the efficient use of disk technologies for disk-to-disk backups and restores. It looks and feels like a tape library, but it sets up an entirely new media management paradigm for Linux users. With VDL, tape becomes a strategic component of a data protection strategy, not its major element. Managers can create multiple duplications of backup jobs from the VDL to tape, or vice versa. Administrators can set up specific backup policies, such as retention dates, rotation schemes, and media groups. VDL also allows save sets to be accessed wherever they reside. Incremental saves can be sent to disk for fast restores. Storing backup data on a VDL allows data copy jobs to be run off-line, without impacting the network, application servers, or workstations. VDLs are immune to the mechanical afflictions of tape backup over a network, such as shoe-shining caused by a slow data stream from the host. They can capture data in drips or blasts, and the data arrives at virtual media slots as a save set, which brings us to how VDLs are constructed.

Deconstructing the VDL

Based on a modular, object-oriented architecture, the software that runs VDL lets Linux administrators integrate tape backup and restore operations seamlessly with a variety of databases and messaging applications, storage devices, and storage area networks (SANs) (see Figure 1).

The VDL is basically a directory structure on disk built around two directories called drives and slots. Numbered directories reside under each of these, and each numbered directory defines a unique drive or slot. A media file resides in each numbered slot directory. These media files are the virtual library's "tapes." Figure 2 shows the backup process with a VDL.

The virtual libraries are viewed as real physical libraries. The more drives the VDL contains, the more simultaneous backups can be performed. Virtual libraries always have many more slots than drives and are usually configured with a minimum of eight slots. Having extra slots allows for the proper handling of backup retention cycles. In addition, different operating systems may impose limits on the maximum file size, which can affect the number of slots needed. When the system is configured for Linux and the number of slots and media capacity are defined, media files are created and the space is pre-allocated.
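The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not give the actual on-disk names or format, so the file name "media" and the function create_vdl are hypothetical, and space is reserved here simply by writing zeros.

```python
import os
import tempfile

CHUNK = 1024 * 1024  # write zeros in 1 MiB pieces


def create_vdl(root, drives, slots, media_bytes):
    """Create drives/<n> and slots/<n> directories under root.

    Each slot directory receives one media file (hypothetical name
    "media") that is filled with zeros so its space is reserved up front.
    Returns the list of media file paths.
    """
    for n in range(1, drives + 1):
        os.makedirs(os.path.join(root, "drives", str(n)), exist_ok=True)

    media_paths = []
    for n in range(1, slots + 1):
        slot_dir = os.path.join(root, "slots", str(n))
        os.makedirs(slot_dir, exist_ok=True)
        path = os.path.join(slot_dir, "media")
        remaining = media_bytes
        with open(path, "wb") as f:
            while remaining > 0:
                size = min(CHUNK, remaining)
                f.write(b"\0" * size)
                remaining -= size
        media_paths.append(path)
    return media_paths


# Tiny demonstration in a throwaway directory: 2 drives, 8 slots of 4 KiB.
with tempfile.TemporaryDirectory() as demo_root:
    demo_paths = create_vdl(demo_root, drives=2, slots=8, media_bytes=4096)
    demo_sizes = [os.path.getsize(p) for p in demo_paths]
```

A real product would use far larger media files and would likely reserve space through a file system call rather than by writing zeros. The sketch only shows how drives, slots, and media files relate.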

Administrators can also install application plug-in modules for Oracle, MySQL, Sybase, PostgreSQL, Informix, and various other applications. The modules automatically add application-specific components to the backup and restore selection criteria that appear on the system's graphical user interface. From this common GUI, Linux administrators can manage all backup and restore operations across a storage area network (SAN), network attached storage (NAS), wide area network (WAN), or local area network (LAN).

When to Use a VDL

VDL Staging can be useful in two areas. If a company has a huge file system with millions of files, a typical server might not be able to read these files fast enough to stream today's high-performance tape drives. This can lead to shoe-shining and premature drive or media failures. There's no shoe-shining with disk storage, so there's no downside from slow performance when backing up to a VDL.

On the other hand, if the backup window is too small to back up several clients onto a limited number of tape drives, a VDL with enough virtual drives could back up all clients simultaneously. Performance here would hinge on network bandwidth, requiring a gigabit network to handle the load. For example, suppose a Linux user wants to back up five clients, each with 10GB of data, in one hour, and has only one tape drive performing at 18GB/hour. Writing 50GB at 18GB/hour takes almost three hours, so the backup window would be too small. With disk staging, the user could define enough virtual tape drives to back up all the clients within the hour, and then copy the data to physical tape.
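The arithmetic behind this example is easy to check. The short Python sketch below uses the figures from the text (five clients, 10GB each, one 18GB/hour drive, a one-hour window). The function names are illustrative, and decimal gigabytes are assumed.

```python
def hours_to_tape(clients, gb_per_client, drive_gb_per_hour, drives=1):
    """Hours needed to write all client data straight to tape."""
    total_gb = clients * gb_per_client
    return total_gb / (drive_gb_per_hour * drives)


def required_mbit_per_s(total_gb, window_hours):
    """Sustained network rate needed to move total_gb within the window."""
    bits = total_gb * 1e9 * 8          # decimal gigabytes to bits
    return bits / (window_hours * 3600) / 1e6


direct_hours = hours_to_tape(clients=5, gb_per_client=10, drive_gb_per_hour=18)
network_rate = required_mbit_per_s(total_gb=50, window_hours=1)

print(f"direct to tape: {direct_hours:.2f} hours")   # 2.78 hours
print(f"network rate:   {network_rate:.1f} Mbit/s")  # 111.1 Mbit/s
```

Moving 50GB in an hour needs a sustained rate of roughly 111 Mbit/s before any protocol overhead, which is more than a 100 Mbit/s network can carry and is consistent with the call for gigabit networking.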

VDL Staging vs Multiplexing

There are essentially two approaches to multiple clients backing up to limited tape drives in short backup windows: multiplexing and staging.

In multiplexing, multiple streams of backup data are sent to one tape device. This has a number of drawbacks. For one, a backup of any given client will span more tape than is actually required, which calls for handling multiple tapes per client backup. There's also a higher probability of failure if one of the media fails, since more media is used for any given backup. Restores take longer because more tape needs to be scanned for a given restore, since the client's data must be picked out of several interleaved data streams. Multiplexing also uses more CPU time on the backup server because data streams must be reorganized and packed into a multiplexed stream. This can create performance problems with today's high-speed tape devices.

With the VDL staging approach, extra disk space is required for the virtual library resource allocation. But each client's backups are always contiguous on tape, which uses less tape and speeds up restores.
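The difference in tape layout can be illustrated with a toy model. The Python sketch below is a simplification, not how any particular product writes tape: it assumes equal-sized blocks and strict round-robin interleaving for multiplexing. It then measures how much tape must be traversed to restore one client.

```python
from itertools import zip_longest


def multiplexed(streams):
    """Interleave client blocks round-robin onto one tape (toy model)."""
    tape = []
    for group in zip_longest(*streams.values()):
        tape.extend(block for block in group if block is not None)
    return tape


def staged(streams):
    """Write each client's blocks contiguously, as when copying from a VDL."""
    return [block for blocks in streams.values() for block in blocks]


def restore_span(tape, client):
    """Tape blocks between a client's first and last block, inclusive."""
    positions = [i for i, (owner, _) in enumerate(tape) if owner == client]
    return positions[-1] - positions[0] + 1


# Three clients with four blocks each; a block is (client, sequence number).
streams = {c: [(c, i) for i in range(4)] for c in ("A", "B", "C")}

span_multiplexed = restore_span(multiplexed(streams), "A")  # 10 blocks
span_staged = restore_span(staged(streams), "A")            # 4 blocks
```

In this model, restoring client A from the multiplexed tape means passing over 10 blocks to recover 4, while the staged tape holds the same 4 blocks back to back.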

Consolidating File System Backups

Combining VDLs and tape for backup opens up a variety of different strategies that answer the need for increased data protection, faster backups and restores, even reduced data vulnerability through multiple copies. A key strategy involves consolidating file system backups.

Consolidated backups let users create a "synthetic" full backup without running a weekly full backup. Full backups are very resource intensive. They can consume a considerable amount of network bandwidth (especially when backing up across the LAN) and server bandwidth that may be better used elsewhere. Consolidating a file system backup, by contrast, doesn't consume production resources (network or application server bandwidth), freeing administrators to create full backups anytime they want without impacting production. Although consolidation does consume the backup server's resources and VDL/tape resources, these are typically not in use during normal business hours. Running consolidated full backups during normal hours also makes a "good" backup more likely, since its progress will probably be monitored (see Figure 2).
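The article does not spell out how consolidation works internally. One common way to think about it is as a merge of the last full backup with the incrementals taken since. The Python sketch below illustrates the idea on a simple catalog that maps file paths to versions. The catalog format, the use of None as a deletion marker, and the function name are all assumptions made for this example.

```python
def synthesize_full(last_full, incrementals):
    """Merge a full backup catalog with later incrementals.

    last_full:    dict of path -> version held in the last full backup
    incrementals: list of dicts, oldest first; a value of None marks a
                  file deleted since the previous backup (an assumption
                  of this sketch)
    Returns a new catalog describing the synthetic full backup.
    """
    catalog = dict(last_full)
    for incremental in incrementals:
        for path, version in incremental.items():
            if version is None:
                catalog.pop(path, None)   # file no longer exists
            else:
                catalog[path] = version   # newer copy replaces older one
    return catalog


full = {"/etc/hosts": 1, "/var/db": 1, "/tmp/old": 1}
incs = [
    {"/var/db": 2},                       # changed on day 1
    {"/tmp/old": None, "/home/new": 3},   # deleted and added on day 2
]
synthetic = synthesize_full(full, incs)
# synthetic == {"/etc/hosts": 1, "/var/db": 2, "/home/new": 3}
```

The merge reads only data already held by the backup server, which is why the clients and the production network are not involved.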

Before running a consolidated backup on a Linux system, users should first determine the VDL size, configuration, and location. How big will the library be? For a tight backup window with several clients backed up simultaneously, the VDL must be big enough to accommodate all the data for every client. If the server has a very large file system, clients may only need enough space in their VDL to handle one backup at a time.

Another factor to consider is VDL geometry in terms of drives and slots. The number of simultaneous backup jobs a client needs to run will dictate the number of drives required. The number of slots will be dictated by the total size of the defined VDL. Users should know how much free disk space is available before attempting to create the virtual library. Finally, there's the decision of where to create the VDL. This is usually done on the backup server. The client must also configure and test the actual physical library or tape drive so that it is ready for use.
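These sizing rules can be turned into a quick calculation. The Python sketch below is a rough planning aid, not a vendor formula. It assumes one drive per simultaneous job, a media file size chosen to stay within the operating system's maximum file size, and the eight-slot minimum mentioned earlier. The function names are illustrative.

```python
import math


def vdl_geometry(simultaneous_jobs, total_vdl_gb, media_gb, min_slots=8):
    """Return (drives, slots) for a planned virtual library.

    drives: one per simultaneous backup job
    slots:  enough media files of media_gb each to hold total_vdl_gb,
            but never fewer than min_slots
    """
    drives = simultaneous_jobs
    slots = max(min_slots, math.ceil(total_vdl_gb / media_gb))
    return drives, slots


def fits_on_disk(slots, media_gb, free_disk_gb):
    """Check that the pre-allocated media files fit in the free space."""
    return slots * media_gb <= free_disk_gb


# Five clients backed up at once, 50GB in total, 2GB media files.
drives, slots = vdl_geometry(simultaneous_jobs=5, total_vdl_gb=50, media_gb=2)
enough_space = fits_on_disk(slots, media_gb=2, free_disk_gb=60)
# drives == 5, slots == 25, enough_space is True
```

Because media files are pre-allocated, the free-space check should pass before the library is created, not while a backup is running.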

The Goal...Simplify Systems Management

The goal of any enterprise-wide storage system is to simplify systems management in heterogeneous environments. VDLs create a backup and restore option for scalable enterprise computing environments, one that allows IT staff to administer both tape-based and disk-based storage from a common GUI for better efficiency and lower cost. For a freely distributable, multiplatform operating system like Linux, VDL presents a viable storage management solution.

-SIDEBAR-

Data Protection Considerations

Below are some factors to consider when determining appropriate backup policies:
  • Time to recovery objectives
  • Total amount of data to be backed up
  • Backup window (i.e., amount of time in which to complete a backup)
  • Type and speed of network infrastructure (LAN, SAN, WAN)
  • Location of the data (Local/Remote)
  • Type and speed of backup media being used (disk vs. tape)
  • Length of time that you must keep data
  • Budget for media
  • Number of data files to be backed up

About Jet Martin
Jet Martin serves as director of product management for San Diego–based BakBone Software (www.bakbone.com; TSX: BKB; OTC Bulletin Board: BKBOF), an international data protection solution provider that develops and distributes data backup, restore, and disaster recovery software for network storage and open-systems environments worldwide.
