Next-Generation Backup Technologies for Linux
An entirely new media management paradigm for Linux users
By: Jet Martin
Jul. 27, 2004 12:00 AM
Linux-based servers are fast becoming the low-cost alternative to higher-priced proprietary Unix/Windows environments. Finding a cost-effective storage management solution to support different environments can be a challenge. As enterprise IT environments grow and change, cost-conscious administrators are constantly on the lookout for more efficient ways to scale out storage configurations. So what's available to quickly and flexibly meet these pressing business needs?
A New Approach to Media Management
Enter the Virtual Disk Library (VDL, also referred to as Virtual Tape Library by many in the industry). A VDL solution allows users to emulate or "virtualize" a tape resource (tape, tape drive, tape library) on disk. When combined with the right backup application, a VDL enables the efficient use of disk technologies for disk-to-disk backups and restores. It looks and feels like a tape library, but it sets up an entirely new media management paradigm for Linux users. With VDL, tape becomes a strategic component of a data protection strategy, not its major element. Managers can create multiple duplications of backup jobs from the VDL to tape, or vice versa. Administrators can set up specific backup policies, such as retention dates, rotation schemes, and media groups. VDL also allows save sets to be accessed wherever they reside. Incremental saves can be sent to disk for fast restores. Storing backup data on a VDL allows data copy jobs to be run off-line, without impacting the network, application servers, or workstations. VDLs are immune to the mechanical afflictions of tape backup over a network, such as shoe-shining or a slow data stream host. They can capture data in drips or blasts, arriving at virtual media slots as a save set, which brings us to how VDLs are constructed.
Deconstructing the VDL
Based on a modular, object-oriented architecture, the software that runs VDL lets Linux administrators integrate tape backup and restore operations seamlessly with a variety of databases and messaging applications, storage devices, and storage area networks (SANs) (see Figure 1).
The VDL is basically a directory structure on disk consisting of directories called drives and slots. Numbered directories reside under each of these directories and each numbered directory defines a unique slot or drive. A media file resides in each slot-numbered directory. These are the virtual library's "tapes." Figure 2 shows the backup process with a VDL.
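The layout described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's actual on-disk format: the function name `create_vdl`, the file name `media`, and the sizes used are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def create_vdl(root: Path, drives: int, slots: int, media_bytes: int) -> None:
    """Create a hypothetical VDL layout: drives/<n>/ and slots/<n>/media."""
    # One numbered directory per virtual drive.
    for n in range(1, drives + 1):
        (root / "drives" / str(n)).mkdir(parents=True, exist_ok=True)
    # One numbered directory per slot, each holding one media file ("tape").
    for n in range(1, slots + 1):
        slot_dir = root / "slots" / str(n)
        slot_dir.mkdir(parents=True, exist_ok=True)
        with open(slot_dir / "media", "wb") as f:
            # Sets the file size up front. On many filesystems this yields a
            # sparse file; os.posix_fallocate() could reserve real blocks.
            f.truncate(media_bytes)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "vdl"
        create_vdl(root, drives=2, slots=8, media_bytes=1024 * 1024)
        for path in sorted(root.rglob("*")):
            print(path.relative_to(root))
```

Running the script prints the resulting tree, with the numbered drive directories alongside the numbered slot directories and their media files.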
The virtual libraries are viewed as real physical libraries. The more drives the VDL contains, the more simultaneous backups can be performed. Virtual libraries always have many more slots than drives and are usually configured with a minimum of eight slots. Having extra slots allows for the proper handling of backup retention cycles. In addition, different operating systems may impose limits on the maximum file size, which can affect the number of slots needed. When the system is configured for Linux and the number of slots and media capacity are defined, media files are created and the space is pre-allocated.
Administrators can also install application plug-in modules for Oracle, MySQL, Sybase, PostgreSQL, Informix, and various other applications. The modules automatically add application-specific components to the backup and restore selection criteria that appear on the system's graphical user interface. From this common GUI, Linux administrators can manage all backup and restore operations across a storage area network (SAN), network attached storage (NAS), wide area network (WAN), or local area network (LAN).
When to Use a VDL
VDL staging can be useful in two areas. If a company has a huge file system with millions of files, a typical server might not be able to read these files fast enough to stream today's high-performance tape drives. This can lead to shoe-shining and premature drive or media failures. There's no shoe-shining with disk storage, so there's no downside from slow performance when backing up to a VDL.
On the other hand, if the backup window is too small to back up several clients onto a limited number of tape drives, a VDL with enough virtual drives could back up all clients simultaneously. Performance here would hinge on network bandwidth, requiring a gigabit network to handle the load. For example, suppose a Linux user wants to back up five clients, each with 10GB of data, within one hour, but has only one tape drive performing at 18GB/hour. The 50GB total would take nearly three hours on that drive, so the backup window is too small. With disk staging, the user could define enough virtual tape drives to back up all the clients within the hour, then copy the data to physical tape afterward.
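The arithmetic behind the five-client example can be checked with a short script. It is a rough sketch under two assumptions that the article does not state: each drive serves one client at a time, and each virtual drive sustains at least the 18GB/hour of the physical drive. The function names are hypothetical.

```python
import math


def window_hours(clients: int, gb_per_client: float,
                 gb_per_hour_per_drive: float, drives: int) -> float:
    """Hours needed when each drive backs up one client at a time."""
    rounds = math.ceil(clients / drives)  # batches of simultaneous backups
    return rounds * gb_per_client / gb_per_hour_per_drive


def required_mbit_per_s(total_gb: float, hours: float) -> float:
    """Aggregate network rate needed to move total_gb (decimal GB) in hours."""
    return total_gb * 8 * 1000 / (hours * 3600)


if __name__ == "__main__":
    one_tape = window_hours(5, 10, 18, drives=1)
    five_virtual = window_hours(5, 10, 18, drives=5)
    print(f"one tape drive:      {one_tape:.2f} hours")
    print(f"five virtual drives: {five_virtual:.2f} hours")
    print(f"network rate for 50GB in 1 hour: {required_mbit_per_s(50, 1):.0f} Mbit/s")
```

The last figure comes out above 100 Mbit/s, which is consistent with the point that a gigabit network is needed to carry the combined load.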
VDL Staging vs. Multiplexing
There are essentially two approaches to backing up multiple clients to a limited number of tape drives in a short backup window: multiplexing and staging.
In multiplexing, multiple streams of backup data are sent to one tape device. This has several drawbacks. For one, a backup of any given client will span more tape than is actually required, which can mean handling multiple tapes per client backup. There's also a higher probability of failure if one of the media fails, since more media is used for any given backup. Restores take longer because more tape needs to be scanned for a given restore, and the data must be reconstructed from multiple data streams. Multiplexing also uses more CPU time on the backup server because data streams must be reorganized and packed into a multiplexed stream. This can create performance problems with today's high-speed tape devices.
With the VDL staging approach, extra disk space is required for the virtual library resource allocation. But each client's backups are always contiguous on tape, which uses less tape and speeds up restores.
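A toy model makes the difference in tape layout concrete. The sketch below treats a tape as a list of (client, block number) pairs and assumes simple round-robin interleaving for multiplexing. Real backup products use their own block formats and scheduling, so the names and the model are illustrative only.

```python
from itertools import zip_longest


def multiplex(streams: dict) -> list:
    """Interleave the clients' blocks round-robin onto one tape."""
    tape = []
    for group in zip_longest(*streams.values()):
        tape.extend(block for block in group if block is not None)
    return tape


def staged(streams: dict) -> list:
    """Write each client's blocks contiguously, one client after another."""
    tape = []
    for blocks in streams.values():
        tape.extend(blocks)
    return tape


def restore_span(tape: list, client: str) -> int:
    """Number of tape blocks that must be passed over to restore one client."""
    positions = [i for i, (owner, _) in enumerate(tape) if owner == client]
    return positions[-1] - positions[0] + 1


if __name__ == "__main__":
    streams = {c: [(c, i) for i in range(4)] for c in ("a", "b", "c")}
    print("multiplexed span for client a:", restore_span(multiplex(streams), "a"))
    print("staged span for client a:     ", restore_span(staged(streams), "a"))
```

In this model, restoring one client from the multiplexed tape means passing over the other clients' blocks as well, while the staged copy keeps each client's blocks together.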
Consolidating File System Backups
Combining VDLs and tape for backup opens up a variety of different strategies that answer the need for increased data protection, faster backups and restores, and even reduced data vulnerability through multiple copies. A key strategy involves consolidating file system backups.
Consolidated backups let users create a "synthetic" full backup without running a weekly full backup. Full backups are very resource intensive. They can consume a considerable amount of network bandwidth (especially when backing up across the LAN) and server bandwidth that may be better used elsewhere. Consolidating a file system backup also won't consume system resources (network or application server bandwidth), freeing administrators to run full backups anytime they want without impacting production. Although it consumes the backup server's resources and VDL/tape resources, these are typically not in use during normal business hours. Running consolidated full backups makes it easier to run backups during normal hours, resulting in a "good" backup, since its progress will likely be monitored (see Figure 2).
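The article does not describe how consolidation works internally. As a general illustration of the idea, a synthetic full can be built on the backup server by replaying incremental backups over the last full backup. The sketch below assumes a hypothetical catalog format, a mapping from file path to file version in which `None` marks a file deleted since the previous backup.

```python
def synthesize_full(full: dict, incrementals: list) -> dict:
    """Merge the last full backup with later incrementals, oldest first."""
    result = dict(full)  # leave the original full backup untouched
    for incremental in incrementals:
        for path, version in incremental.items():
            if version is None:
                result.pop(path, None)  # file was deleted
            else:
                result[path] = version  # file was added or changed
    return result


if __name__ == "__main__":
    sunday_full = {"/etc/hosts": "v1", "/var/log/app.log": "v1"}
    weekday_incrementals = [
        {"/etc/hosts": "v2"},
        {"/var/log/app.log": None, "/home/user/notes.txt": "v1"},
    ]
    print(synthesize_full(sunday_full, weekday_incrementals))
```

Because the merge reads only data already held on the VDL or tape, no client or application server has to be contacted, which matches the article's point about not consuming production resources.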
Before running a consolidated backup on a Linux system, users should first determine the VDL size, configuration, and location. How big will the library be? For a tight backup window with several clients backed up simultaneously, the VDL must be big enough to accommodate all the data for every client. If the server has a very large file system, clients may only need enough space in their VDL to handle one backup at a time.
Another factor to consider is VDL geometry in terms of drives and slots. The number of simultaneous backup jobs a client needs to run will dictate the number of drives required. The number of slots will be dictated by the total size of the defined VDL. Users should know how much available free disk space they have before attempting to create the virtual library. Finally, there's the decision of where to create the VDL. This is usually done on the backup server. The actual physical library or tape drive must also be configured, tested, and confirmed ready for use by the client.
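These sizing rules can be turned into a rough calculator. The sketch below assumes one drive per simultaneous job, slots derived from total VDL size divided by the capacity of one media file, and the eight-slot minimum mentioned earlier in the article. The function names and the example figures are hypothetical.

```python
import math
import shutil


def vdl_geometry(simultaneous_jobs: int, total_gb: float,
                 media_gb: float, min_slots: int = 8) -> tuple:
    """Return (drives, slots) for a VDL of total_gb built from media_gb files."""
    drives = simultaneous_jobs  # one virtual drive per concurrent backup
    slots = max(min_slots, math.ceil(total_gb / media_gb))
    return drives, slots


def fits_on_disk(path: str, slots: int, media_gb: float) -> bool:
    """Check free space before creating the pre-allocated media files."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return slots * media_gb <= free_gb


if __name__ == "__main__":
    drives, slots = vdl_geometry(simultaneous_jobs=5, total_gb=50, media_gb=2)
    print(f"drives={drives}, slots={slots}")
    print("fits on current volume:", fits_on_disk(".", slots, media_gb=2))
```

Here `media_gb` would be chosen to stay under any maximum file size imposed by the operating system or filesystem, which is why a smaller limit leads to more slots.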
The Goal: Simplify Systems Management
The goal of any enterprise-wide storage system is to simplify systems management in heterogeneous environments. VDLs create a backup and restore option for scalable enterprise computing environments, one that allows IT staff to administer both tape-based and disk-based storage from a common GUI for better efficiency and lower cost. For a freely distributable, multiplatform operating system like Linux, VDL presents a viable storage management solution.
Data Protection Considerations
Below are some factors to consider when determining appropriate backup policies: