TechNote:
Hierarchical Storage Management
Hierarchical Storage Management (HSM) is becoming increasingly
important for maximizing the return on enterprise investments in
storage resources. HSM can be a part of, but not the total answer to,
Information Lifecycle Management (ILM). Often, high-performance storage
is used to hold data that is kept for reference only, or data
that is older and now infrequently used.
By implementing an automated system and policies for migrating
such data to less expensive disk storage, or to tape, the useful life
of existing primary storage capacity can be extended considerably. An
HSM system permits automated and transparent recovery of data from tape
or secondary disk storage upon demand, while retaining the protected
tape copy or copies locally or off-site.
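As a rough illustration of what a migration policy might look like, the
sketch below selects files that have not been accessed or modified for a
given number of days. It is a minimal sketch and not the logic of any
product named in this note: the function name
select_migration_candidates, the min_idle_days parameter, and the choice
of "last used" as the later of access and modification time are all
assumptions made for the example.

```python
import time
from pathlib import Path


def select_migration_candidates(root, min_idle_days, now=None):
    """Return (path, size) pairs for files idle longer than min_idle_days.

    Hypothetical policy: a file is "idle" if neither its access time nor
    its modification time falls within the last min_idle_days days.
    """
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # seconds per day
    candidates = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        st = path.stat()
        # Treat the later of access and modification time as "last used".
        last_used = max(st.st_atime, st.st_mtime)
        if last_used < cutoff:
            candidates.append((path, st.st_size))
    return candidates
```

A scheduler could pass the resulting list to whatever mechanism moves
data to secondary disk or tape. Note that some volumes are mounted with
access-time updates disabled, in which case only the modification time
is meaningful.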
Secondary disk storage can be built from Fibre Channel (FC)-connected
ATA or Serial ATA RAID arrays. Architectures that replicate data to
remotely sited tape are also possible.
SAN-Aware HSM
Integration of HSM capabilities in a SAN architecture offers many
advantages, but the HSM software must be SAN-aware. ADIC's StorNext
Management Suite allows a mixed OS environment to share common
access to a single set of files, and provides the best integration of
HSM and SAN capabilities today (http://www2.adic.com/stornext/).
StorNext supports a wide variety of operating systems, including
Windows, Solaris, Linux, and AIX. An ADIC case study is available on the ADIC website (http://www.adic.com/us/collateral/CS_AFRL.pdf).
It describes InfraStor's installation of StorNext in a 40TB SAN
environment along with a 250TB ADIC Scalar 10K tape library.
ADIC’s latest iPlatform i500 and i2000 libraries
provide integrated FC connectivity with “pay-as-you-grow” capacity
scalability from a department-level 34 cartridges to an
enterprise-capable 3,492 tape cartridges.
Conventional IP-based HSM
CaminoSoft's (www.caminosoft.com)
Managed Server HSM is an example of a more traditional HSM
implementation, one that scales easily from a single server to
many. The CaminoSoft Managed Server HSM is loaded
on a client application server and, through a set of user-managed
policies, migrates files to a central server resource. The secondary
server can be a NAS device, SAN-attached storage, or a second
CaminoSoft IP-connected device that manages a connection to a tape
library. CaminoSoft uses the Computer Associates media management
library, so that a wide range of devices can be supported.
Managed Server HSM is available for Windows and NetWare, and in fact
CaminoSoft's product provides the only HSM solution for NetWare today.
InfraStor is a CaminoSoft partner.
HSM and Email Archiving
HSM can also be part of an email archiving system; however, software
that combines indexing and archival capabilities with HSM needs to be
application-aware. For example, Symantec's
Enterprise Vault incorporates a policy-driven HSM capability to
migrate mail-server objects (Exchange and Lotus Notes) off the mail
server, leaving behind a "stub" pointer. The pointer allows
subsequent return of the migrated object to the mail server in an
automated fashion from secondary storage. Separately, Enterprise Vault
offers Journaling functions that are critical to maintaining Regulatory
Compliance.
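The "stub" pointer idea can be sketched at the file level as follows.
This is only an illustration under stated assumptions and does not
represent how Enterprise Vault or any mail server implements stubs: the
.hsmstub suffix, the JSON stub format, and the migrate and recall
function names are all hypothetical.

```python
import json
import shutil
from pathlib import Path

STUB_SUFFIX = ".hsmstub"  # hypothetical marker for stub files


def migrate(path, secondary_dir):
    """Move a file to secondary storage and leave a small stub behind."""
    path = Path(path)
    secondary_dir = Path(secondary_dir)
    secondary_dir.mkdir(parents=True, exist_ok=True)
    # Simplification: assumes file names are unique in secondary_dir.
    target = secondary_dir / path.name
    shutil.move(str(path), str(target))
    stub = path.with_name(path.name + STUB_SUFFIX)
    # The stub records only where the migrated copy now lives.
    stub.write_text(json.dumps({"location": str(target)}))
    return stub


def recall(stub_path):
    """Follow a stub to the migrated copy and restore the original file."""
    stub_path = Path(stub_path)
    location = Path(json.loads(stub_path.read_text())["location"])
    original = stub_path.with_name(stub_path.name[: -len(STUB_SUFFIX)])
    # Copy rather than move, so the secondary copy is retained.
    shutil.copy2(str(location), str(original))
    stub_path.unlink()
    return original
```

In a real product the recall step is triggered transparently when a user
or application opens the stub. Here it is an explicit function call to
keep the sketch short.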
HSM and File Indexing/Search Capabilities
An important adjunct to HSM as applied to data on a file server is
the ability to index files for later retrieval, even after they have
been migrated to secondary storage. As with HSM archival of email, the
HSM software needs to be application- and file-type-aware. For
example, a spreadsheet, PDF, or word-processing document may each
qualify for migration from primary to secondary storage. By indexing
the files as part of the migration process, the data is not only
automatically recoverable via the "stub" pointer, but can also be
found through an arbitrary search of the index and metadata created by
the full-text indexing process. This capability is offered by the Symantec
Enterprise Vault software, and is an important part of the vision for
ILM as well as a significant enabler of Regulatory Compliance.
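Indexing at migration time can be pictured as building an inverted
index that maps each word to the files containing it, so that a search
still finds a file after its content has left primary storage. The
sketch below is a minimal illustration, not the indexing method of any
product mentioned here. The MigrationIndex class and its methods are
hypothetical, and extracting text from spreadsheets, PDFs, or
word-processing files is assumed to have happened already.

```python
import re
from collections import defaultdict


class MigrationIndex:
    """Toy inverted index: word -> set of document identifiers."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index the extracted text of one file as it is migrated."""
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return identifiers of documents containing every query word."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result
```

If the document identifier is the path of the stub left on the file
server, a search hit leads directly to the pointer that recalls the
file from secondary storage.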
HSM and Backup
While HSM utilities typically maintain more than one copy of the
original data, manufacturers are careful to emphasize that HSM is not a
replacement for regular backup, and HSM systems are not designed to
facilitate complete recovery of a server's data. Reducing the size of
the data store on the source volume does, however, make backups much
faster. Regular backups of both the source volumes, including the
pointers, and the migrated data are essential.
Typically, HSM systems hold migrated data in logical container volumes
and require the pointers to be able to locate individual files.
Lower-frequency incremental backups of the migrated data can be
scheduled to follow the scheduled migration tasks.
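Why the pointers must be backed up along with the migrated data can be
seen in a small sketch of a container volume. This is an illustrative
model only, and the pack and extract functions and the catalog layout
are assumptions rather than any vendor's on-disk format: file contents
are concatenated into one container, and a catalog of offsets and
lengths plays the role of the pointers.

```python
import io


def pack(files):
    """Concatenate file payloads into one container and build a catalog.

    files maps a file name to its bytes. The catalog maps each name to
    an (offset, length) pair locating that file inside the container.
    """
    container = io.BytesIO()
    catalog = {}
    for name, data in files.items():
        catalog[name] = (container.tell(), len(data))
        container.write(data)
    return container.getvalue(), catalog


def extract(container, catalog, name):
    """Recover one file from the container using its catalog entry."""
    offset, length = catalog[name]
    return container[offset:offset + length]
```

Without the catalog, the container is an undifferentiated run of bytes
and individual files cannot be located, which is why a backup of the
migrated data alone is not sufficient.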
HSM and Continuous Data Protection
There are many similarities between Continuous Data Protection (CDP)
methods and the combined journaling and archiving systems that are
coupled with HSM; however, at present they remain separate
functions. Greater overlap between them can be expected to develop
in the future.
InfraStor Technologies can help select the right combination of
capabilities for implementing an HSM system, whether it is for a few
servers with Direct-Attached Storage or for many devices on a SAN with
requirements for shared access or indexed archival for Regulatory
Compliance.