The Ever Growing Challenges of Data Storage

Electronic data storage needs continue to grow. As companies produce more information in electronic format, storage space is becoming increasingly important. Managing data storage for performance, integrity, and scalability is the next summit in Information Technology management and planning.

To get the best performance and reliability from any storage space, strategic storage planning is essential. While much time is spent in equipment planning, just as much effort should be applied towards researching operating systems and their available file systems.

Let’s put things into perspective – how much space is 1TB? (see Table)

To that end, there are numerous storage options available today. There are so many, in fact, that the range of choices can contribute to confusion, and even data loss, in IT departments. Why are large amounts of data challenging? A single storage area with millions of files may create a ‘black hole’ of data, where files go in but are almost never found again. Users can become frustrated with file retrieval performance because the file system has to ‘find’ a single file among hundreds of thousands of others. Understanding what the file system is doing, and how, makes proactive choices easier during the planning stage of large storage systems.

File System Considerations


During server planning, more time and research are spent on hardware, data space requirements, and application specifications than on how the data will be stored. The file system can become a low priority during the planning stages of a file or data server because it is inherent to the operating system, and it is sometimes assumed that the default file system is the best fit. However, your storage requirements may call for a more robust method of organizing data on the hard disk(s). Investigate whether the operating system you plan to use allows other file systems to be used.

If you have a choice of file systems, here are some requirements to consider:

  • Volume Size
  • Estimated Number of Files on the Volume
  • Estimated Size of Files on the Volume
  • Shared Volume Requirements
  • Backup Requirements

Volume Size
Volume size is an important place to start planning. However, it is only a start, since strategic planning also involves scalability: can the volume grow as the need arises without interrupting service to users? The axiom that data expands to fill the available space is all too true for data volumes. It is not uncommon to add a terabyte of storage and find it half full six months later.

Two terabytes (2TB) has become the initial hurdle for many file systems. The limit starts with the SCSI command set being limited to 32-bit logical block addressing: a single SCSI LUN using a 512-byte block size cannot address more than 2TB (2^32 blocks × 512 bytes). File systems used on these systems have been ‘adjusted’ to handle extremely large volumes. However, volumes that are nearing the 2TB limit may be stressing the limits of the file system.
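
The arithmetic behind the 2TB figure can be checked directly. The following is a minimal sketch, not taken from any SCSI tooling; the function name max_addressable_bytes is made up for illustration, and the 4,096-byte case is included only as a hypothetical comparison.

```python
def max_addressable_bytes(lba_bits: int, block_size: int) -> int:
    """Largest capacity reachable with lba_bits-wide block addresses."""
    return (2 ** lba_bits) * block_size


# 32-bit logical block addressing with 512-byte blocks (the case in the text).
limit = max_addressable_bytes(32, 512)
print(limit)             # 2199023255552 bytes
print(limit / 2 ** 40)   # 2.0 (binary terabytes)

# Hypothetical comparison: the same 32-bit addressing with 4,096-byte blocks.
print(max_addressable_bytes(32, 4096) / 2 ** 40)  # 16.0
```

Counted in decimal units (1TB = 1,000,000,000,000 bytes), the same limit is roughly 2.2 trillion bytes.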

How Much Is 1TB?

Unit           | Number of Bytes         | What That Relates To
1 Byte         | 1 byte                  | One character (letter or number)
1KB (Kilobyte) | 1,000 bytes             | 3 or 4 typed manuscript-style pages
1MB (Megabyte) | 1,000,000 bytes         | Average size of a novel (300-400 pages) or 1 diskette
1GB (Gigabyte) | 1,000,000,000 bytes     | Approximately 20 sets of encyclopedias
1TB (Terabyte) | 1,000,000,000,000 bytes | A small library (approx. 5,000 books)

Estimated Number of Files on the Volume
The next item to plan for is the number of files that could potentially be stored on the volume. The file system uses metadata (data about data) to describe the files that are stored. This means a certain amount of volume space will be consumed by the file system just to manage the files that are there.
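
To get a rough feel for this overhead, multiply the expected number of files by the space the file system reserves per file. The sketch below assumes a fixed per-file record size; the 256-byte and 1,024-byte figures and the ten-million-file count are illustrative assumptions, not measurements, and real file systems also spend space on directories, journals, and allocation maps that this ignores.

```python
def metadata_overhead_bytes(file_count: int, bytes_per_record: int) -> int:
    """Estimate the space used by per-file metadata records alone."""
    return file_count * bytes_per_record


FILE_COUNT = 10_000_000  # assumed number of files on the volume

# Two assumed per-file record sizes, chosen only for illustration.
for record_size in (256, 1024):
    overhead = metadata_overhead_bytes(FILE_COUNT, record_size)
    print(f"{record_size} bytes per file -> {overhead / 10**9:.2f} GB of metadata")
```

Under these assumptions the estimate ranges from about 2.56 GB to 10.24 GB before a single byte of user data is stored.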

File systems that are not built for excessively large directories will slow down applications that access them. This can adversely affect users that have thousands of files on a volume that has millions of files.
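
One way to find out whether an existing volume already has excessively large directories is to count the files in each one. The following is a small sketch using only the Python standard library; the function name largest_directories and the default of five results are arbitrary choices for illustration.

```python
import os
from typing import List, Tuple


def largest_directories(root: str, top: int = 5) -> List[Tuple[str, int]]:
    """Return the `top` directories under `root` holding the most files."""
    counts = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Count only the files directly inside this directory.
        counts.append((dirpath, len(filenames)))
    counts.sort(key=lambda item: item[1], reverse=True)
    return counts[:top]
```

Calling largest_directories on the mount point of a volume lists the directories most likely to slow down applications. Walking millions of files takes time, so a scan like this is best run outside peak hours.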

Estimated Size of Files on the Volume
The next consideration is the size of the files that will be on the volume. Organizations running large database servers usually need to pre-allocate very large files, in the gigabyte range. The file system and operating system need to be able to handle this level of input and output. For these enterprise systems, expectations are high for performance and integrity. Will the file system be able to handle those extremely large files?

Shared Volume Requirements
Many organizations today have mixed environments. Some may have three or four different computing platforms, from mainframe systems to 64-bit Sun machines and from Apple desktops to Intel-based machines. Some of these systems may share storage space. Will the volume support mixed data types? Additionally, will the operating system that manages the file system allow different types of data streams to be accessed simultaneously?

Backup Requirements
Backup requirements are becoming one of the most important aspects of planning large storage systems. Despite all the advancements in storage technology, only about 20%* of backup jobs are successful (*according to Enterprise Strategy Group). Large storage systems with millions of files present a serious data archiving challenge: backing up the data. There may be so much of it that a successful backup could take days. Consequently, if there were a disaster, the archived data may already be out of date, or will be out of date by the time the restoration is completed.
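
A back-of-the-envelope calculation shows how a backup can stretch into days. This sketch simply divides the amount of data by a sustained throughput; the 50TB volume and the 100 MB per second rate are assumed figures for illustration, and real jobs with millions of small files often run slower than raw throughput suggests.

```python
def backup_duration_hours(data_bytes: float, throughput_bytes_per_sec: float) -> float:
    """Hours needed to copy data_bytes at a constant throughput."""
    return data_bytes / throughput_bytes_per_sec / 3600


TB = 10 ** 12  # decimal terabyte
MB = 10 ** 6   # decimal megabyte

# Assumed example: a 50TB volume backed up at a sustained 100 MB per second.
hours = backup_duration_hours(50 * TB, 100 * MB)
print(f"{hours:.1f} hours ({hours / 24:.1f} days)")  # 138.9 hours (5.8 days)
```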

One solution is to use file systems with snapshot technology that works together with the backup software. This technology saves critical file system metadata (file name, size, multiple times and dates, and security details such as Read/Write/Execute/Delete privileges) that can assist in locating where the files are stored. Another solution is to run incremental file backups regularly. Incremental backups ensure that your files are backed up on a regular schedule, so there is a smaller window of unprotected data. Combining both of these backup methods creates a better system of data archiving.
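
The core idea of an incremental backup, copying only what has changed since the previous run, can be sketched in a few lines. This is a simplified illustration and not how any particular backup product works: it relies on file modification times alone, does not notice deleted files, and the function name files_changed_since is hypothetical.

```python
import os
from typing import Iterator


def files_changed_since(root: str, last_backup_time: float) -> Iterator[str]:
    """Yield paths under `root` modified after `last_backup_time`.

    `last_backup_time` is a Unix timestamp, such as the value
    time.time() returned when the previous backup started.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_mtime > last_backup_time:
                    yield path
            except OSError:
                # The file vanished or is unreadable; skip it in this sketch.
                continue
```

Only the paths this generator yields would need to be copied, which is why an incremental run usually finishes far sooner than a full backup of the same volume.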

Recovery Capabilities for Large Volumes


Despite the best planning, failures do happen. The best way to avoid data loss disasters with large storage systems is to have a relationship with a well-respected data recovery company. The best data recovery companies monitor the technological advancements in the storage industry, including research and development in new hardware and file systems.

Top providers have their own software development staff to create proprietary recovery software for their data recovery labs. For example, due to the increase in large, multi-terabyte volumes, the best recovery companies have updated their tools to handle volumes beyond the 2TB barrier.

Remote recovery technology has become the standard recovery process for large volumes because shipping in several drives is impractical. Even if a RAID configuration is lost or if one drive has failed, remote recovery can retrieve the original data. Not all recovery companies have this capability, however, so it is important to look for providers that can offer this unique service.

As mentioned previously, terabyte volumes are becoming more common. If you have errors or problems accessing large, multi-terabyte volumes, call a professional recovery service for assistance immediately. A qualified engineer will discuss all of the options available to make the volume accessible as quickly as possible.

Your storage needs will continue to grow – so partnering with a data recovery company for recovery services means that you’ll be protected if there is ever a data disaster.

Contributed by Jim Reinert, director of software and services at Ontrack Data Recovery

One Comment to “The Ever Growing Challenges of Data Storage”

  1. Menk
    August 13th, 2009 at 11:43 am

    The hospital management must make it sure that all the files of their patients are documented properly and that they should have a backup copy of them.