RAID as an Alternative to Deduplication

Better storage software can provide features that deliver superior capacity efficiencies, offering an alternative to deduplication. In this series, we will examine how better storage software can deliver these efficiencies without negatively impacting performance, raising costs, or placing data at risk. This first article looks at an unexpected source, RAID, as an alternative to deduplication. In part two, we’ll cover how capabilities like high per-drive performance, next-generation snapshot technology, and advanced tiering technology can further improve efficiencies. Finally, we’ll conclude with deduplication’s future to see whether it has a role in storage infrastructures over the next two years.


Understanding the Total Cost of Dedupe

The primary goal of using deduplication on primary storage systems is to make advanced storage technologies like flash SSDs more affordable. A 3:1 efficiency rate enables 100TBs of storage to look like 300TBs of storage. The problem is that delivering a 3:1 effective rate requires high-end CPUs and more RAM so the algorithm can work without noticeably impacting performance. The capacity savings have to be enough to offset the cost of the additional hardware. When flash storage was $14 per GB, justifying the cost of additional hardware resources was easy. Today, with flash at $0.30 per GB, it is almost impossible. To learn more about the total cost of deduplication, read our white paper, “Exposing the High Cost of Deduplication.”
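The economics argument above can be sketched in a few lines. The 3:1 ratio and the $14 and $0.30 per-GB flash prices come from the text; the $1/GB hardware premium for the extra CPUs and RAM is an assumed, purely illustrative figure:

```python
def effective_cost_per_gb(raw_cost_per_gb, dedupe_ratio, hw_premium_per_gb):
    """Effective $/GB after dedupe: the raw media cost plus the extra
    CPU/RAM hardware cost, spread over the deduplicated capacity."""
    return (raw_cost_per_gb + hw_premium_per_gb) / dedupe_ratio

# When flash cost $14/GB, a 3:1 ratio easily absorbed an assumed $1/GB
# hardware premium:
print(effective_cost_per_gb(14.00, 3, 1.00))  # 5.0 -- far below $14 raw

# At today's $0.30/GB, the same premium makes dedupe a net loss:
print(effective_cost_per_gb(0.30, 3, 1.00))   # ~0.43 -- above $0.30 raw
```

Whatever premium you assume, the point holds: as raw flash prices fall, the fixed hardware cost of running the dedupe algorithm consumes more of the savings.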

Fast Rebuild Speed is Better than Dedupe

Most primary storage systems on the market today use traditional RAID algorithms to protect against media failure. An inefficient RAID algorithm takes away much of the capacity gains that deduplication claims to deliver. The number one reason is slow rebuild times, even on AFAs. While very few vendors report their rebuild times, customers repeatedly tell us that the time it takes to return to a protected state after a media failure is measured in multiple hours. With high-capacity flash drives, we’ve seen reports of competitors’ systems taking ten or more hours. Our customers measure StorONE’s vRAID rebuild times in single-digit minutes. vRAID is an ideal alternative to deduplication because it saves real capacity instead of “mathematical” capacity.


What does rebuild speed have to do with capacity efficiency? Slow rebuilds force you to over-provision. If you know you will face double-digit hours in the rebuild process, you also know the chances of another drive failing during that window increase dramatically. Also, TLC flash, and especially QLC flash, is vulnerable to continuous write IO, which is a big part of a rebuild, so your newly deployed hot spare is at particular risk along with the rest of the working set. With single-parity protection, a second drive failure means total data loss and a recovery from backup copies.
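A rough way to see how the rebuild window drives this risk: model drives as failing independently at a constant annualized rate. The pool size and 2% annual failure rate below are assumed, illustrative numbers, not figures from the text:

```python
import math

def p_second_failure(remaining_drives, annual_failure_rate, rebuild_hours):
    """Probability that at least one more drive fails during the rebuild
    window, modeling drives as independent with a constant failure rate."""
    hourly_rate = annual_failure_rate / (365 * 24)
    return 1 - math.exp(-remaining_drives * hourly_rate * rebuild_hours)

# Hypothetical 24-drive pool (23 surviving drives), 2% annual failure rate
slow = p_second_failure(23, 0.02, rebuild_hours=10)      # ten-hour rebuild
fast = p_second_failure(23, 0.02, rebuild_hours=5 / 60)  # five-minute rebuild
print(f"ten-hour rebuild:    {slow:.4%}")
print(f"five-minute rebuild: {fast:.4%}")
```

Because the exposure scales roughly linearly with the rebuild window, cutting a ten-hour rebuild to five minutes cuts the chance of a second failure during it by about two orders of magnitude, before even accounting for the extra write stress a rebuild puts on TLC and QLC flash.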


To overcome the risk of a double drive failure, most IT planners will utilize a double-parity or even triple-parity RAID technique. Because of increases in drive densities, you are now dedicating 32-48TBs of capacity per RAID group (assuming 16TB flash drives) just to provide redundancy.

StorONE’s vRAID rebuilds volumes made up of flash drives in less than five minutes while other production IO operations continue. Our customers rarely use more than single-drive parity in their flash-based volumes because they know they will be back in a protected state in less than five minutes. Additionally, because we leverage a high-performance form of erasure coding, most of our customers see only a 15% to 20% capacity overhead for redundancy.
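The parity math is simple to sketch. The 16TB drive size comes from the text; the 10-drive group size is an assumed, illustrative configuration:

```python
def parity_overhead(total_drives, parity_drives, drive_tb):
    """TB and fraction of raw capacity consumed by parity in a RAID group."""
    lost_tb = parity_drives * drive_tb
    fraction = parity_drives / total_drives
    return lost_tb, fraction

# Hypothetical 10-drive groups of 16TB flash drives
for parity in (1, 2, 3):
    lost, frac = parity_overhead(10, parity, 16)
    print(f"{parity}-parity: {lost}TB reserved ({frac:.0%} of raw capacity)")
```

Each additional parity drive you add to tolerate slow rebuilds costs another full drive’s worth of capacity per group, which is exactly the capacity that fast rebuilds and single-drive parity give back.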

Eliminating Hot Spares is Better than Dedupe

To protect you from data loss, traditional RAID needs a replacement drive available before it can even start the rebuild process. Enterprise storage systems ensure those reserves are ready through the use of global hot spares. Most data centers allocate two hot spares per media type and size. For example, suppose a system has 16TB flash drives and 8TB flash drives. In that case, the customer will allocate two spare drives for each media size, because most traditional RAID won’t allow mixing drive capacities within a given group. In this example, the typical customer is dedicating 48TBs of capacity just to hot spares.
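The 48TB figure follows directly from the example in the text (two spares for each of the two media sizes):

```python
# Two global hot spares per media size, per the example in the text
hot_spares = {16: 2, 8: 2}  # drive capacity in TB -> spares reserved

reserved_tb = sum(size * count for size, count in hot_spares.items())
print(f"{reserved_tb}TB of capacity sits idle as hot spares")  # 48TB
```

Every additional media size in the system adds another pair of idle drives, so the spare overhead grows with configuration diversity, not with the amount of data protected.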

At StorONE, our vRAID does not require hot spares, and it can mix media capacities within volumes. If a drive fails, we simply rebalance the failed drive’s data across the remaining drives. Eliminating hot spares means every drive contributes its full capacity while still supporting rebuilds, and it makes future expansion easy. With StorONE, add the highest-density, most cost-effective flash drives on the market and enjoy their full capacity.

Conclusion

Deduplication is no longer a must-have; in fact, it may be a “better off without it” feature. Better RAID can give back hundreds of TBs of capacity in areas where deduplication can’t. vRAID is just one example of how better storage software can provide an alternative to deduplication and improve capacity efficiency. In part two of this series, we cover how technologies like high performance per drive, advanced snapshots, and intelligent auto-tiering can fulfill deduplication’s promises without the performance and data integrity risks.

George Crump

George has over 25 years of experience in the storage industry, holding executive sales and engineering positions. Before joining StorONE, he was the founder and lead analyst at Storage Switzerland.

