StorONE Blog

The High Cost of Slow RAID Rebuilds

The biggest concern with slow rebuilds is that they place your organization’s data at risk, but most IT professionals overlook how much slow RAID rebuilds also cost. A RAID rebuild is the ultimate test of a storage system’s efficiency. Can the storage system handle the rebuild workload, complete the process quickly, and return to a protected state while continuing to serve existing workloads without degrading performance?


Stop Blaming Hard Disks for Slow RAID Rebuilds

The answer for most storage systems is no: they can’t balance rapid recovery with acceptable application performance. Their vendors are quick to blame hard disk drives, but the hardware is not at fault, as our recent rebuild test shows. Last quarter we released vRAID 2.0 and, as Chris Mellor at Blocks & Files picked up on, our RAID rebuild times were 5X faster than any of our competitors’. StorONE has now publicly verified that it can complete a RAID rebuild of a volume made up of 14TB hard drives in less than 1 hour and 45 minutes. Most vendors take three days to complete the same process. Our hard disk rebuild time is also faster than most vendors’ flash rebuild times. It’s worth noting that we can rebuild a flash volume, even one with 15TB drives, in less than five minutes. Most vendors, if they support 15TB SSDs at all, take 8-12 hours to rebuild the volume.
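To put the hard disk figures in perspective, here is a rough back-of-the-envelope calculation of the average rebuild rate each time implies. It is only a sketch: it assumes decimal terabytes and a rebuild that rewrites the full 14TB, neither of which is stated in the test results, and the function name is made up for illustration.

```python
# Average rebuild rate implied by the rebuild times quoted above.
# Assumptions (not from the test results): 1 TB = 10**12 bytes, and the
# rebuild rewrites the full capacity of the failed drive.

TB = 10**12

def implied_rate_mb_per_s(capacity_tb: float, hours: float) -> float:
    """Average rate in MB/s needed to rewrite capacity_tb within the given hours."""
    return capacity_tb * TB / (hours * 3600) / 10**6

hdd_fast = implied_rate_mb_per_s(14, 1.75)  # 14TB drive in 1 hour 45 minutes
hdd_slow = implied_rate_mb_per_s(14, 72)    # same drive over three days

print(f"1h45m rebuild: {hdd_fast:,.0f} MB/s")
print(f"3-day rebuild: {hdd_slow:,.0f} MB/s")
```

Under those assumptions, the shorter rebuild works out to roughly 2,200 MB/s on average, against roughly 54 MB/s for the three-day rebuild.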


Slow RAID Rebuilds Modify Behavior

Suppose you are an IT professional facing rebuild times of double-digit hours for mission-critical volumes. You will rearchitect your storage design to mitigate the risk of a second or third drive failure during the rebuild process. If you are using hard disk drives and facing rebuild times of multiple days, you will want more than three redundant drives. These modifications are where the high cost of slow RAID rebuilds hits the IT budget.


The Media Cost of Slow Rebuilds

The first cost of slow RAID rebuilds is that they force you to use drives that are not optimized for cost per GB or data center floor space. Because of the egregiously slow rebuild times of hard disk-based volumes, many customers feel they have no choice but to abandon the technology and use flash. The problem is that the slow rebuild challenge now haunts all-flash arrays (AFAs) as well. Increasing rebuild times on AFAs mean that vendors are forcing their customers to use 4TB or 8TB SSDs even though 15.3TB drives are readily available and 30TB drives are now on the market.

Traditional AFA vendors need customers to use these smaller drives so that the drive count stays high enough to meet performance requirements. They also need to keep drive capacities small to avoid exposing their rebuild inefficiencies to customers. No vendor other than StorONE publishes its rebuild test results. However, former customers of those vendors have told us they saw flash rebuild times in the 8 to 12-hour range on 4TB-8TB drives in their previous systems.

With StorONE, you can adopt the highest-density flash and hard disk drives as soon as they come to market, without concern over rebuild times or performance loss. Our efficient storage engine delivers fast RAID rebuilds no matter the drives’ density.


The Redundancy Cost of Slow Rebuilds

The second cost of slow rebuilds is a redundancy cost. Again, IT professionals facing double-digit hours or days of rebuild time will allocate more drives to redundancy. The chances of another drive failure during these prolonged rebuild processes are high enough that you have to prepare for it. Dual drive redundancy is a must for most customers, and some vendors are even promoting their “new” capability of triple drive redundancy.

With StorONE’s vRAID, you can have an almost unlimited number of redundant drives, but you won’t need them because of our fast rebuild times. The chances of another SSD failing during a 3-5 minute flash rebuild are minimal. The chances of a second hard drive failure during a two-hour rebuild are almost as low. Many of our customers run their flash volumes with only a single redundant drive and their hard disk volumes with only two-drive redundancy. The key is that the choice is yours. You can set the redundancy to a level you are comfortable with, rather than accept a decision forced on you by inefficient storage software.
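To see how strongly the rebuild window drives this risk, consider a minimal sketch of the probability that at least one surviving drive fails before the rebuild finishes. The model is an assumption, not StorONE’s method: drives fail independently at a constant rate, and the 2% annualized failure rate and 23 surviving drives are hypothetical values chosen only for illustration. Real arrays also face correlated failures and read errors that this ignores.

```python
import math

HOURS_PER_YEAR = 8760.0

def p_additional_failure(surviving_drives: int, afr: float, rebuild_hours: float) -> float:
    """Probability that at least one surviving drive fails during the rebuild,
    assuming independent drives with a constant (exponential) failure rate."""
    # Convert the annualized failure rate into a per-drive hourly rate.
    hourly_rate = -math.log(1.0 - afr) / HOURS_PER_YEAR
    return 1.0 - math.exp(-surviving_drives * hourly_rate * rebuild_hours)

# Hypothetical inputs, for illustration only.
SURVIVING_DRIVES = 23
AFR = 0.02

for label, hours in [("5-minute flash rebuild", 5 / 60),
                     ("2-hour HDD rebuild", 2),
                     ("12-hour flash rebuild", 12),
                     ("3-day HDD rebuild", 72)]:
    p = p_additional_failure(SURVIVING_DRIVES, AFR, hours)
    print(f"{label:24s} {p:.4%}")
```

In this model, the exposure grows almost linearly with the rebuild time when the window is short, so a rebuild that takes 36 times longer carries close to 36 times the risk of a further failure.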


The Hot Spare Cost of Slow Rebuilds

Legacy RAID technology depends on hot spares. These are drives that sit on standby for when an active drive fails in your array. Hot spares prevent you from having to race to the data center so a rebuild can start, but they don’t avoid an unplanned trip to the data center. A hot spare that has taken the place of a failed drive must itself be replaced as soon as possible, so IT professionals still need to get to the data center quickly to restore the full complement of hot spares. Obviously, there is a management and operational cost attached to this process.

There is also a hard cost to requiring hot spares, especially as drive capacities increase. It is best practice to have at least two hot spares per media type per system. Because single-dimensional storage software leaves most customers needing a separate storage system for every use case, hundreds of terabytes of additional capacity are sidelined.

StorONE’s vRAID empowers you to use high-density drives, but we realized that if we gave you the ability to use 18TB HDDs and 30TB SSDs, you wouldn’t want to sideline that much storage for standby only. With vRAID, all drives are active all the time. When a drive fails, we bring each volume that had data on that drive back to a protected state (in record time) by redistributing data across the remaining available drives. The result is typically 100TB or so of capacity savings per storage system, no unplanned trips to the data center, and no data loss risk.
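The capacity figure is easy to sanity-check. The short sketch below adds up the capacity that hot spares would hold idle, using the two-spares-per-media-type practice and the 18TB and 30TB drive sizes mentioned in this section. It assumes a single system containing both media types, and the function name is made up for illustration.

```python
# Capacity held idle by hot spares in one system.
# Drive sizes and spare count come from the text; the assumption is that
# one system holds both media types.

SPARES_PER_MEDIA_TYPE = 2
DRIVE_TB = {"HDD": 18, "SSD": 30}

def idle_spare_capacity_tb(drive_tb: dict[str, int], spares: int) -> int:
    """Total terabytes sitting on standby across all media types."""
    return sum(spares * size for size in drive_tb.values())

total = idle_spare_capacity_tb(DRIVE_TB, SPARES_PER_MEDIA_TYPE)
print(f"Idle hot-spare capacity per system: {total} TB")
```

With these inputs the total comes to 96TB per system, in line with the 100TB or so cited here.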

StorONE not only eliminates the high cost of slow RAID rebuilds; it also gives back capacity.


Learn More:

StorONE’s answer to the RAID rebuild challenge is its patented, high-performance, erasure coding-based vRAID, integrated into the StorONE S1: Enterprise Platform. To learn more about how StorONE’s vRAID lets you adopt today’s and tomorrow’s high-capacity drives and benefit from their lower costs without concern over slow rebuild times, register for our white paper, “Understanding vRAID.”
