As prices for Solid State Drives (SSDs) keep falling while Hard Disk Drive (HDD) prices remain comparatively stable, the question of switching to SSDs becomes relevant for more and more applications.
Especially in corporate environments, where technical choices are driven by performance and cost rather than by emotion, knowing the basics of how SSDs work is crucial for making sustainable and profitable decisions.
As always, there is no single right decision for everyone. You have to analyse your processes, your workflows and the types of data you handle to make an informed decision.
What We Are Dealing With
In our case, we deal with vast amounts of data daily, all of which has to be structured and analysed. We mostly work with data streams rather than big monolithic files, so our analysis involves many discrete read operations. What we write, apart from storing those small data packages, is mostly code, which rarely reaches large file sizes.
We don’t need extreme write speeds. What we do need are good read speeds. HDD and SSD manufacturers love luring customers with very high theoretical sequential read speeds of more than 500 MB/s. This number is largely irrelevant if, instead of a few files of >1 GB, you read thousands or tens of thousands of small files of <1 MB. To get meaningful insight into how HDDs and SSDs perform in those read operations, look at the Random Read 4KB QD1 and 4KB QD32 benchmarks instead (QD is the queue depth, the number of requests outstanding at the same time).
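Random read benchmarks are often reported in IOPS rather than MB/s. The following is a minimal sketch of how the two relate, and of what that means for a workload made up of many small reads. The function names and the IOPS values are hypothetical and chosen purely for illustration; they are not measurements of any real drive.

```python
def throughput_mb_per_s(iops: float, block_size_bytes: int = 4096) -> float:
    """Convert an IOPS figure into MB/s (1 MB = 1,000,000 bytes)."""
    return iops * block_size_bytes / 1_000_000


def seconds_to_read(num_reads: int, iops: float) -> float:
    """Time needed for num_reads random reads at a given IOPS rate."""
    return num_reads / iops


if __name__ == "__main__":
    # Hypothetical IOPS values, for illustration only.
    drives = {"drive A": 150, "drive B": 40_000}
    for name, iops in drives.items():
        mb_s = throughput_mb_per_s(iops)
        secs = seconds_to_read(50_000, iops)
        print(f"{name}: {mb_s:.2f} MB/s at 4 KB, 50,000 reads take {secs:.1f} s")
```

Plugging in the random read figures from a datasheet or from your own benchmark gives a much better feeling for a small-file workload than the sequential number on the box.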
The State of HDDs
HDDs haven’t seen major innovations in more than a decade. The latest development worth mentioning is the advent of helium-filled HDDs in the high-end server storage market. Those helium-filled HDDs are less likely to fail than air-filled HDDs, as stated in Backblaze’s Hard Drive Stats for Q3 2018¹, but conceptually they are still prone to electromechanical failure because of their moving parts and spinning disks. There is still no solution to head crashes, where the read/write head scratches the disk after a physical impact or a loss of power, a threat that can mostly only be eliminated by professional server room conditions. The life expectancy of an HDD in constant use is around four years, as documented by Backblaze². In contrast, the lifetime of an SSD depends mainly on the number of P/E Cycles it goes through.
Just for Comparison:
These are selective metrics, but they are representative of the differences between both technologies.
SanDisk claims that the Mean Time Between Failures (MTBF) for HDDs is one million hours, whereas the MTBF for SSDs is 2.1 million hours³.
Sequential read speeds on HDDs can be very good and are sufficient for most use cases. The difference between SSDs and HDDs becomes more apparent when you compare random read speeds. HDDs struggle to exceed 2 MB/s in this category, whereas SSDs already reach more than 150 MB/s easily⁴.
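MTBF values like the ones quoted above are hard to picture on their own. One common way to read them is to convert them into an annualized failure rate. The sketch below assumes a constant failure rate (an exponential lifetime model) and a drive that runs around the clock; both are simplifying assumptions of this example, not something the cited sources state.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days of continuous operation


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Probability that a drive fails within one year of continuous use.

    Assumes a constant failure rate (exponential lifetime model).
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


if __name__ == "__main__":
    # MTBF figures as quoted in the comparison above.
    for label, mtbf in [("HDD", 1_000_000), ("SSD", 2_100_000)]:
        afr = annualized_failure_rate(mtbf)
        print(f"{label}: MTBF {mtbf:,} h -> AFR {afr:.2%}")
```

Under this model the two MTBF values correspond to roughly 0.9% and 0.4% failures per year, which is easier to compare against published field statistics than millions of hours.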
Understanding How SSDs Work
To choose the right SSD for your needs, it is critical to understand that not all Flash storage is equal.
Basically, there are four different types of SSDs currently on the market: SLC (single-level cell), eMLC (enterprise multi-level cell), MLC (multi-level cell) and TLC (triple-level cell).
In general, you can say: The more bits you store in a cell, the cheaper the storage gets. But storing more bits in a cell also decreases writing speeds and wears out the SSD quicker.
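One way to see where this trade-off comes from: a cell that stores n bits has to distinguish 2^n charge levels, so every extra bit leaves less margin between levels. The sketch below uses the usual bits-per-cell values (1 for SLC, 2 for MLC and eMLC, 3 for TLC); the helper names are made up for this example.

```python
# Usual number of bits stored per cell for each flash type.
BITS_PER_CELL = {"SLC": 1, "eMLC": 2, "MLC": 2, "TLC": 3}


def charge_levels(bits: int) -> int:
    """Number of distinct charge levels a cell must be able to hold."""
    return 2 ** bits


def cells_needed(capacity_bits: int, bits: int) -> int:
    """Cells required to store capacity_bits (rounded up)."""
    return -(-capacity_bits // bits)  # ceiling division


if __name__ == "__main__":
    one_gigabit = 10 ** 9
    for name, bits in BITS_PER_CELL.items():
        print(f"{name}: {charge_levels(bits)} levels per cell, "
              f"{cells_needed(one_gigabit, bits):,} cells per gigabit")
```

Fewer cells per gigabit is what makes the storage cheaper, while more levels per cell is what makes programming slower and wear more noticeable.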
The endurance of SSDs is measured in Program/Erase (P/E) Cycles.
Writing data to flash storage is called programming. Data is typically written at the Page Level, in units of 8 KB to 16 KB, and can be read at the same granularity. Erasing, however, happens at the Block Level (4 MB to 8 MB).
This is where it gets interesting when you update data that has already been written. A page cannot simply be overwritten: its block has to be erased first, so changing a small piece of data means erasing a much bigger chunk, even if you don’t want to touch the other pages in that block. That is why data is constantly moved around into free space, and why you have to account for far more P/E Cycles than your own writes suggest. The P/E Cycles define the lifetime of the SSD because every program and erase operation physically wears down the oxide layer of the cells.
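This hidden overhead is commonly expressed as write amplification: the amount of data the flash actually programs divided by the amount the host asked to write. The sketch below works through a worst case in which updating a single page forces every other page of the block to be relocated before the erase. The page and block sizes are taken from the ranges above; the worst-case scenario itself is an assumption for illustration.

```python
def pages_per_block(block_bytes: int, page_bytes: int) -> int:
    """Number of pages that fit into one erase block."""
    return block_bytes // page_bytes


def write_amplification(host_pages: int, relocated_pages: int) -> float:
    """Flash pages programmed per page the host asked to write."""
    return (host_pages + relocated_pages) / host_pages


if __name__ == "__main__":
    KIB, MIB = 1024, 1024 * 1024
    pages = pages_per_block(4 * MIB, 16 * KIB)
    # Worst case: one page is updated, all other pages in the block are
    # still valid and must be copied elsewhere before the block is erased.
    factor = write_amplification(host_pages=1, relocated_pages=pages - 1)
    print(f"{pages} pages per block, worst-case write amplification {factor:.0f}x")
```

Real drives stay far below this worst case because the controller batches updates and picks blocks with few valid pages, but the example shows why small random updates are the expensive ones.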
The software-based Flash Translation Layer (FTL) manages the logistics of writing data to the SSD. It has three basic functions:
Write updated information to a new page and reroute all read requests to its new address.
Minimise wear by distributing write and erase operations evenly across the cells. This technique is called Wear Levelling.
Because erasing an entire block for every update would be inefficient, individual pages in a block can be marked invalid without erasing anything. The Flash Translation Layer keeps a list of those invalid pages so it can reclaim them later for reuse.
In case of damage to the SSD, the Flash Translation Layer also provides Error Detection and Error Correction to guarantee reliable data delivery, and Bad Block Re-Mapping to skip damaged blocks in the future.
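The three basic functions can be sketched as a toy model. This is not how any real controller is implemented: the class name, the tiny block sizes, the "least-erased block first" policy and the way surviving pages are buffered in memory during an erase are all simplifying assumptions of this example, and error correction and bad block re-mapping are left out entirely.

```python
FREE, VALID, INVALID = "free", "valid", "invalid"


class ToyFTL:
    """Toy translation layer: out-of-place updates, invalid-page
    tracking, block erase and a very naive form of wear levelling."""

    def __init__(self, num_blocks: int = 4, pages_per_block: int = 4):
        self.state = [[FREE] * pages_per_block for _ in range(num_blocks)]
        self.owner = [[None] * pages_per_block for _ in range(num_blocks)]
        self.erase_count = [0] * num_blocks
        self.mapping = {}  # logical page number -> (block, page)

    def _find_free(self):
        # Naive wear levelling: use the least-erased block with a free page.
        candidates = [b for b, pages in enumerate(self.state) if FREE in pages]
        if not candidates:
            return None
        block = min(candidates, key=lambda b: self.erase_count[b])
        return block, self.state[block].index(FREE)

    def _program(self, lpn, location):
        block, page = location
        self.state[block][page] = VALID
        self.owner[block][page] = lpn
        self.mapping[lpn] = location  # reads are rerouted to the new address

    def write(self, lpn: int) -> None:
        old = self.mapping.get(lpn)
        if old is not None:
            # Mark the stale copy invalid instead of erasing its block.
            self.state[old[0]][old[1]] = INVALID
        location = self._find_free()
        if location is None:
            self.garbage_collect()
            location = self._find_free()
            if location is None:
                raise RuntimeError("no free pages left")
        self._program(lpn, location)

    def garbage_collect(self) -> None:
        # Reclaim the block that holds the most invalid pages.
        victim = max(range(len(self.state)),
                     key=lambda b: self.state[b].count(INVALID))
        if INVALID not in self.state[victim]:
            return  # nothing to reclaim
        # Simplification: valid pages are buffered in memory, the block is
        # erased, and the survivors are programmed back into it.
        survivors = [lpn for lpn, s in zip(self.owner[victim], self.state[victim])
                     if s == VALID]
        size = len(self.state[victim])
        self.state[victim] = [FREE] * size
        self.owner[victim] = [None] * size
        self.erase_count[victim] += 1
        for page, lpn in enumerate(survivors):
            self._program(lpn, (victim, page))


if __name__ == "__main__":
    ftl = ToyFTL(num_blocks=2, pages_per_block=2)
    for lpn in (0, 0, 1, 2, 1):  # logical page 0 and 1 are each updated once
        ftl.write(lpn)
    print("mapping:", ftl.mapping)
    print("erase counts:", ftl.erase_count)
```

Even in this small model, five host writes to three logical pages already cost one block erase and one relocated page, which is the effect that Wear Levelling and invalid-page tracking are meant to keep under control.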
Because erasing blocks always requires free space to write to, all SSDs carry approximately 10% extra storage that you can’t access, which prevents your SSD from becoming a read-only device. This buffer is called Over-Provisioning.
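Over-provisioning is usually quoted as the hidden capacity relative to the usable capacity. A minimal sketch of that calculation follows; the capacities in the example are hypothetical and only chosen to land on the 10% figure mentioned above.

```python
def over_provisioning_ratio(physical_gb: float, usable_gb: float) -> float:
    """Hidden spare capacity as a fraction of the usable capacity."""
    return (physical_gb - usable_gb) / usable_gb


if __name__ == "__main__":
    # Hypothetical drive: 528 GB of flash, 480 GB visible to the user.
    ratio = over_provisioning_ratio(physical_gb=528, usable_gb=480)
    print(f"over-provisioning: {ratio:.0%}")
```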
Because the P/E Cycles a drive actually goes through are hidden from the user, manufacturers express life expectancy in Petabytes Written (PBW). Make sure to compare PBW per unit of capacity, though, because bigger drives inherently have bigger PBW values. For consumer SSDs you will encounter Terabytes Written (TBW) instead.
If you want to make sure an SSD is unlikely to fail under your specific load during the warranty period, consider the Drive Writes Per Day (DWPD): the number of times the drive’s full capacity can be written on every day of the warranty.
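These endurance figures are related by simple arithmetic, sketched below. The formulas follow the usual definitions (DWPD is the rated TBW divided by capacity and by the number of warranty days); the drive in the example and the daily write load are hypothetical.

```python
DAYS_PER_YEAR = 365


def tbw_per_tb(tbw: float, capacity_tb: float) -> float:
    """Endurance normalised by capacity, for comparing drives of different size."""
    return tbw / capacity_tb


def dwpd(tbw: float, capacity_tb: float, warranty_years: float) -> float:
    """Full drive writes per day that the rated TBW allows over the warranty."""
    return tbw / (capacity_tb * warranty_years * DAYS_PER_YEAR)


def within_rating(daily_writes_tb: float, tbw: float, warranty_years: float) -> bool:
    """True if the expected host writes stay below the rated TBW."""
    return daily_writes_tb * DAYS_PER_YEAR * warranty_years <= tbw


if __name__ == "__main__":
    # Hypothetical drive: 1 TB capacity, rated for 1825 TBW, 5 years warranty.
    print(f"TBW per TB: {tbw_per_tb(1825, 1):.0f}")
    print(f"DWPD: {dwpd(1825, 1, 5):.2f}")
    print("0.3 TB/day within rating:", within_rating(0.3, 1825, 5))
```

Keep in mind that these ratings count host writes; the additional internal writes caused by moving data around are already factored in by the manufacturer.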
Knowing these principles of Flash storage and your own requirements, you understand where the speed and endurance of an SSD come from, and you can tell which technology fits your needs best.
In our scenario, the read speeds and reliability of an eMLC drive are the best compromise for our needs. The extra cost of SLC drives would most likely not have paid off.
We chose not to reference exact prices in this article because prices are constantly changing. The point was not to find you the cheapest HDD or SSD, but to show you what to look for when shopping for new storage.
This article only considers SATA-based storage options for servers.