Storage performance shapes application response time and user experience. Two numbers show up in every sizing conversation – IOPS and throughput, but they answer different questions. Here’s a clear, practical take on both and how they relate to latency.
What Are IOPS and Throughput? Definitions
IOPS (Input/Output Operations per Second) is a measure of the number of discrete read and write operations a storage device completes in one second. It is a standard metric across storage types, from local devices such as DAS (SATA/SAS/NVMe) to remote protocols (iSCSI, FCP, and NVMe over TCP or RDMA).
IOPS doesn’t measure the amount of data transferred, but rather the number of operations performed in a single second. IOPS is critical for environments with high random access requirements, such as virtualization and databases. Higher IOPS generally indicates faster storage media, faster response times, and greater capacity for parallel requests.
Throughput (data transfer rate) – is a measure of the total volume of data transferred in a given unit of time (usually measured in megabytes/mebibytes or gigabytes/gibibytes per second).
If IOPS measures how many operations can be done within a second, throughput measures how much ‘real data’ can be transferred within a second. Throughput is critical for workloads with data-transfer-intensive applications like multimedia or big data.
Note: different OSes and vendors use different units for throughput. Megabytes per second (MB/s) is a decimal unit (1 MB = 1,000,000 bytes) that storage disk vendors typically quote as a public-facing figure because it is easier to understand. Microsoft Windows, however, calculates sizes using binary units (1 MiB = 1,048,576 bytes) while labeling them with decimal prefixes, which is why MB/s and MiB/s figures often seem to disagree. Because block sizes and capacities are powers of two, most benchmarking tools report throughput in MiB/s (mebibytes per second), so always check which unit a given number actually uses.
Also, the term “throughput” is used when benchmarking or describing networks. It is worth remembering that network throughput is measured in bits, while storage throughput is measured in bytes (8 bits = 1 byte). Network throughput is conventionally expressed in Gbps (gigabits per second), while storage throughput is expressed in GB/s (gigabytes per second) or GiB/s (gibibytes per second).
IOPS vs Throughput vs Latency

To provide a complete picture of storage performance, we also need to discuss latency. IOPS, throughput, and latency are not isolated measurements; they are tightly linked, and each answers a different question.
Latency is the amount of time it takes for a storage device to process a single data request (I/O) from the moment of request to the moment the response is received. It basically determines the delay between initiating an operation and receiving the result, usually measured in milliseconds (ms) or microseconds (µs).
While IOPS shows how many operations can be performed in a second, throughput tells you how much actual data can be transferred at the same time, and latency tells us how long we wait for a storage operation to be performed.
As mentioned, these metrics are directly linked: latency constrains IOPS, and IOPS together with block size determines throughput, so a change in one affects the others.
Let’s break down how they are linked:
Latency ultimately limits the number of I/O operations that can be performed. This is especially critical when storage is connected over a network or additional operations happen on the storage backend. For example, if a disk delivers 1,000 IOPS at 1 ms latency with one outstanding request at a time, and latency rises to 10 ms, IOPS drops to 100. So even if a storage device can achieve hundreds of thousands or even millions of IOPS on paper, unpredictable or high latency will cap real-world performance.
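The latency-to-IOPS relationship above is just Little's law applied to storage. A minimal sketch (the function name and numbers are illustrative, not from any particular vendor tool):

```python
# Sketch: upper bound on IOPS given latency and queue depth (Little's law).
# At QD=1, each request must fully complete before the next one starts.
def max_iops(queue_depth: int, latency_ms: float) -> float:
    """Maximum IOPS for `queue_depth` outstanding requests,
    each taking `latency_ms` milliseconds to complete."""
    return queue_depth / (latency_ms / 1000.0)

print(max_iops(1, 1.0))   # 1 ms latency, QD=1  -> 1000.0 IOPS
print(max_iops(1, 10.0))  # 10 ms latency, QD=1 ->  100.0 IOPS
print(max_iops(32, 1.0))  # deeper queue compensates -> 32000.0 IOPS
```

Note how raising queue depth recovers IOPS despite unchanged latency, which is exactly the trade-off the queue depth section below describes.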
To understand the relation between IOPS and throughput, we need to introduce another term: block size. Block size defines the size of each read or write operation performed. When we know the block size being used (for example, 4 KiB, i.e. 4,096 bytes) and we have an IOPS number (for example, 1,000), we can estimate throughput by multiplying the two. In our example, 1,000 × 4,096 bytes gives 4,096,000 bytes/second, roughly 3.9 MiB/s (often rounded to “4 MB/s”).
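The same conversion can be sketched in a couple of lines (the helper name is illustrative; treating a “4 KB” block as 4,096 bytes is an assumption that matches most filesystems):

```python
# Sketch: converting IOPS to throughput for a given block size.
# Unit trap: a "4 KB" block is usually 4096 bytes (4 KiB), not 4000.
def throughput_mib_s(iops: int, block_bytes: int) -> float:
    """Throughput in MiB/s for `iops` operations of `block_bytes` each."""
    return iops * block_bytes / (1024 ** 2)

print(throughput_mib_s(1000, 4096))       # small blocks -> ~3.9 MiB/s
print(throughput_mib_s(1000, 1024 * 1024))  # 1 MiB blocks -> 1000.0 MiB/s
```

The second line shows why a backup workload with 1 MiB blocks reaches GB/s-class throughput at the same IOPS that yields only a few MiB/s with database-sized blocks.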
However, some remarks need to be made to avoid potential confusion. While IOPS and throughput are linked and can be converted by multiplying the block size and IOPS, you also need to consider the storage device’s connection method (if it is connected via a network, the potential maximum performance might be limited by the network) and the block size itself.
While all 3 metrics (IOPS, throughput, and latency) are equally important, for a specific workload, one might have more weight than the others.
Why IOPS Matters
IOPS is a top metric for workloads where a high frequency of small I/O operations is dominant. In these high transactional workloads, the system’s ability to quickly locate and process many parallel requests is more critical than the total data transfer speed or volume. A system’s IOPS capability directly represents its responsiveness in random workload environments.
Typical environments that depend on IOPS more than throughput are databases (OLTP), general virtualization, and VDI.
Why Throughput Matters
Throughput is a top metric for workloads defined by large, sequential data transfers. These applications usually move large blocks of data, and performance is limited by the raw throughput of the storage subsystem and network.
Typical environments that rely heavily on raw throughput are video editing and streaming, big data analytics, and backups.
IOPS vs Throughput Comparison Table
With all we’ve learned about IOPS and throughput, let’s compare them face-to-face in the table below:
| Metric | Measures | Best For | Top Use Cases | Typical Unit | How it relates to Latency |
|---|---|---|---|---|---|
| IOPS | Number of read/write operations per second | Random small I/O apps | VDI, virtualization, OLTP | ops/sec (IOPS) | Lower latency enables higher IOPS |
| Throughput | Volume of data transferred per second | Sequential large transfers | Video editing, big data, backups | MiB/s or GiB/s | Large sequential I/O amortizes latency; parallelism matters more |
Factors Affecting IOPS and Throughput
Storage performance depends on various factors beyond pure IOPS or throughput measurements. Let’s discuss some of these factors:
- Physical storage media used: depending on which media is used, some types can provide more IOPS or better throughput. HDDs, while mechanically limited in randomly seeking and reading data, can still provide decent sequential throughput performance. Flash storage (SSD/NVMe) is much superior in terms of random read and write access and at the same time provides great throughput. NVMe drives have even lower latency, which directly improves IOPS and throughput.
- Storage protocol and network throughput: if storage is connected via a network, the chosen storage protocol and overall network throughput will affect performance. For example, iSCSI, while easy to implement, may limit performance because it runs over TCP/IP and involves the host CPU for additional storage processing. Also, over a 1 Gbps network the raw maximum is 125 MB/s, so after protocol overhead you can expect roughly 110 MiB/s of usable throughput. Any additional latency over the network will further reduce overall storage performance.
- RAID configuration: RAID, in general, significantly affects IOPS and throughput. While RAID-0 provides the best possible performance, it doesn’t really provide any redundancy. RAID-5/6, while providing redundancy, suffers from parity penalties on write operations, though read operations are not affected.
- Block size: block size is the amount of data transferred in a single I/O operation. Applications use different block sizes based on their nature. For example, databases usually operate on smaller blocks (like 4–8 KB) and require storage like SSDs, while backups use larger blocks (like 128 KB–1 MB) and require storage that can handle intensive sequential loads.
- Queue depth (QD): QD is a critical tuning parameter that defines the number of outstanding I/O requests kept in flight simultaneously. In short, it represents the degree of parallelism in the I/O stream. Increasing queue depth may raise IOPS at the cost of higher per-request latency.
- Workload: a workload and storage mismatch will greatly affect the storage subsystem’s ability to deliver the required performance for the application. If the application requires high random read/write access, don’t place it on HDD-based storage – invest in all-flash solutions.
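To illustrate the network ceiling mentioned in the list above, here is a rough sketch; the 10% protocol-overhead figure is an assumption for illustration, since real overhead varies with protocol, MTU, and tuning:

```python
# Sketch: rough usable storage throughput over a network link.
# The 10% overhead is an assumed ballpark for TCP/IP + iSCSI framing.
def link_ceiling_mib_s(link_gbps: float, overhead: float = 0.10) -> float:
    """Approximate usable MiB/s over a `link_gbps` network link."""
    bytes_per_s = link_gbps * 1e9 / 8          # network bits -> storage bytes
    return bytes_per_s * (1 - overhead) / (1024 ** 2)

print(round(link_ceiling_mib_s(1)))   # 1 Gbps  -> ~107 MiB/s usable
print(round(link_ceiling_mib_s(10)))  # 10 Gbps -> ~1073 MiB/s usable
```

This makes the bits-versus-bytes trap from the units section concrete: a “1 Gbps” link is nowhere near 1 GB/s of storage throughput.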
How to Measure IOPS and Throughput
There are many ways to measure IOPS, but we will focus on the main approach.
Use specialized benchmark tools for storage to generate/simulate a synthetic workload and measure performance in controlled conditions. Keep in mind that synthetic workloads and results are aimed to represent the potential of your storage subsystem, but real-world performance may differ.
- FIO (Flexible I/O tester): a very popular storage benchmark tool available for both Linux and Windows. As the name suggests, it is very flexible in terms of configurations and allows you to specify almost every aspect of the storage requests to generate accurate synthetic benchmark results.
- Vdbench: another popular and flexible storage benchmark tool that is widely used for synthetic benchmarks.
- DiskSpd: similar to FIO, but Windows-native and developed by Microsoft. If you are a Windows shop, consider this your go-to benchmark utility.
You can also measure both IOPS and throughput using integrated monitoring tools and by converting IOPS to throughput manually.
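As a concrete example, a minimal fio job for measuring random-read IOPS might look like the following sketch; the target path, queue depth, and runtime are placeholders to adapt to your environment, and the target should never be a device holding production data:

```ini
; Hypothetical fio job: 4 KiB random reads to gauge IOPS.
[global]
ioengine=libaio      ; async I/O engine on Linux
direct=1             ; bypass the page cache for raw device numbers
runtime=60
time_based

[randread-iops]
filename=/dev/sdX    ; PLACEHOLDER: a dedicated test device, not production
rw=randread
bs=4k                ; small blocks stress IOPS, not throughput
iodepth=32
numjobs=4
```

Switching `rw` to `read` and `bs` to `1m` would turn the same job into a sequential-throughput test, mirroring the block-size discussion earlier.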
How to Optimize IOPS and Throughput
Performance optimization is a hard task that usually involves verifying each level, from the very bottom (physical) to the top (logical), to find bottlenecks. That can take dozens or even hundreds of hours spent decomposing the entire stack, verifying every layer, and tuning it to the required level.
- Physical layer: ensure that the storage media you are using is capable of delivering the required random read/write access or throughput-heavy workload. With the price for NVMe continuing to drop over time, consider it as the ultimate answer to most performance needs.
- RAID: usually involves a very unpleasant trade-off. You either have better storage efficiency but lower performance (like RAID-5), or better performance but lower storage efficiency (RAID-10). Decide which one is more important to you and use the respective RAID level.
- If you are using modern RAID controllers and wish to use NVMe drives, you can try to use them in hardware RAID. However, keep in mind that overall performance will be limited by the number of PCIe lanes the storage controller has, so it might even be beneficial to switch to software RAID, trading CPU cycles and RAM for better performance.
- Caching: caching allows you to store either in memory (RAM/L1) or on disk (Flash/L2) a portion of the recently written or accessed data. By utilizing faster storage media than your underlying storage, you can increase IOPS, and depending on the cache mode, you can boost both reads and writes (Write-Back) or only reads (Write-Through).
- Automated storage tiering: if you have a mix of media, like HDD and SSD within a single storage subsystem, implementing automated storage tiering ensures that all new requests will hit the hot tier (SSD) first and then move to the cold tier (HDD) over time if the data is not frequently requested. Some solutions, like DataCore SANsymphony, allow the creation of multi-tier (>2 tiers) storage pools.
- Block size alignment: aligning the file system and hardware/software RAID block sizes with the block size that your application uses can significantly reduce overhead and improve performance.
- Networking: if you are using a network-connected storage, make sure that the network layer is not a bottleneck by running a benchmark with tools like iperf.
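The RAID trade-off described above can be quantified with the common rule-of-thumb write-penalty factors (RAID-0: 1, RAID-10: 2, RAID-5: 4, RAID-6: 6). The sketch below is a back-of-the-envelope estimate under those assumed factors, not a vendor formula:

```python
# Rule-of-thumb write penalties per RAID level (illustrative assumptions).
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}

def usable_iops(backend_iops: int, read_pct: float, level: str) -> float:
    """Front-end IOPS a RAID set can serve, given raw back-end IOPS
    and the fraction of reads in the workload mix."""
    penalty = WRITE_PENALTY[level]
    # Each front-end write costs `penalty` back-end operations.
    return backend_iops / (read_pct + penalty * (1 - read_pct))

# 10,000 raw IOPS, 70% reads: RAID-10 keeps far more usable IOPS than RAID-6.
print(round(usable_iops(10_000, 0.7, "raid10")))  # ~7692
print(round(usable_iops(10_000, 0.7, "raid6")))   # 4000
```

Running the numbers like this for your actual read/write mix is a quick way to decide whether the capacity efficiency of RAID-5/6 is worth the write-IOPS cost for a given workload.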
Use Cases: When to Prioritize Which Metric
As we mentioned above, all three metrics are equally important. However, the choice of which metric to prioritize is not about one of them being better than the other, but rather a reflection of specific workload requirements.
Databases and OLTP Systems
Databases and OLTP (Online Transactional Processing) workloads generate a high volume of small, random transactions. Such workloads depend heavily on the IOPS that the storage subsystem can generate. The more transactions your system is processing, the more IOPS you need to achieve optimal performance. It is recommended to use all-flash or all-NVMe systems for such workloads to achieve the best results.
Backup and Archival
Backup software (like Veeam or Commvault) uses large sequential data transfers to read data from the production environment and write the backup to your repository. This workload depends largely on the throughput your storage subsystem is capable of at a specific block size. It is beneficial to make sure your storage subsystem, whether it is all-HDD or all-flash, is optimized for such writes at the RAID and filesystem level.
Virtualized Environments
A generic virtualization workload involves random read/write access. In particular, virtualization environments like VDI benefit from higher IOPS numbers because of many users accessing data simultaneously. Higher IOPS also helps systems handle “boot storms.” All-flash or all-NVMe systems are recommended.
Media and Video Processing
Media and video processing involve large blocks of sequential files to stream or scrub. Such workloads benefit from higher throughput. All-NVMe storage with fast client access is recommended, especially if your organization is working with high-resolution formats such as 8K video editing and collaboration.
Conclusion
IOPS and throughput are not competing metrics but two crucial, interconnected components of storage performance. Successful storage planning requires a good understanding of every layer of the stack and the I/O characteristics of the planned workloads, rather than focusing on a single, isolated performance number.
from StarWind Blog https://ift.tt/WC83F4Y