Disk Performance Testing in Linux

Testing the performance of a disk under Windows is easy: you download and install CrystalDiskMark and hit “All”. Of course it can get more complicated than that, but if what you want is a quick and fairly accurate measurement it has you covered. Under Linux things aren’t quite as simple. Please don’t think I’m saying the tools that are available aren’t great, they are; the issue is that they are far more complex to use. The main tool you’ll want to use is “fio”, and it has a mountain of options to tweak. Its flexibility means that it can test just about any disk setup you care to think of, and it even comes with its own job scheduler so it can be somewhat automated. The downside is that it looks nearly impenetrable to a new user. What I have included below is what I think is a reasonable series of basic tests for a drive or array.
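If all you want is a single quick number before diving into job files, fio can also be driven entirely from the command line. A minimal sketch (the file name and sizes here are just placeholders, and note the caveat about test file size further down):

fio --name=quicktest --filename=test.dat --blocksize=4k --readwrite=randread --direct=1 --size=4G --runtime=30 --time_based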

As I’m allergic to work I went straight for creating a job file for fio; this saves a lot of messing about changing command line parameters. For the most part I’ll leave you to look up what the arguments mean, as they are well explained in the fio documentation. There are a couple that are quite important though, depending on what you are testing. The blocksize should match the size of the blocks on your disk if you want to find out how fast your disk can actually go (a quick way to check this is shown below). Direct should always be on, or you’ll also be measuring whatever operating system disk cache you might have. Generally you’ll want to set a runtime and make the test time based: it’s hard to predict how long a test will take, so without this a seemingly small test can end up taking hours. From the experimenting I’ve done, 60 seconds seems to be fine. The stonewall entry at the top of each job tells fio to wait for the previous jobs to finish before starting, so the tests run one at a time rather than all in parallel.
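If you’re not sure what block size your drive reports, lsblk can show the physical and logical sector sizes (here /dev/sda is just a placeholder for whatever device you’re testing):

lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sda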

The most important parameter though is “size”: this must be at least twice the size of the installed memory on the machine or you risk getting skewed results, especially if you are testing ZFS. I started testing a single drive ZFS pool with a 4GB file and was getting ridiculously high read speeds. It turned out the whole file was being cached (ZFS’s ARC will quite happily hold a 4GB file in RAM) and just handed back to me. When I bumped it up to 100GB it couldn’t be cached and I got sensible numbers.
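A quick check before picking a size: free will tell you how much memory is installed, so a machine with 32GB of RAM wants size=64G or larger.

free -h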

Filename tells fio where to place the test file. It can be a bare drive (I’ve not tested this) or a location on a filesystem. Here I’ve given it a relative path, so the disk under test is whichever one is mounted in the directory where I run the command. I typically keep the job file, “test.fio”, in the home directory of the current user, so running the test is simply a matter of switching to the correct directory and running “fio ~/test.fio”.
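As an aside, if you did want to test a bare drive I’d expect it to just be a matter of pointing filename at the device node instead, along these lines, with the obvious warning that any write test run this way will destroy whatever is on the disk:

# DESTRUCTIVE: writes straight to the device, bypassing the filesystem
filename=/dev/sdX

With that caveat out of the way, here’s the job file: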

[global]
# Use a repeatable random sequence so runs are comparable
randrepeat=1
# Linux native asynchronous IO
ioengine=libaio
# Use non-buffered IO
direct=1
filename=test.dat
# Should match the disk's block size
blocksize=4k
# Size of the test file. This needs to be at least twice the machine's RAM.
size=100G
# Limit the total runtime and make the test time-based
runtime=60
time_based
# Ignore the first few seconds while performance settles
ramp_time=4

[randwrite]
stonewall
name=randwrite
readwrite=randwrite

[randread]
stonewall
name=randread
readwrite=randread

[readwrite]
stonewall
name=readwrite
readwrite=readwrite

[seqwrite]
stonewall
name=seqwrite
readwrite=write

[seqread]
stonewall
name=seqread
readwrite=read
# Use a deeper queue so the drive can stream reads back to back
iodepth=10

Running the above jobs on a single 16TB Seagate Exos X16 produced the following mountain of output:

randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
randread: (g=1): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
readwrite: (g=2): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
seqwrite: (g=3): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
seqread: (g=4): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=10
fio-3.25
Starting 5 processes
Jobs: 1 (f=1): [_(4),R(1)][66.3%][r=232MiB/s][r=59.5k IOPS][eta 02m:43s]                          
randwrite: (groupid=0, jobs=1): err= 0: pid=1263729: Wed May 10 19:58:28 2023
  write: IOPS=200, BW=804KiB/s (823kB/s)(47.1MiB/60010msec); 0 zone resets
    slat (usec): min=5, max=363136, avg=4966.57, stdev=12260.57
    clat (nsec): min=230, max=23203, avg=4418.77, stdev=2927.67
     lat (usec): min=5, max=363145, avg=4971.37, stdev=12261.72
    clat percentiles (nsec):
     |  1.00th=[  350],  5.00th=[  470], 10.00th=[  620], 20.00th=[  988],
     | 30.00th=[ 1256], 40.00th=[ 4448], 50.00th=[ 4960], 60.00th=[ 5536],
     | 70.00th=[ 7264], 80.00th=[ 7456], 90.00th=[ 7584], 95.00th=[ 7712],
     | 99.00th=[ 7968], 99.50th=[ 8032], 99.90th=[12992], 99.95th=[16512],
     | 99.99th=[21888]
   bw (  KiB/s): min=   56, max= 2981, per=100.00%, avg=804.60, stdev=671.01, samples=120
   iops        : min=   14, max=  745, avg=201.01, stdev=167.71, samples=120
  lat (nsec)   : 250=0.02%, 500=6.34%, 750=11.33%, 1000=3.08%
  lat (usec)   : 2=16.34%, 4=0.74%, 10=61.98%, 20=0.15%, 50=0.03%
  cpu          : usr=0.28%, sys=2.03%, ctx=10810, majf=0, minf=58
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12059,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
randread: (groupid=1, jobs=1): err= 0: pid=1269654: Wed May 10 19:58:28 2023
  read: IOPS=144, BW=577KiB/s (590kB/s)(33.8MiB/60003msec)
    slat (usec): min=5, max=83132, avg=6925.13, stdev=4372.27
    clat (nsec): min=291, max=29475, avg=5839.99, stdev=2377.06
     lat (usec): min=6, max=83140, avg=6932.06, stdev=4373.81
    clat percentiles (nsec):
     |  1.00th=[  430],  5.00th=[  668], 10.00th=[ 1192], 20.00th=[ 4704],
     | 30.00th=[ 6304], 40.00th=[ 6880], 50.00th=[ 7008], 60.00th=[ 7072],
     | 70.00th=[ 7200], 80.00th=[ 7264], 90.00th=[ 7392], 95.00th=[ 7520],
     | 99.00th=[ 7712], 99.50th=[ 7840], 99.90th=[19328], 99.95th=[23168],
     | 99.99th=[29568]
   bw (  KiB/s): min=  440, max=  688, per=100.00%, avg=577.02, stdev=47.46, samples=120
   iops        : min=  110, max=  172, avg=144.13, stdev=11.85, samples=120
  lat (nsec)   : 500=1.53%, 750=7.55%, 1000=0.27%
  lat (usec)   : 2=5.40%, 4=2.98%, 10=82.07%, 20=0.13%, 50=0.08%
  cpu          : usr=0.28%, sys=1.67%, ctx=7613, majf=0, minf=58
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=8648,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
readwrite: (groupid=2, jobs=1): err= 0: pid=1270333: Wed May 10 19:58:28 2023
  read: IOPS=38.3k, BW=150MiB/s (157MB/s)(8977MiB/60001msec)
    slat (nsec): min=1001, max=303826k, avg=16360.00, stdev=1249440.84
    clat (nsec): min=160, max=208472, avg=218.44, stdev=206.46
     lat (nsec): min=1202, max=303835k, avg=16616.26, stdev=1249549.46
    clat percentiles (nsec):
     |  1.00th=[  171],  5.00th=[  181], 10.00th=[  181], 20.00th=[  181],
     | 30.00th=[  191], 40.00th=[  191], 50.00th=[  191], 60.00th=[  201],
     | 70.00th=[  211], 80.00th=[  262], 90.00th=[  290], 95.00th=[  310],
     | 99.00th=[  410], 99.50th=[  470], 99.90th=[ 1304], 99.95th=[ 1800],
     | 99.99th=[ 5408]
   bw (  KiB/s): min=21504, max=433192, per=100.00%, avg=153248.96, stdev=56801.82, samples=120
   iops        : min= 5376, max=108298, avg=38312.18, stdev=14200.46, samples=120
  write: IOPS=38.3k, BW=149MiB/s (157MB/s)(8969MiB/60001msec); 0 zone resets
    slat (usec): min=2, max=258708, avg= 8.83, stdev=193.97
    clat (nsec): min=170, max=180029, avg=234.04, stdev=151.79
     lat (usec): min=2, max=258717, avg= 9.06, stdev=174.77
    clat percentiles (nsec):
     |  1.00th=[  191],  5.00th=[  191], 10.00th=[  201], 20.00th=[  201],
     | 30.00th=[  201], 40.00th=[  211], 50.00th=[  211], 60.00th=[  221],
     | 70.00th=[  221], 80.00th=[  251], 90.00th=[  322], 95.00th=[  330],
     | 99.00th=[  382], 99.50th=[  470], 99.90th=[ 1020], 99.95th=[ 1480],
     | 99.99th=[ 3280]
   bw (  KiB/s): min=21592, max=431568, per=100.00%, avg=153122.64, stdev=56680.62, samples=120
   iops        : min= 5398, max=107892, avg=38280.61, stdev=14170.14, samples=120
  lat (nsec)   : 250=79.14%, 500=20.47%, 750=0.21%, 1000=0.05%
  lat (usec)   : 2=0.10%, 4=0.02%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%
  cpu          : usr=3.75%, sys=20.93%, ctx=136745, majf=0, minf=59
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=2298052,2296153,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seqwrite: (groupid=3, jobs=1): err= 0: pid=1288259: Wed May 10 19:58:28 2023
  write: IOPS=87.5k, BW=342MiB/s (359MB/s)(20.0GiB/60001msec); 0 zone resets
    slat (usec): min=2, max=293520, avg=10.94, stdev=511.58
    clat (nsec): min=160, max=477168, avg=225.33, stdev=271.78
     lat (usec): min=2, max=293530, avg=11.21, stdev=511.61
    clat percentiles (nsec):
     |  1.00th=[  191],  5.00th=[  191], 10.00th=[  191], 20.00th=[  201],
     | 30.00th=[  201], 40.00th=[  201], 50.00th=[  211], 60.00th=[  211],
     | 70.00th=[  211], 80.00th=[  221], 90.00th=[  290], 95.00th=[  310],
     | 99.00th=[  382], 99.50th=[  462], 99.90th=[ 1004], 99.95th=[ 1832],
     | 99.99th=[ 7200]
   bw (  KiB/s): min=29755, max=1214888, per=100.00%, avg=350335.03, stdev=314026.47, samples=120
   iops        : min= 7438, max=303722, avg=87583.66, stdev=78506.66, samples=120
  lat (nsec)   : 250=84.42%, 500=15.19%, 750=0.24%, 1000=0.04%
  lat (usec)   : 2=0.06%, 4=0.01%, 10=0.04%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%
  cpu          : usr=4.01%, sys=27.51%, ctx=71705, majf=0, minf=58
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,5252981,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
seqread: (groupid=4, jobs=1): err= 0: pid=1305652: Wed May 10 19:58:28 2023
  read: IOPS=118k, BW=462MiB/s (484MB/s)(27.0GiB/60001msec)
    slat (nsec): min=1112, max=270080k, avg=7988.04, stdev=411894.93
    clat (nsec): min=1012, max=270143k, avg=76404.40, stdev=1235291.12
     lat (usec): min=2, max=270145, avg=84.43, stdev=1301.98
    clat percentiles (usec):
     |  1.00th=[   15],  5.00th=[   15], 10.00th=[   15], 20.00th=[   15],
     | 30.00th=[   15], 40.00th=[   15], 50.00th=[   19], 60.00th=[   22],
     | 70.00th=[   26], 80.00th=[   32], 90.00th=[   45], 95.00th=[   51],
     | 99.00th=[ 2933], 99.50th=[ 3392], 99.90th=[ 4178], 99.95th=[ 4490],
     | 99.99th=[10683]
   bw (  KiB/s): min=153088, max=1861296, per=100.00%, avg=472797.26, stdev=550376.68, samples=120
   iops        : min=38272, max=465324, avg=118199.23, stdev=137594.18, samples=120
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=50.42%, 50=43.80%
  lat (usec)   : 100=4.24%, 250=0.05%, 500=0.02%, 750=0.02%, 1000=0.02%
  lat (msec)   : 2=0.07%, 4=1.23%, 10=0.13%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%
  cpu          : usr=5.22%, sys=25.92%, ctx=12033, majf=0, minf=58
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=7089348,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=10

Run status group 0 (all jobs):
  WRITE: bw=804KiB/s (823kB/s), 804KiB/s-804KiB/s (823kB/s-823kB/s), io=47.1MiB (49.4MB), run=60010-60010msec

Run status group 1 (all jobs):
   READ: bw=577KiB/s (590kB/s), 577KiB/s-577KiB/s (590kB/s-590kB/s), io=33.8MiB (35.4MB), run=60003-60003msec

Run status group 2 (all jobs):
   READ: bw=150MiB/s (157MB/s), 150MiB/s-150MiB/s (157MB/s-157MB/s), io=8977MiB (9413MB), run=60001-60001msec
  WRITE: bw=149MiB/s (157MB/s), 149MiB/s-149MiB/s (157MB/s-157MB/s), io=8969MiB (9405MB), run=60001-60001msec

Run status group 3 (all jobs):
  WRITE: bw=342MiB/s (359MB/s), 342MiB/s-342MiB/s (359MB/s-359MB/s), io=20.0GiB (21.5GB), run=60001-60001msec

Run status group 4 (all jobs):
   READ: bw=462MiB/s (484MB/s), 462MiB/s-462MiB/s (484MB/s-484MB/s), io=27.0GiB (29.0GB), run=60001-60001msec

The interesting figures are summarised at the bottom and show the speeds fio measured for the different tests. You can see, for example, that I achieved less than 1MiB/s for random reads (group 1) and writes (group 0) out of the 100GB file. Sequential reads (group 4) and writes (group 3) are much faster, in fact they are too fast: the best sequential read this drive should be able to manage is about 230MiB/s, so I suspect earlier tests might have filled the cache.
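As a sanity check, the bandwidth and IOPS figures should tie together through the block size. Taking the random write group: 804 KiB/s ÷ 4 KiB per IO ≈ 201 IOPS, which matches the avg=201.01 on the iops line in the full output, and is in the right ballpark for a single mechanical drive servicing 4k random IO.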

Once I’d created a three-disk RAIDZ1 array of 16TB Seagate Exos X16 disks I achieved the results below. I’ve only shown the summaries here to keep the page length manageable; the job file is the same. Considering the size of these figures I’m guessing there must be some caching going on even though I’m using a 100GB test file. For example, there’s no way I’m getting 1700MiB/s of reads out of a small array of spinning rust. If these figures are real I’m super happy with this array.
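If you suspect ZFS’s ARC is flattering the numbers, you can watch it while a test runs. OpenZFS ships an arc_summary tool, and the raw counters live in /proc/spl/kstat/zfs/arcstats, so something like the following (assuming OpenZFS is installed) shows the current cache size and hit/miss counts:

arc_summary | head -n 25
grep -E '^(size|hits|misses) ' /proc/spl/kstat/zfs/arcstats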

Run status group 0 (all jobs):  -- Random Writes
  WRITE: bw=68.5MiB/s (71.8MB/s), 68.5MiB/s-68.5MiB/s (71.8MB/s-71.8MB/s), io=4108MiB (4307MB), run=60001-60001msec

Run status group 1 (all jobs):  -- Random Reads
   READ: bw=236MiB/s (247MB/s), 236MiB/s-236MiB/s (247MB/s-247MB/s), io=13.8GiB (14.8GB), run=60001-60001msec

Run status group 2 (all jobs):  -- Reads and Writes
   READ: bw=705MiB/s (740MB/s), 705MiB/s-705MiB/s (740MB/s-740MB/s), io=41.3GiB (44.4GB), run=60001-60001msec
  WRITE: bw=706MiB/s (740MB/s), 706MiB/s-706MiB/s (740MB/s-740MB/s), io=41.4GiB (44.4GB), run=60001-60001msec

Run status group 3 (all jobs):  -- Sequential Writes
  WRITE: bw=1061MiB/s (1112MB/s), 1061MiB/s-1061MiB/s (1112MB/s-1112MB/s), io=62.2GiB (66.7GB), run=60001-60001msec

Run status group 4 (all jobs):  -- Sequential Reads
   READ: bw=1709MiB/s (1792MB/s), 1709MiB/s-1709MiB/s (1792MB/s-1792MB/s), io=100GiB (108GB), run=60001-60001msec