Intel NVMe drive performance degradation with XFS

Posted 2019-06-11 14:09

Question:

I am working with an Intel NVMe card on Linux (Ubuntu 14.04). I see a performance degradation when the card is formatted with the XFS file system using its default sector size (512), or any other sector size smaller than 4096.

In the experiment I formatted the card with an XFS file system using the default options, then ran fio with a 64k block size on an arm64 platform with a 64k page size. This is the command I used:

fio --rw=randread --bs=64k --ioengine=libaio --iodepth=8 --direct=1 --group_reporting --name=Write_64k_1 --numjobs=1 --runtime=120 --filename=new --size=20G

These are the best values I could get:

Run status group 0 (all jobs):
   READ: io=20480MB, aggrb=281670KB/s, minb=281670KB/s, maxb=281670KB/s, mint=74454msec, maxt=74454msec

Disk stats (read/write):
  nvme0n1: ios=326821/8, merge=0/0, ticks=582640/0, in_queue=582370, util=99.93%

I then tried formatting with a 4096-byte sector size:

mkfs.xfs -f -s size=4096 /dev/nvme0n1

After that, the values were:

Run status group 0 (all jobs):
   READ: io=20480MB, aggrb=781149KB/s, minb=781149KB/s, maxb=781149KB/s, mint=26847msec, maxt=26847msec

Disk stats (read/write):
  nvme0n1: ios=326748/7, merge=0/0, ticks=200270/0, in_queue=200350, util=99.51%
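As a quick sanity check on the two fio summaries, the sketch below recomputes the run time from the reported io and aggrb figures and derives the speedup. It assumes that the MB and KB units in this fio output are 1024-based; the helper name runtime_msec is made up for illustration.

```python
def runtime_msec(io_mb: int, aggrb_kb_s: int) -> int:
    """Milliseconds needed to transfer io_mb (1024-based MB) at aggrb_kb_s KB/s."""
    return io_mb * 1024 * 1000 // aggrb_kb_s


# Figures copied from the two fio runs in the question.
default_run = runtime_msec(20480, 281670)   # XFS with default sector size
sector4k_run = runtime_msec(20480, 781149)  # XFS with -s size=4096
speedup = 781149 / 281670

print(default_run)        # 74454, matching the reported maxt
print(sector4k_run)       # 26847, matching the reported maxt
print(round(speedup, 2))  # 2.77
```

Under that assumption both results agree with the maxt values fio printed, so the 4096-byte sector size gives roughly 2.8 times the random-read throughput in this test.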

I see no performance degradation with any of the following:

  • A 4k page size
  • Any fio block size smaller than 64k
  • ext4 with its default configuration

What could be the issue? Is this an alignment issue? What am I missing here? Any help is appreciated.

Answer 1:

The issue is that your SSD's native sector size is 4K. Your file system's block size should be set to match, so that reads and writes are aligned on sector boundaries. Otherwise you will have blocks that span two sectors, and returning one such block takes two sector reads instead of one.
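To make the alignment argument concrete, here is a minimal sketch that counts how many 4K physical sectors a byte range overlaps. The function name and the example offsets are hypothetical; it illustrates the reasoning above and is not a measurement of what XFS actually issues to the drive.

```python
def sectors_touched(offset: int, length: int, sector_size: int = 4096) -> int:
    """Count the physical sectors overlapped by the bytes [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // sector_size
    last = (offset + length - 1) // sector_size
    return last - first + 1


aligned = sectors_touched(0, 64 * 1024)       # 64k request starting on a 4K boundary
misaligned = sectors_touched(512, 64 * 1024)  # same request shifted by 512 bytes

print(aligned)     # 16
print(misaligned)  # 17
```

The shifted request overlaps one extra sector, and both its first and last sectors are only partly used, which is the kind of overhead the answer describes.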

If you have an Intel SSD, the newer ones have a variable sector size that you can set with the Intel Solid-State Drive Data Center Tool. Honestly, though, 4096 is probably still the drive's true sector size, and you will get the most consistent performance by using it and setting your file system to match.
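Before changing anything, it may help to see which sector sizes the kernel currently reports for the drive. The sketch below reads the logical_block_size and physical_block_size entries that Linux exposes under /sys/block/<device>/queue. The function name and the sysfs_root parameter are illustrative, and the device name nvme0n1 is taken from the fio output in the question. The reported values reflect how the drive is currently formatted and may not reveal its internal sector size.

```python
from pathlib import Path


def read_block_sizes(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the logical and physical block sizes (in bytes) reported for a device."""
    queue = Path(sysfs_root) / device / "queue"
    return {
        name: int((queue / name).read_text().strip())
        for name in ("logical_block_size", "physical_block_size")
    }


if __name__ == "__main__":
    try:
        print(read_block_sizes("nvme0n1"))
    except OSError:
        # The device is absent on this machine, or sysfs is not readable.
        print("could not read block sizes for nvme0n1")
```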

On ZFS on Linux, the corresponding setting is ashift=12 for 4K sectors.
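The ashift value is the base-2 logarithm of the sector size, which is why 4K corresponds to 12. A small sketch of that arithmetic follows; the helper name ashift_for is made up for illustration.

```python
def ashift_for(sector_size: int) -> int:
    """Return log2(sector_size); the size must be a positive power of two."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_size.bit_length() - 1


print(ashift_for(4096))  # 12
print(ashift_for(512))   # 9
```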