https://access.redhat.com/solutions/43861
What are the kernel parameters related to the maximum size of physical I/O requests?
SOLUTION Verified - Updated August 22, 2018 at 02:23
Environment
- Red Hat Enterprise Linux 7
- Red Hat Enterprise Linux 6
- Red Hat Enterprise Linux 5
Issue
- What are the kernel parameters related to the maximum size of physical I/O requests?
- How do I adjust the maximum I/O transfer size on RHEL?
- How do I set the maximum block device I/O transfer size on RHEL?
- How to set custom values for the /sys/block/<device>/queue/max_sectors_kb parameters?
- Default max_sectors_kb setting
- Do you have a proper way (reboot persistent, applies to current disks (devices), applies to disks (devices) that are added before next reboot) to change the default max_sectors_kb?
Resolution
Linux exposes two values in the /sys file system that relate to this:
- max_hw_sectors_kb (read-only): This is the maximum number of kilobytes supported in a single data transfer by the underlying device. It is set by the driver to reflect the driver/hardware limit. The block layer also enforces this limit: it takes the minimum of max_hw_sectors_kb and the kernel default block limit (512 KB in RHEL 4/5) so that all I/O requests stay within the size that the hardware/driver can support.
- max_sectors_kb (read/write): This is the maximum number of kilobytes that the block layer will allow for a filesystem request. This value can be overwritten, but it must be smaller than or equal to the maximum size allowed by the hardware. The default kernel value on RHEL 4/5 is 512 KB.
To adjust or set the maximum transfer size, set the device's max_sectors_kb to the desired value. That value must not exceed the hardware limit reported in max_hw_sectors_kb. For example, say a storage vendor recommends limiting the I/O transfer size to their storage to 128 KB or less per request. You would then run the following command for each such storage device:
$ echo 128 > /sys/block/sdz/queue/max_sectors_kb
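Because max_sectors_kb must not exceed max_hw_sectors_kb, it can help to check the hardware limit before writing. The following is a minimal sketch, not part of the original solution; the function name set_max_sectors_kb is hypothetical, and it takes the queue directory as an argument so that it can be tried against a scratch directory first.

```shell
# set_max_sectors_kb QUEUE_DIR KB   (hypothetical helper name)
# Writes KB to QUEUE_DIR/max_sectors_kb, but only if KB does not exceed
# the read-only hardware limit found in QUEUE_DIR/max_hw_sectors_kb.
set_max_sectors_kb() {
    local queue_dir kb hw_limit
    queue_dir=$1
    kb=$2
    hw_limit=$(cat "$queue_dir/max_hw_sectors_kb") || return 1
    if [ "$kb" -gt "$hw_limit" ]; then
        echo "refusing: $kb KB exceeds hardware limit of $hw_limit KB" >&2
        return 1
    fi
    echo "$kb" > "$queue_dir/max_sectors_kb"
}

# Example, run as root (sdz is the device name used above):
#   set_max_sectors_kb /sys/block/sdz/queue 128
```

Writing to files under /sys requires root privileges.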
Making max_sectors_kb tuning persistent
Changes to /sys entries do not persist across reboots.
An easy way of making a tuned max_sectors_kb setting persistent is to place the tuning command(s) in /etc/rc.d/rc.local. This will only work for storage devices that are present during system boot.
To make tuned max_sectors_kb settings persistent for dynamically added storage devices, they can be incorporated in udev rules, which are executed when the device is added. For example, to set a 128 KB limit for HP EVA storage, use a udev rule such as:
ACTION=="add|change", SUBSYSTEM=="block", ENV{ID_VENDOR}=="HP", ENV{ID_MODEL}=="HSV*", RUN+="/bin/sh -c '/bin/echo 128 > /sys%p/queue/max_sectors_kb'"
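One possible way to put the rule above in place is sketched here. The file name 99-max-sectors-kb.rules and the function name install_max_sectors_rule are hypothetical choices, not from the original solution, and the udevadm commands in the comments assume a release that ships udevadm (RHEL 6 and 7 do; RHEL 5 uses older tools).

```shell
# install_max_sectors_rule [RULES_DIR]   (hypothetical helper name)
# Writes the udev rule shown above into a rules file. The file name is
# an assumption; any *.rules name in the rules directory should work.
install_max_sectors_rule() {
    local rules_dir
    rules_dir=${1:-/etc/udev/rules.d}
    cat > "$rules_dir/99-max-sectors-kb.rules" <<'EOF'
ACTION=="add|change", SUBSYSTEM=="block", ENV{ID_VENDOR}=="HP", ENV{ID_MODEL}=="HSV*", RUN+="/bin/sh -c '/bin/echo 128 > /sys%p/queue/max_sectors_kb'"
EOF
}

# As root, on a system that provides udevadm:
#   install_max_sectors_rule
#   udevadm control --reload-rules
#   udevadm trigger --subsystem-match=block --action=change
```

The trigger step replays a change event for existing block devices so that the rule also applies to disks that are already present.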
Note that the request sizes described above can be achieved with RHEL native filesystems such as ext4 and xfs. If the expected request size cannot be achieved even after max_sectors_kb is changed for the device, check whether the device is formatted with a third-party filesystem. If so, contact the vendor of that filesystem for support.
Refer to "What is udev and how do you write custom udev rules?" for more background on udev.
Diagnostic Steps
Example of a normal internal SATA drive:
$ cat /sys/block/sda/queue/max_sectors_kb
512
$ cat /sys/block/sda/queue/max_hw_sectors_kb
32767
Example of a USB drive, which has a lower hardware limit than the block layer default:
$ cat /sys/block/sdb/queue/max_sectors_kb
120
$ cat /sys/block/sdb/queue/max_hw_sectors_kb
120
In both cases, max_sectors_kb has been set to the minimum of the block layer default and the hardware/driver value.
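To compare the two values for every disk at once, a small loop over /sys/block can be used. This is a sketch only; the function name report_io_limits is hypothetical, and the directory argument exists so the function can be pointed at a test tree instead of the live /sys.

```shell
# report_io_limits [SYSFS_BLOCK_DIR]   (hypothetical helper name)
# Prints "device max_sectors_kb max_hw_sectors_kb" for every block
# device that exposes a readable queue directory.
report_io_limits() {
    local block_dir q dev
    block_dir=${1:-/sys/block}
    for q in "$block_dir"/*/queue; do
        [ -r "$q/max_sectors_kb" ] || continue
        dev=$(basename "$(dirname "$q")")
        echo "$dev $(cat "$q/max_sectors_kb") $(cat "$q/max_hw_sectors_kb")"
    done
}

# Example:
#   report_io_limits
```

For the two drives shown above, this would print one line for sda with 512 and 32767, and one line for sdb with 120 and 120.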
We can see examples of two different transfer sizes in action using simple dd and iostat commands.
$ echo 512 > /sys/block/sda/queue/max_sectors_kb
$ dd if=/dev/sda5 of=/dev/null iflag=direct bs=4M count=250 &
Output from iostat -tkx 1 shows that the average request size (avgrq-sz) never exceeds 1024 sectors, or 512 KB, as expected.
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda5 0.00 0.00 109.00 0.00 55808.00 0.00 1024.00 3.39 29.36 6.46 70.40
sda5 0.00 0.00 181.19 0.00 92768.32 0.00 1024.00 4.50 25.42 5.42 98.22
sda5 0.00 0.00 195.00 0.00 99840.00 0.00 1024.00 4.56 23.57 5.13 100.00
Note: the dd command uses iflag=direct to force 4 MB I/O requests into the kernel block/scheduler layer, where each request is split down to the maximum transfer size allowed by the target device. Because each large I/O request is broken into several smaller ones, the average queue size (avgqu-sz) is greater than 1.
$ echo 128 > /sys/block/sda/queue/max_sectors_kb
$ dd if=/dev/sda5 of=/dev/null iflag=direct bs=4M count=250 &
Output from iostat -tkx 1 now shows that the average request size (avgrq-sz) never exceeds 256 sectors, or 128 KB, as expected.
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda5 0.00 0.00 169.00 0.00 21632.00 0.00 256.00 5.74 31.81 1.96 33.20
sda5 0.00 0.00 539.60 0.00 69069.31 0.00 256.00 16.49 30.58 1.83 98.61
sda5 0.00 0.00 587.00 0.00 75136.00 0.00 256.00 15.67 26.39 1.70 99.60
sda5 0.00 0.00 495.00 0.00 63360.00 0.00 256.00 16.25 33.66 2.03 100.40
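The request sizes in both runs can be cross-checked from the other iostat columns: dividing rkB/s by r/s gives the average KB per read request, and avgrq-sz converts to KB at 512 bytes per sector. The helper names below are hypothetical; the input numbers are taken from the outputs above.

```shell
# kb_per_request RKB_PER_SEC READS_PER_SEC   (hypothetical helper name)
# Average KB per read request, from the rkB/s and r/s columns.
kb_per_request() {
    awk -v kb="$1" -v r="$2" 'BEGIN {
        if (r == 0) { print "n/a"; exit }   # idle interval, no requests
        printf "%.0f\n", kb / r
    }'
}

# sectors_to_kb SECTORS   (hypothetical helper name)
# Converts a count of 512-byte sectors to KB.
sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

kb_per_request 55808.00 109.00   # first run  -> 512
kb_per_request 21632.00 169.00   # second run -> 128
sectors_to_kb 1024               # -> 512
sectors_to_kb 256                # -> 128
```

Both calculations agree with the max_sectors_kb value that was set before each dd run.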