Make sure the hcheck_interval and hcheck_mode attributes are set correctly on each disk.
For example, these are the default settings for hdisk0:
# lsattr -El hdisk0
PCM             PCM/friend/vscsi                 Path Control Module        False
algorithm       fail_over                        Algorithm                  True
hcheck_cmd      test_unit_rdy                    Health Check Command       True
hcheck_interval 0                                Health Check Interval      True
hcheck_mode     nonactive                        Health Check Mode          True
max_transfer    0x40000                          Maximum TRANSFER Size      True
pvid            00cd1e7cb226343b0000000000000000 Physical volume identifier False
queue_depth     3                                Queue DEPTH                True
reserve_policy  no_reserve                       Reserve Policy             True
IBM recommends a value of 60 for hcheck_interval, and hcheck_mode should be set to "nonactive".
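To check a disk against these recommended values without reading the whole listing, the lsattr output can be filtered. The following is only a sketch: check_hcheck is a hypothetical helper name, and it assumes the attribute name is in the first column and its value in the second, as in the listing above.

```shell
#!/bin/sh
# check_hcheck (hypothetical helper): reads "lsattr -El <disk>" output on stdin
# and reports whether hcheck_interval and hcheck_mode match the recommended
# values (60 and nonactive).
check_hcheck() {
    awk '
        $1 == "hcheck_interval" { interval = $2 }
        $1 == "hcheck_mode"     { mode = $2 }
        END {
            if (interval == "60" && mode == "nonactive")
                print "ok"
            else
                printf "needs change: hcheck_interval=%s hcheck_mode=%s\n", interval, mode
        }'
}

# Intended use on the AIX host (not run here):
#   lsattr -El hdisk0 | check_hcheck
```

With the default hdisk0 settings shown above, the helper reports that hcheck_interval still needs to be changed.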
To change these values (if necessary):
# chdev -l hdisk0 -a hcheck_interval=60 -P
# chdev -l hdisk0 -a hcheck_mode=nonactive -P
Because the -P flag only records the change in the ODM, you need to reboot for automatic path recovery to take effect.
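If there are several disks, the same chdev call can be generated for each of them. This is a rough sketch, not a tested procedure: gen_hcheck_cmds is a hypothetical helper that only prints the commands so they can be reviewed before running, and it assumes every disk name fed to it is a disk you actually want to change.

```shell
#!/bin/sh
# gen_hcheck_cmds (hypothetical helper): reads disk names on stdin, one per
# line, and prints (does not run) a chdev command that sets both attributes.
gen_hcheck_cmds() {
    while read -r disk; do
        # Skip empty lines
        [ -n "$disk" ] || continue
        echo "chdev -l $disk -a hcheck_interval=60 -a hcheck_mode=nonactive -P"
    done
}

# Intended use on the AIX host (not run here); review the output, then pipe
# it to sh once you are happy with it:
#   lsdev -Cc disk -F name | gen_hcheck_cmds
```

Printing the commands instead of executing them keeps the step reversible: nothing changes until you deliberately run the output.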
If you did not set hcheck_interval and hcheck_mode as described above, or did not reboot, then after a path failure you would still see the following even once the path is back online:
# lspath
Enabled hdisk0 vscsi0
Failed hdisk0 vscsi1
To fix this, disable and then re-enable the failed path:
# chpath -l hdisk0 -p vscsi1 -s disable
# chpath -l hdisk0 -p vscsi1 -s enable
Now check the status again. Both paths should show as Enabled:
# lspath
Enabled hdisk0 vscsi0
Enabled hdisk0 vscsi1
The same fix as a one-liner:
# chpath -l hdisk0 -p vscsi1 -s disable; chpath -l hdisk0 -p vscsi1 -s enable
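When more than one path has failed, the disable/enable pair can be derived from the lspath output instead of typed by hand. The sketch below assumes the three-column lspath format shown above (status, disk, parent adapter); recover_cmds is a hypothetical helper and only prints the chpath commands.

```shell
#!/bin/sh
# recover_cmds (hypothetical helper): reads lspath output on stdin and prints
# (does not run) a disable/enable pair for every path whose status is Failed.
recover_cmds() {
    awk '$1 == "Failed" {
        printf "chpath -l %s -p %s -s disable\n", $2, $3
        printf "chpath -l %s -p %s -s enable\n",  $2, $3
    }'
}

# Intended use on the AIX host (not run here):
#   lspath | recover_cmds
```

For the failed-path listing above, this prints the same two chpath commands for hdisk0 and vscsi1, and prints nothing when all paths are Enabled.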
3 comments:
Many thanks! This helped me out of a bind!
excellent post - very helpful.
Hi, were the above settings made on the VIO or in the LPAR?