Monday, August 18, 2008

How to recover failed MPIO paths from an IBM VIO server on an AIX LPAR

If an AIX LPAR gets its disks from 2 VIO servers using MPIO, then you need to change some attributes of the hdisks on that LPAR.

You must make sure the hcheck_interval and hcheck_mode are set correctly:

Example for default hdisk0 settings:

# lsattr -El hdisk0
PCM             PCM/friend/vscsi                 Path Control Module        False
algorithm       fail_over                        Algorithm                  True
hcheck_cmd      test_unit_rdy                    Health Check Command       True
hcheck_interval 0                                Health Check Interval      True
hcheck_mode     nonactive                        Health Check Mode          True
max_transfer    0x40000                          Maximum TRANSFER Size      True
pvid            00cd1e7cb226343b0000000000000000 Physical volume identifier False
queue_depth     3                                Queue DEPTH                True
reserve_policy  no_reserve                       Reserve Policy             True


IBM recommends a value of 60 for hcheck_interval, and hcheck_mode should be set to "nonactive".
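To compare a disk against these recommended values without reading the whole attribute list, something like the following might help. It is only a sketch: check_hcheck is a made-up helper name, and it assumes lsattr output in the layout shown above (attribute name in the first column, value in the second).

```shell
# Sketch: read "lsattr -El <hdisk>" output on stdin and report whether
# hcheck_interval and hcheck_mode match the recommended values (60, nonactive).
# check_hcheck is a hypothetical helper name, not an AIX command.
check_hcheck() {
    awk '
        $1 == "hcheck_interval" { interval = $2 }
        $1 == "hcheck_mode"     { mode = $2 }
        END {
            if (interval == "60" && mode == "nonactive")
                print "ok"
            else
                print "needs change: hcheck_interval=" interval " hcheck_mode=" mode
        }
    '
}

# On the AIX LPAR you would feed it the real output:
#   lsattr -El hdisk0 | check_hcheck
```

With the default hdisk0 settings listed above, this would report that hcheck_interval still needs to be changed.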

To change these values (if necessary):

# chdev -l hdisk0 -a hcheck_interval=60 -P

# chdev -l hdisk0 -a hcheck_mode=nonactive -P
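If the LPAR has more than one disk, the same change has to be made on each hdisk. A rough sketch of how that could be scripted is below. gen_hcheck_cmds is a hypothetical helper name, and the sketch assumes that every disk name fed to it is an MPIO vscsi disk that should get these settings. It only prints the chdev commands, so they can be reviewed before anything is changed.

```shell
# Sketch: read disk names (one per line) on stdin and print, but do not run,
# a chdev command that sets both attributes for each disk.
# gen_hcheck_cmds is a hypothetical helper name, not an AIX command.
gen_hcheck_cmds() {
    while read -r disk; do
        # Skip empty lines
        [ -n "$disk" ] || continue
        echo "chdev -l $disk -a hcheck_interval=60 -a hcheck_mode=nonactive -P"
    done
}

# On the AIX LPAR, list the disks and review the generated commands:
#   lsdev -Cc disk -F name | gen_hcheck_cmds
# Once they look right, pipe the same output to sh to apply them:
#   lsdev -Cc disk -F name | gen_hcheck_cmds | sh
```

Printing the commands first is a deliberate choice here, since a wrong disk list would otherwise change attributes on disks that were never meant to be touched.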

The -P flag only records the change in the ODM, so you need to reboot the LPAR before the new values, and with them automatic path recovery, take effect.

If you did not set hcheck_interval and hcheck_mode as described above, or did not reboot, then after a path failure you would see the following even after the path is back online:

# lspath
Enabled hdisk0 vscsi0
Failed hdisk0 vscsi1


To fix this, you would need to execute the following commands:

# chpath -l hdisk0 -p vscsi1 -s disable

# chpath -l hdisk0 -p vscsi1 -s enable

Now, check the status again:

# lspath
Enabled hdisk0 vscsi0
Enabled hdisk0 vscsi1

Example, with both commands on one line:

# chpath -l hdisk0 -p vscsi1 -s disable; chpath -l hdisk0 -p vscsi1 -s enable
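When several paths have failed, typing the disable/enable pair for each one gets tedious. The sketch below shows one way the pairs could be generated from the lspath output. gen_recover_cmds is a hypothetical helper name, and it assumes the three-column lspath layout shown above (status, hdisk, parent adapter). Like the manual fix, it only makes sense once the failed path is actually reachable again.

```shell
# Sketch: read "lspath" output on stdin and print, but do not run, a
# disable/enable pair of chpath commands for every path in the Failed state.
# gen_recover_cmds is a hypothetical helper name, not an AIX command.
gen_recover_cmds() {
    awk '$1 == "Failed" {
        print "chpath -l " $2 " -p " $3 " -s disable"
        print "chpath -l " $2 " -p " $3 " -s enable"
    }'
}

# On the AIX LPAR, review the generated commands first:
#   lspath | gen_recover_cmds
# Then apply them and check the result:
#   lspath | gen_recover_cmds | sh
#   lspath
```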

3 comments:

Anonymous said...

Many thanks! This helped me out of a bind!

Anonymous said...

excellent post - very helpful.

Siva said...

Hi, were the above settings done on the VIO server or on the LPAR?