IBM V3700-V5000-v7000 disk replacement

Procedure 1

In this example, drive ID 6 (slot 2, enclosure 1) is offline and needs to be reseated.

Please follow these steps to reseat the drive and capture a drive dump:

  • Mark the drive that requires a reseat as failed (even if this drive is already in 'failed' state) via chdrive -use failed 6 (NOTE: check that there is sufficient redundancy in the array before doing this).
  • Mark the failed drive as unused via the GUI or via chdrive -use unused 6. This logically removes the drive from the array and means it will not automatically add back in after being reseated.
  • Reseat the drive, or perform a virtual power cycle of the drive slot by running the command chenclosureslot -slot 2 -power cycle 1. Wait until the drive is online (this can take up to 5 minutes).
  • If the drive comes back with error 1674 “Drive offline because it is locked by a missing encryption key”, secure erase the drive via the GUI or via the CLI with chdrive -task erase 6. Wait until the drive is online (this can take up to 5 minutes).
  • Trigger at least two drive dumps after the power cycle with triggerdrivedump 6. This ensures that older data is de-staged from the FCM internal memory so that newer data can follow in the next drive dump.
  • Upload a SNAP option 1 (this also includes the drive dumps) for IBM support review. Wait until IBM support has completed the drive dump analysis before integrating the drive back into the array.
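
The steps above can be collected into an ordered command list. The following is a minimal sketch, not an IBM tool: the function name reseat_commands and its parameters are made up for illustration, and it only builds the command strings quoted in the steps so they can be reviewed before being typed into the CLI. The conditional secure erase (error 1674) and the SNAP upload are left out because they depend on what the operator sees.

```python
from typing import List


def reseat_commands(drive_id: int, slot_id: int, enclosure_id: int,
                    dumps: int = 2) -> List[str]:
    """Build the CLI commands of Procedure 1 in the order they are run.

    Hypothetical helper: it only formats strings, it does not contact
    the storage system.
    """
    if dumps < 2:
        # The procedure asks for at least two drive dumps.
        raise ValueError("at least two drive dumps are required")
    commands = [
        f"chdrive -use failed {drive_id}",   # mark the drive as failed
        f"chdrive -use unused {drive_id}",   # remove it logically from the array
        # virtual power cycle of the slot (alternative to a physical reseat)
        f"chenclosureslot -slot {slot_id} -power cycle {enclosure_id}",
    ]
    # Drive dumps are triggered only after the drive is back online.
    commands += [f"triggerdrivedump {drive_id}"] * dumps
    return commands


if __name__ == "__main__":
    for command in reseat_commands(drive_id=6, slot_id=2, enclosure_id=1):
        print(command)
```

The list is meant to be run one command at a time, waiting for the drive to come online after the power cycle and checking array redundancy before the first command.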

Procedure 2

From the V7000's command line:

1- Log in to the management node as the “root” user

2- Log in to the V7000's command line

  ssh superuser@<IP> 

Now you will be executing steps 1 through 6 of the instructions above

3- Identify and list the disk to replace. ID=54

3.1 - Identify the drive physically in the enclosure. The command below lights the LED on the slot so you can see exactly which drive to pull.

IBM_Storwize:V7_00_1:superuser> chenclosureslot -identify yes -slot 17 3

3.2 List the drive

IBM_Storwize:V7_00_1:superuser> svcinfo lsdrive 54
    id 54
    status offline
    error_sequence_number 507
    use failed
    UID 5000cca01638b2e0
    tech_type sas_hdd
    capacity 837.9GB
    block_size 512
    vendor_id IBM-207x
    product_id HUC109090CSS60
    FRU_part_number 00Y2684
    FRU_identity 11S49Y7449YXXXKPH05N6F
    RPM 10000
    firmware_level J2E6
    FPGA_level
    mdisk_id
    mdisk_name
    member_id
    enclosure_id 3
    slot_id 17
    node_id
    node_name
    quorum_id
    port_1_status offline
    port_2_status offline
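
The detailed lsdrive view above is a list of "name value" lines, so it is easy to check from a script. The following is a rough sketch under that assumption. The helper names parse_lsdrive_detail and is_offline_and_failed are invented for this page, and the sample text is a shortened copy of the output above.

```python
from typing import Dict


def parse_lsdrive_detail(text: str) -> Dict[str, str]:
    """Turn the detailed lsdrive view into a dict of field -> value.

    Fields without a value (for example mdisk_id on a failed drive)
    are stored as empty strings.
    """
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if not parts:
            continue  # skip blank lines
        fields[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return fields


def is_offline_and_failed(fields: Dict[str, str]) -> bool:
    """True when the drive shows status offline and use failed."""
    return fields.get("status") == "offline" and fields.get("use") == "failed"


SAMPLE = """
    id 54
    status offline
    use failed
    capacity 837.9GB
    mdisk_id
    enclosure_id 3
    slot_id 17
"""

if __name__ == "__main__":
    drive = parse_lsdrive_detail(SAMPLE)
    print(drive["id"], drive["enclosure_id"], drive["slot_id"],
          is_offline_and_failed(drive))
```

The enclosure_id and slot_id fields are the values passed to chenclosureslot in step 3.1.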

3.3 Check whether any volumes depend on the disk. This is not one of the listed steps, but it is good practice.

IBM_Storwize:V7_00_1:superuser> lsdependentvdisks -drive 54
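
The dependency check can be scripted as well. The sketch below assumes that lsdependentvdisks prints a header line starting with vdisk_id followed by one line per dependent volume, and that an empty listing means no volume depends on the drive. Verify that layout on your code level before relying on it. The function name and the sample rows are hypothetical.

```python
from typing import List


def dependent_vdisks(output: str) -> List[str]:
    """Return the volume names found in lsdependentvdisks output.

    Assumed layout (not confirmed by this page): a header line
    'vdisk_id vdisk_name', then one whitespace-separated row per volume.
    """
    names: List[str] = []
    for line in output.splitlines():
        columns = line.split()
        if len(columns) < 2 or columns[0] == "vdisk_id":
            continue  # skip blank lines and the header
        names.append(columns[1])
    return names


# Hypothetical sample data, for illustration only.
SAMPLE_WITH_DEPENDENCIES = """vdisk_id vdisk_name
0        vdisk0
1        vdisk1
"""

if __name__ == "__main__":
    names = dependent_vdisks(SAMPLE_WITH_DEPENDENCIES)
    if names:
        print("Do not pull the drive yet, dependent volumes:", ", ".join(names))
    else:
        print("No dependent volumes reported.")
```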

Note: The disk is identified by its ID, enclosure and slot.

4- Take the drive offline. In this example the drive is already offline, but the command is shown anyway.

IBM_Storwize:V7_00_1:superuser> chdrive -use failed 54

4.1 List the drive again to verify that the drive is offline (repeat step 3.2)

5- Mark the drive unused

IBM_Storwize:V7_00_1:superuser> chdrive -use unused 54

6- Now you can replace the disk. Pull the old disk, wait 2 to 5 minutes, then insert the new one.

  Note: Contact the SSR to verify that they are looking at the correct V7000 and can see the drive with the LED on. The SSR will ask for the MTM and Serial Number of the enclosure and the rack location.
  See the following URL for instructions for physically replacing a drive:
  V7000 Gen1:
  https://www.ibm.com/support/knowledgecenter/en/ST3FR7_7.6.0/com.ibm.storwize.v7000.760.doc/tbrd_rmv24carrier_1948dx.html
  V7000 Gen2:
  https://www.ibm.com/support/knowledgecenter/en/ST3FR7_7.6.0/com.ibm.storwize.v7000.760.doc/fab1_rplc_25_drv_assembly.html

7- The drive will come back as online / unused.

IBM_Storwize:V7_00_1:superuser> svcinfo lsdrive
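
Waiting for the new drive to show online / unused can be done with a small polling loop. The following is a sketch only. It assumes the caller supplies a fetch function that returns the detailed output of lsdrive for the drive ID (the view shown in step 3.2, not the table printed by lsdrive without an ID). The function names, the 15-second interval and the 5-minute timeout (taken from Procedure 1) are illustrative choices.

```python
import time
from typing import Callable, Dict


def parse_lsdrive_detail(text: str) -> Dict[str, str]:
    """Parse 'name value' lines of the detailed lsdrive view into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if parts:
            fields[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return fields


def wait_for_drive(fetch: Callable[[], str],
                   want_use: str = "unused",
                   timeout_s: float = 300.0,
                   interval_s: float = 15.0,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until the drive is online with the wanted use, or time out.

    fetch is a caller-supplied (hypothetical) function returning the
    detailed lsdrive text. Returns True on success, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        fields = parse_lsdrive_detail(fetch())
        if fields.get("status") == "online" and fields.get("use") == want_use:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated outputs: offline on the first poll, online on the second.
    fake = iter(["id 54\nstatus offline\nuse unused\n",
                 "id 54\nstatus online\nuse unused\n"])
    print(wait_for_drive(lambda: next(fake), sleep=lambda s: None))
```

In practice fetch would wrap the SSH call to the V7000. It is kept as a parameter here so the loop can be tried without access to a system.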

8- If the system is IBM PureData for Operational Analytics v1.1, mark the disk as “spare”.

IBM_Storwize:V7_00_1:superuser> chdrive -use spare 54

If you are in a v1.0 environment, mark the disk as a candidate. The disk will then become a member and the array will begin to use it.

IBM_Storwize:V7_00_1:superuser> chdrive -use candidate 54

After this step the array begins to use the disk and goes into rebuild. If that does not happen, set the disk to spare with the command above.

You can check the progress by running the following command:

IBM_Storwize:V7_00_1:superuser> lsarraymemberprogress

This reports the task (“rebuild”), the array name, the estimated time for completion, and other details.
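
To follow the rebuild from a script, the progress listing can be parsed by column name. The sketch below assumes the command is run with a colon delimiter (for example lsarraymemberprogress -delim :) so that empty columns stay in place. The column names and the sample row are assumptions for illustration, not output captured from this system, so check them against your own output.

```python
from typing import Dict, List


def parse_delimited(output: str, delim: str = ":") -> List[Dict[str, str]]:
    """Parse a header line plus rows into a list of dicts keyed by column."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split(delim)
    return [dict(zip(header, line.split(delim))) for line in lines[1:]]


def rebuild_tasks(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the rows whose task column is 'rebuild'."""
    return [row for row in rows if row.get("task") == "rebuild"]


# Hypothetical sample: column names and values are illustrative assumptions.
SAMPLE = (
    "mdisk_id:mdisk_name:member_id:drive_id:task:new_drive_id:"
    "progress:estimated_completion_time\n"
    "3:mdisk3:5:54:rebuild::42:240826230000\n"
)

if __name__ == "__main__":
    for row in rebuild_tasks(parse_delimited(SAMPLE)):
        print(row["mdisk_name"], row["drive_id"], row["progress"] + "%")
```

Keying the rows by the header line means the sketch does not depend on the column order.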

storage/v7k_replace_disk.txt · Last modified: 2024/08/26 22:03 by manu