This shows you the differences between two versions of the page.
storage:v7k_replace_disk [2021/12/11 12:26] manu created |
storage:v7k_replace_disk [2024/08/26 22:03] (current) manu |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== IBM V3700-V5000-V7000 disk replacement ====== | ====== IBM V3700-V5000-V7000 disk replacement ====== | ||
+ | |||
+ | ===== Procedure 1 ===== | ||
+ | |||
+ | I reviewed them and confirm that drive ID 6 slot 2 in enclosure 1 is offline and needs to be reseated. | ||
+ | |||
+ | Please follow these steps to reseat the drive and capture a drive dump: | ||
+ | |||
+ | * Mark the drive that requires a reseat as failed (even if this drive is already in 'failed' state) via **chdrive -use failed 6** (NOTE: check that there is sufficient redundancy in the array before doing this). | ||
+ | | ||
+ | * Mark the failed drive as unused via the GUI or via **chdrive -use unused 6**. This logically removes the drive from the array, so it will not be added back automatically after being reseated. | ||
+ | | ||
+ | * Re-seat the drive, or perform a virtual power cycle of the drive slot by running the command **chenclosureslot -slot 2 -power cycle 1**. Wait until the drive is online (can take up to 5 minutes) | ||
+ | | ||
+ | * If the drive comes back as 1674 "Drive offline because it is locked by a missing encryption key", secure erase this drive via GUI or CLI **chdrive -task erase 6**. Wait until the drive is online (can take up to 5 minutes) | ||
+ | | ||
+ | * Trigger at least two drive dumps after the power cycle with **triggerdrivedump 6**. This ensures that older data is de-staged from the FCM internal memory so that newer data can follow in the next drive dump. | ||
+ | | ||
+ | * Upload a SNAP option 1 (this will also include the drive dump) for IBM support review. Wait until IBM support has completed the drive dump analysis before integrating the drive back into the array. | ||
+ | | ||
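The unconditional commands of Procedure 1 can be collected in run order with a small helper. This is only a sketch: the function name `reseat_commands` and its parameters are hypothetical, the command strings are copied from the steps above, and the conditional **chdrive -task erase** step is deliberately left out because it applies only when the drive comes back locked. Nothing here connects to the system; it only prints the commands for review.

```python
from __future__ import annotations


def reseat_commands(drive_id: int, slot_id: int, enclosure_id: int) -> list[str]:
    """Return the Procedure 1 commands in the order they are run.

    Hypothetical helper: it only builds strings, it does not execute anything.
    The conditional secure-erase step is intentionally not included.
    """
    return [
        f"chdrive -use failed {drive_id}",    # mark the drive as failed
        f"chdrive -use unused {drive_id}",    # logically remove it from the array
        f"chenclosureslot -slot {slot_id} -power cycle {enclosure_id}",  # virtual power cycle
        f"triggerdrivedump {drive_id}",       # first drive dump
        f"triggerdrivedump {drive_id}",       # second drive dump
    ]


# Values from the example above: drive ID 6, slot 2, enclosure 1.
for command in reseat_commands(drive_id=6, slot_id=2, enclosure_id=1):
    print(command)
```

Printing the list before running anything makes it easy to check the drive ID, slot and enclosure against the GUI, and to confirm array redundancy before the first command is issued.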
+ | ===== Procedure 2 ===== | ||
From the V7000's command line: | From the V7000's command line: | ||
Line 17: | Line 37: | ||
3.1 - Identify the drive physically in the enclosure. This lights the LED so you can see exactly which one to pull | 3.1 - Identify the drive physically in the enclosure. This lights the LED so you can see exactly which one to pull | ||
- | + | <cli prompt='>'> | |
IBM_Storwize:V7_00_1:superuser> chenclosureslot -identify yes -slot 17 3 | IBM_Storwize:V7_00_1:superuser> chenclosureslot -identify yes -slot 17 3 | ||
+ | </cli> | ||
3.2 List the drive | 3.2 List the drive | ||
+ | <cli prompt='>'> | ||
IBM_Storwize:V7_00_1:superuser> svcinfo lsdrive 54 | IBM_Storwize:V7_00_1:superuser> svcinfo lsdrive 54 | ||
- | |||
- | |||
id 54 | id 54 | ||
status offline | status offline | ||
Line 53: | Line 71: | ||
port_1_status offline | port_1_status offline | ||
port_2_status offline | port_2_status offline | ||
+ | </cli> | ||
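If the detailed drive view is captured as text, it can be turned into key/value pairs for a quick status check. The sketch below assumes the layout shown in the listing above, one "name value" pair per line; the function name `parse_detail_view` is hypothetical, and the sample contains only fields that appear in that listing.

```python
from __future__ import annotations


def parse_detail_view(text: str) -> dict[str, str]:
    """Parse 'name value' lines into a dict (assumed layout, see lead-in)."""
    fields = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)  # split on the first run of whitespace
        if not parts:
            continue                         # skip blank lines
        # A field with no value is stored as an empty string.
        fields[parts[0]] = parts[1] if len(parts) > 1 else ""
    return fields


sample = """id 54
status offline
port_1_status offline
port_2_status offline"""

drive = parse_detail_view(sample)
if drive["status"] != "online":
    print(f"drive {drive['id']} is {drive['status']}")
```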
+ | 3.3 Check for any dependencies on the disk. This check is not part of the procedure steps, but it is worth doing | ||
- | + | <cli prompt='>'> | |
- | + | ||
- | 3.3 Check for any dependencies on the disk. This check is not part of the procedure steps, but it is worth doing | + |
IBM_Storwize:V7_00_1:superuser> lsdependentvdisks -drive 54 | IBM_Storwize:V7_00_1:superuser> lsdependentvdisks -drive 54 | ||
+ | </cli> | ||
Note: The disk is identified by its id, enclosure and slot. | Note: The disk is identified by its id, enclosure and slot. | ||
4- Take the drive offline (in this example the drive is already offline, but the command is shown anyway) | 4- Take the drive offline (in this example the drive is already offline, but the command is shown anyway) | ||
- | IBM_Storwize:V7_00_1:superuser> chdrive -use failed 54 | + | <cli prompt='>'> |
+ | IBM_Storwize:V7_00_1:superuser> chdrive -use failed 54 | ||
+ | </cli> | ||
4.1 List the drive again to verify that the drive is offline (repeat #3) | 4.1 List the drive again to verify that the drive is offline (repeat #3) | ||
Line 70: | Line 90: | ||
5- Mark the drive unused | 5- Mark the drive unused | ||
- | IBM_Storwize:V7_00_1:superuser> chdrive -use unused 54 | + | <cli prompt='>'> |
- | + | IBM_Storwize:V7_00_1:superuser> chdrive -use unused 54 | |
+ | </cli> | ||
6- Now you can replace the disk. Pull the old disk, wait 2 to 5 minutes, then insert the new one | 6- Now you can replace the disk. Pull the old disk, wait 2 to 5 minutes, then insert the new one | ||
Line 92: | Line 112: | ||
7- The drive will come back as online / unused. | 7- The drive will come back as online / unused. | ||
- | IBM_Storwize:V7_00_1:superuser> svcinfo lsdrive | + | <cli prompt='>'> |
+ | IBM_Storwize:V7_00_1:superuser> svcinfo lsdrive | ||
+ | </cli> | ||
8- On IBM PureData for Operational Analytics v1.1, mark the disk as "spare". | 8- On IBM PureData for Operational Analytics v1.1, mark the disk as "spare". | ||
+ | <cli prompt='>'> | ||
IBM_Storwize:V7_00_1:superuser> chdrive -use spare 54 | IBM_Storwize:V7_00_1:superuser> chdrive -use spare 54 | ||
+ | </cli> | ||
If you are in a v1.0 environment, mark the disk as a candidate. The disk will then become a member and the array will begin to use it. | If you are in a v1.0 environment, mark the disk as a candidate. The disk will then become a member and the array will begin to use it. | ||
+ | <cli prompt='>'> | ||
IBM_Storwize:V7_00_1:superuser> chdrive -use candidate 54 | IBM_Storwize:V7_00_1:superuser> chdrive -use candidate 54 | ||
+ | </cli> | ||
After this step the array begins to use the disk and goes into rebuild. If that does not happen, set the disk to spare (see the command above). | After this step the array begins to use the disk and goes into rebuild. If that does not happen, set the disk to spare (see the command above). | ||
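The choice in step 8 between "spare" and "candidate" can be written as a small lookup. This is a minimal sketch, assuming only the two environment versions named above; the function name `reintegration_command` and the version strings used as keys are hypothetical, and the function only builds the command string without running it.

```python
def reintegration_command(drive_id: int, environment_version: str) -> str:
    """Build the step 8 command for the given environment version.

    Hypothetical helper covering only the two cases described in step 8.
    """
    roles = {
        "1.1": "spare",      # v1.1: mark the disk as spare
        "1.0": "candidate",  # v1.0: mark the disk as candidate
    }
    if environment_version not in roles:
        raise ValueError(f"no rule for environment version {environment_version!r}")
    return f"chdrive -use {roles[environment_version]} {drive_id}"


print(reintegration_command(54, "1.1"))
print(reintegration_command(54, "1.0"))
```

Raising an error for any other version keeps the helper from guessing a role for an environment the procedure does not cover.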
You can check the progress by running the following command: | You can check the progress by running the following command: | ||
+ | <cli prompt='>'> | ||
IBM_Storwize:V7_00_1:superuser> lsarraymemberprogress | IBM_Storwize:V7_00_1:superuser> lsarraymemberprogress | ||
+ | </cli> | ||
This reports the task “rebuild”, the array name, the estimated completion time, and other details. | This reports the task “rebuild”, the array name, the estimated completion time, and other details. |