====== How to clean up a missing PV in a VG and resync the VG ======
If a VG has a missing PV (disk inaccessible or removed) and the VG is mirrored, the data remains accessible on the good copy. If the VG is not mirrored, the filesystems on the failed disk are lost.
First check the status of the disk:
[root@labotest]/root# lsvg -p vgExport
vgExport:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 511 95 00..00..00..00..95
hdisk2 missing 542 29 51..18..51..51..51
[root@labotest]/root# lsvg vgExport
VOLUME GROUP: vgExport VG IDENTIFIER: 00fa343c00004c000000016155668966
VG STATE: active PP SIZE: 64 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 1918 (122752 megabytes)
MAX LVs: 256 FREE PPs: 128 (8192 megabytes)
LVs: 18 USED PPs: 1790 (114560 megabytes)
OPEN LVs: 17 QUORUM: 1 (Disabled)
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 56
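To avoid reading the table by eye, the check can be scripted. The sketch below assumes the **lsvg -p** layout shown above (a banner line, a header line, then one line per PV); **missing_pvs** is a hypothetical helper name, not an AIX command.

```shell
# List the PVs that `lsvg -p <vg>` reports in the "missing" state.
# Reads the command output on stdin, so it also works on saved output.
missing_pvs() {
    # Line 1 is the "<vg>:" banner, line 2 is the column header.
    awk 'NR > 2 && $2 == "missing" { print $1 }'
}

# On AIX (assumed usage): lsvg -p vgExport | missing_pvs
```

An empty result means no PV is flagged as missing in that volume group.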
Then check disk availability:
[root@labotest]/root# cfgmgr
[root@labotest]/root# lspath
If all disks and paths are OK then you can continue, else check disk connections.
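A quick way to spot a bad path is to filter the **lspath** output. This sketch assumes the default three-column layout (status, device, parent); **bad_paths** is a hypothetical helper name.

```shell
# Print every path whose status is not "Enabled", as "device parent status".
# Reads `lspath` output on stdin.
bad_paths() {
    awk 'NF >= 3 && $1 != "Enabled" { print $2, $3, $1 }'
}

# On AIX (assumed usage): lspath | bad_paths
```

If nothing is printed, every listed path is Enabled.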
===== rootvg =====
A rootvg can only be synchronized if no dump device is in use.
Check whether the dump devices are set to **/dev/sysdumpnull**; if not, change them:
[root@labotest]/root# sysdumpdev -l
primary /dev/lg_dumplv
secondary /dev/lg_dumplv2
...
[root@labotest]/root# sysdumpdev -Pp /dev/sysdumpnull -s /dev/sysdumpnull
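Because the dump devices have to be put back later, it helps to record them before switching to /dev/sysdumpnull. A minimal sketch, assuming the **sysdumpdev -l** layout shown above; **dump_device** is a hypothetical helper name.

```shell
# Extract the primary or secondary dump device from `sysdumpdev -l` output.
# $1 is "primary" or "secondary"; the listing is read on stdin.
dump_device() {
    awk -v which="$1" '$1 == which { print $2 }'
}

# On AIX (assumed usage), before switching to /dev/sysdumpnull:
#   saved_primary=$(sysdumpdev -l | dump_device primary)
#   saved_secondary=$(sysdumpdev -l | dump_device secondary)
# and later:
#   sysdumpdev -Pp "$saved_primary" -s "$saved_secondary"
```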
Now make the missing disk available again:
[root@labotest]/root# lsvg -p rootvg
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk0 active 511 95 00..00..00..00..95
hdisk3 missing 542 29 51..18..51..51..51
[root@labotest]/root# chpv -v a hdisk3
[root@labotest]/root# varyonvg rootvg
[root@labotest]/root# lsvg -p rootvg
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk0 active 511 95 00..00..00..00..95
hdisk3 active 542 29 51..18..51..51..51
Restore the original dump devices:
[root@labotest]/root# sysdumpdev -Pp /dev/lg_dumplv -s /dev/lg_dumplv2
Now synchronize rootvg
[root@labotest]/root# syncvg -v rootvg
After a while it should show **STALE PPs: 0**
[root@labotest]/root# lsvg rootvg
VOLUME GROUP: rootvg VG IDENTIFIER: 00fa343c00004c000000016155668966
VG STATE: active PP SIZE: 64 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 1918 (122752 megabytes)
MAX LVs: 256 FREE PPs: 128 (8192 megabytes)
LVs: 18 USED PPs: 1790 (114560 megabytes)
OPEN LVs: 17 QUORUM: 1 (Disabled)
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 0
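To follow the synchronization without rereading the whole listing, the stale PP counter can be pulled out of the **lsvg** output. This sketch assumes the two-column layout shown above; **stale_pps** is a hypothetical helper name.

```shell
# Print the "STALE PPs:" value from `lsvg <vg>` output read on stdin.
stale_pps() {
    awk '{
        # Look for the token pair "STALE" "PPs:" and print the field after it.
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "STALE" && $(i + 1) == "PPs:") { print $(i + 2); exit }
    }'
}

# On AIX (assumed usage), wait until the mirror is back in sync:
#   while [ "$(lsvg rootvg | stale_pps)" != "0" ]; do sleep 60; done
```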
===== non rootvg =====
Change disk availability
[root@labotest]/root# lsvg -p vgExport
vgExport:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 511 95 00..00..00..00..95
hdisk2 missing 542 29 51..18..51..51..51
[root@labotest]/root# chpv -v a hdisk2
[root@labotest]/root# varyonvg vgExport
[root@labotest]/root# lsvg -p vgExport
vgExport:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 511 95 00..00..00..00..95
hdisk2 active 542 29 51..18..51..51..51
Now synchronize vgExport
[root@labotest]/root# syncvg -v vgExport
===== non rootvg in concurrent mode (PowerHA) =====
Check that the disks are in concurrent mode
[root@cl_abotest1]/root# lspv
...
hdisk5 00fa343ca582d6cd caavg_private active
hdisk6 00fa343ca582ed54 RG1vg concurrent
hdisk7 00fa343ca582ee4b RG1vg concurrent
Change disk availability, varyonvg with option **-c** for concurrent VG
[root@cl_abotest1]/root# lsvg -p RG1vg
RG1vg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk6 active 511 95 00..00..00..00..95
hdisk7 missing 542 29 51..18..51..51..51
[root@cl_abotest1]/root# chpv -v a hdisk7
[root@cl_abotest1]/root# varyonvg -c RG1vg
[root@cl_abotest1]/root# lsvg -p RG1vg
RG1vg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk6 active 511 95 00..00..00..00..95
hdisk7 active 542 29 51..18..51..51..51
Now synchronize RG1vg
[root@cl_abotest1]/root# syncvg -v RG1vg
===== other procedure =====
Before editing the ODM, which can be very dangerous, you can use a low-level command to correct the problem.
The single command we will use to remove the disk is:
ldeletepv -g VGID -p PVID
But, before we do, there are a number of steps we should follow as a matter of "best practice".
CASE: While the volume group is offline, maintenance is performed on the disks. One disk is/was damaged beyond repair, or replaced during the process.
Back in AIX, the volume group is to be reactivated.
[root@labotest]/root# lsvg -p vgExport
0516-010 : Volume group must be varied on; use varyonvg command.
[root@labotest]/root# varyonvg vgExport
PV Status: hdisk1 00c39b8d69c45344 PVACTIVE
hdisk2 00c39b8d043427b6 PVMISSING
The disk hdisk2 is PVMISSING. We assume hdisk2 with PVID 00c39b8d043427b6 is physically destroyed.
All the data is lost; however, the AIX ODM and the VGDA on all the other disks in the volume group do not know this yet.
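The missing disk and its PVID can be picked out of the **varyonvg** report with a small filter. This is a sketch that assumes the "name PVID state" layout shown above; **missing_from_varyonvg** is a hypothetical helper name. The usage line also assumes the report may be written to standard error, hence the redirection.

```shell
# Print "hdiskN PVID" for every disk that varyonvg reports as PVMISSING.
# Reads the varyonvg report on stdin.
missing_from_varyonvg() {
    # The state is the last field; name and PVID are the two fields before it.
    awk 'NF >= 3 && $NF == "PVMISSING" { print $(NF - 2), $(NF - 1) }'
}

# On AIX (assumed usage): varyonvg vgExport 2>&1 | missing_from_varyonvg
```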
First document what is lost. We need to know which logical volumes are (were) on the missing disk. Normally we could use lspv -l hdiskX; however, with the disk missing, this form of the command will not work.
Instead, we use the VGID (volume group identifier). (An undocumented variation, lspv -l PVID, is also available.)
1. Query the VGDA of the working disk to get the VGID and PVID of all disks in the volume group
[root@labotest]/root# lqueryvg -p hdisk1 -vPt
Physical: 00c39b8d69c45344 2 0
00c39b8d043427b6 1 0
VGid: 00c39b8d00004c000000011169c45a4b
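Since the VGID and PVIDs are long hexadecimal strings, copying them by hand is error prone. The sketch below extracts them from the **lqueryvg -vPt** output, assuming the tagged layout shown above; **vgid_of** and **pvids_of** are hypothetical helper names.

```shell
# Print the VGID from `lqueryvg -p <disk> -vPt` output read on stdin.
vgid_of() {
    awk '$1 == "VGid:" { print $2 }'
}

# Print every PVID listed under the "Physical:" tag, one per line.
pvids_of() {
    awk '
        $1 == "Physical:" { inlist = 1; print $2; next }  # first PVID shares the tag line
        $1 ~ /:$/         { inlist = 0; next }            # any other tag ends the list
        inlist && NF > 0  { print $1 }                    # continuation lines
    '
}

# On AIX (assumed usage):
#   vgid=$(lqueryvg -p hdisk1 -vPt | vgid_of)
#   lqueryvg -p hdisk1 -vPt | pvids_of
```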
2. Get a list of all the logical volumes on the missing disk
[root@labotest]/root# lspv -l -v 00c39b8d00004c000000011169c45a4b hdisk2
hdisk2:
LV NAME LPs PPs DISTRIBUTION MOUNT POINT
lvTest 512 512 109..108..108..108..79 /scratch
loglv00 1 1 00..00..00..00..01 N/A
(Note: lspv -l 00c39b8d043427b6 should give us the same output!)
3. Verify all filesystems are unmounted.
[root@labotest]/root# lsvg -l vgExport
vgExport:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvExport jfs2 416 416 1 closed/syncd /export
lvTest jfs 512 512 1 closed/syncd /scratch
loglv00 jfslog 1 1 1 closed/syncd N/A
With this info I know that any data in /scratch is suspect, and should be restored from a backup.
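The "all closed" check can also be done mechanically. A minimal sketch, assuming the **lsvg -l** layout shown above, where the sixth column is the LV state; **open_lvs** is a hypothetical helper name.

```shell
# Print "lvname mountpoint" for every logical volume that is still open.
# Reads `lsvg -l <vg>` output on stdin.
open_lvs() {
    # Line 1 is the "<vg>:" banner, line 2 the header; $6 is e.g. "open/syncd".
    awk 'NR > 2 && $6 ~ /^open/ { print $1, $7 }'
}

# On AIX (assumed usage): lsvg -l vgExport | open_lvs
```

No output means every logical volume in the group is closed.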
4. Remove the logical volumes from the volume group before deleting the VGDA from the other disks.
[root@labotest]/root# rmfs /scratch
rmfs: 0506-936 Cannot read superblock on /dev/lvTest.
rmfs: 0506-936 Cannot read superblock on /scratch.
rmfs: Unable to clear superblock on /scratch
rmlv: Logical volume lvTest is removed.
[root@labotest]/root# rmlv loglv00
Warning, all data contained on logical volume loglv00 will be destroyed.
rmlv: Do you wish to continue? y(es) n(o)? y
rmlv: Logical volume loglv00 is removed.
[root@labotest]/root# lsvg -p vgExport
vgExport:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 511 95 00..00..00..00..95
hdisk2 missing 542 29 51..18..51..51..51
[root@labotest]/root# lsvg -l vgExport
vgExport:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvExport jfs2 416 416 1 closed/syncd /export
5. The volume group has been prepared - all damaged logical volume definitions have been removed.
All that is remaining for cleanup is to remove the definition of the damaged disk from the VGDA of the remaining disk(s).
[root@labotest]/root# ldeletepv -g 00c39b8d00004c000000011169c45a4b -p 00c39b8d043427b6
Note: there is no output for the above command when everything goes as expected.
Now use the regular AIX commands to verify that the VGDA and ODM are in order.
[root@labotest]/root# lsvg -p vgExport
vgExport:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 511 95 00..00..00..00..95
[root@labotest]/root# mount /export
[root@labotest]/root# lsvg -l vgExport
vgExport:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvExport jfs2 416 416 1 open/syncd /export
6. Various steps that I will only list here:
a. add a new disk to the volume group (extendvg)
b. remake the deleted logical volumes (mklv)
c. format, as needed, the log logical volumes (logform)
d. create the filesystems (crfs, or use smit)
e. restore the data from a backup (restore, tar, cpio, etc.)
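As a rough sketch of steps a to d for this example, the helper below only prints the commands so they can be reviewed before anything is run. The LV names and sizes are taken from the listings above; the replacement disk name (hdisk3 in the usage line) and the helper name **rebuild_plan** are assumptions. Step e depends on the backup tool and is left out.

```shell
# Print (do not run) the commands that would recreate the lost LVs.
# $1 = volume group, $2 = replacement disk (hypothetical name).
rebuild_plan() {
    vg=$1
    newdisk=$2
    echo "extendvg $vg $newdisk"                      # a. add the new disk
    echo "mklv -y loglv00 -t jfslog $vg 1 $newdisk"   # b. recreate the jfs log LV
    echo "logform /dev/loglv00"                       # c. format the log LV
    echo "mklv -y lvTest -t jfs $vg 512 $newdisk"     # b. recreate the data LV
    echo "crfs -v jfs -d lvTest -m /scratch -A yes"   # d. create the filesystem
}

# Review the plan first: rebuild_plan vgExport hdisk3
```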
Summary
This procedure is much less error prone than using ODM commands. All the commands demonstrated here have been available in AIX for disk management since at least 1995
(when AIX 4 first came out). They may have been in AIX 3 as well, taking it back to 1991 or earlier.
**Important commands to review**
lspv -l -v VGID hdiskX
lqueryvg
ldeletepv