====== GPFS operations ======
===== Add a new disk to a filesystem or create a new filesystem =====
==== Create NSD (Network Shared Disk) disks (GPFS format) ====
Before adding a disk to a GPFS filesystem, you have to define some parameters for it:
* name: logical name corresponding to the filesystem. This name can contain only the following characters: 'A' through 'Z', 'a' through 'z', '0' through '9', or '_' (the underscore). All other characters are not valid.
* failureGroup: important setting. Maximum of 3 different IDs per filesystem; each failure group corresponds to one copy of the data. If no copy is required, use only one failure group. For example, use 2 for the first copy and 3 for the second copy. mmlsdisk shows the current failure group IDs.
* usage: always use dataAndMetadata, unless you want to tune placement, in which case you can put metadata on high-speed disks.
* pool: always use system (other pools are for advanced users).
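These parameters end up in a stanza file (see "Build NSD file" below). As a quick illustration, a minimal stanza could look like this (the device, NSD and server names here are placeholders only):
%nsd:
device=/dev/dm-XX
nsd=GPFS_NSD_DATA_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system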
=== Identify the SAN disks ===
Without multipathing (for example VMware RDM), you can use:
[root@gpfs01 ~]# lsscsi -s
[0:2:0:0] disk IBM ServeRAID M1115 2.13 /dev/sda 298GB
[0:2:1:0] disk IBM ServeRAID M1115 2.13 /dev/sdb 198GB
[1:0:0:0] cd/dvd IBM SATA DEVICE 81Y3676 IBD1 /dev/sr0 -
[7:0:0:0] disk IBM 2145 0000 /dev/sdii 274GB
[7:0:0:1] disk IBM 2145 0000 /dev/sdc 21.4GB
[7:0:0:2] disk IBM 2145 0000 /dev/sdd 1.09TB
...
List the UUID/WWN serial numbers corresponding to the devices:
[root@gpfs01 ~]# ll /dev/disk/by-id/
...
0 lrwxrwxrwx 1 root root 10 Sep 30 01:30 dm-uuid-mpath-36005076300810163a00000000000006a -> ../../dm-4
0 lrwxrwxrwx 1 root root 10 Sep 30 01:30 dm-uuid-mpath-36005076300810163a00000000000006b -> ../../dm-5
...
0 lrwxrwxrwx 1 root root 10 Sep 30 15:05 wwn-0x60050764008181c46800000000000058 -> ../../sdml
0 lrwxrwxrwx 1 root root 10 Sep 30 15:05 wwn-0x60050764008181c46800000000000059 -> ../../sdmm
For multipath devices, use:
[root@gpfs01 ~]# multipath -ll | egrep "mpath|size" | paste -d " " - -
mpathcu (360050764008181c46800000000000042) dm-126 IBM ,2145 size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
mpathbp (360050764008181c46800000000000030) dm-23 IBM ,2145 size=1.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
...
Rescan the SCSI bus so the new disk gets its device name, then identify your disk and use the **dm-xx** name:
[root@gpfs01 scripts]# rescan-scsi-bus.sh -a
[root@gpfs01 ~]# multipath -ll | egrep "mpath|size" | paste -d " " - -
...
mpathbd (360050764008181c46800000000000023) dm-47 IBM ,2145 size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
Pool: always **system** (the default).
=== List failure group ===
Check the existing failure groups: A is 2 and B is 3.
[root@gpfs01 ~]# mmlsdisk gpfs01
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
GPFS_NSD_DATA_B_08 nsd 512 3 Yes Yes ready up system
...
GPFS_NSD_DATA_A_13 nsd 512 2 Yes Yes ready up system
=== Identify NSDs in use and free disks (new disks) ===
NSDs in use:
[root@gpfs01 ~]# mmlsnsd -X | grep gpfs01-hb | awk '{print $3}' | sort
/dev/dm-10
/dev/dm-11
/dev/dm-12
/dev/dm-13
...
List all multipath disks:
[root@gpfs01 ~]# multipath -ll | egrep "mpath|size" | paste -d " " - - | tr ' ' '\n' | grep 'dm-' | sed 's/^/\/dev\//' | sort
/dev/dm-50
/dev/dm-51
...
**Difference** (disks that are not yet NSDs):
multipath -ll | egrep "mpath|size" | paste -d " " - - | tr ' ' '\n' | grep 'dm-' | sed 's/^/\/dev\//' | sort > /tmp/disk_all.txt
mmlsnsd -X | grep gpfs01-hb | awk '{print $3}' | sort > /tmp/disk_nsd.txt
sdiff -sw100 /tmp/disk_all.txt /tmp/disk_nsd.txt
=== Build NSD file ===
Create a text file containing the list of NSD disks to add and their characteristics.
[root@gpfs01 scripts]# cat list.disks_CESSHARE.txt
%nsd:
device=/dev/dm-47
nsd=GPFS_NSD_CESSHARE_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system
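A stanza file may contain several %nsd blocks, one per disk to add. A sketch with a second, hypothetical disk placed in the other failure group, so that each data copy lands on different disks (the server order is reversed on the second disk, a common way to balance the preferred NSD server between the two nodes):
%nsd:
device=/dev/dm-47
nsd=GPFS_NSD_CESSHARE_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system
%nsd:
device=/dev/dm-48
nsd=GPFS_NSD_CESSHARE_B_01
servers=gpfs02-hb,gpfs01-hb
usage=dataAndMetadata
failureGroup=3
pool=system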
Create the NSD (network shared disk); -v yes verifies that the disk is not already in use:
[root@gpfs01 ~]# mmcrnsd -F list.disks_CESSHARE.txt -v yes
mmcrnsd: Processing disk dm-8
mmcrnsd: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
The disk is now formatted but still free, not attached to any filesystem:
[root@gpfs01 ~]# fdisk -l /dev/dm-47
...
Disk /dev/dm-47: 21.5 GB, 21474836480 bytes, 41943040 sectors
...
Disk label type: gpt
Disk identifier: 7A94FA63-6A6C-4001-89E8-E36D00B3F66E
# Start End Size Type Name
1 48 41942991 20G IBM General Par GPFS:
[root@gpfs01 ~]# mmlsnsd -L
File system Disk name NSD volume ID NSD servers
---------------------------------------------------------------------------------------------
cesshared01lv GPFS_NSD_CESSHARE01 0A0113A15B0BFD87 gpfs01-hb,gpfs02-hb
...
(free disk) GPFS_NSD_CESSHARE_A_01 0A0113A15E2EB417 gpfs01-hb,gpfs02-hb
[root@gpfs01 ~]# mmlsnsd -X
Disk name NSD volume ID Device Devtype Node name Remarks
---------------------------------------------------------------------------------------------------
GPFS_NSD_CESSHARE01 0A0113A15B0BFD87 /dev/dm-9 dmm gpfs01-hb server node
GPFS_NSD_CESSHARE01 0A0113A15B0BFD87 /dev/dm-2 dmm gpfs02-hb server node
GPFS_NSD_CESSHARE_A_01 0A0113A15E2EB417 /dev/dm-47 dmm gpfs01-hb server node
GPFS_NSD_CESSHARE_A_01 0A0113A15E2EB417 /dev/dm-47 dmm gpfs02-hb server node
.....
==== Add the NSD disk to a filesystem ====
=== Add NSD to a current filesystem ===
First list unused disks
[root@gpfs01 ~]# mmlsnsd -F
File system Disk name NSD servers
---------------------------------------------------------------------------
(free disk) GPFS_NSD_CESSHARE_A_01 gpfs01-hb,gpfs02-hb
Create a stanza file, as for the NSD creation:
[root@gpfs01 scripts]# cat list.disks_CESSHARE.txt
%nsd:
device=/dev/dm-47
nsd=GPFS_NSD_CESSHARE_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system
Now add the disk to the filesystem and rebalance the blocks (if 2 copies of data are required, the second copy will be created):
[root@gpfs01 ~]# mmadddisk /dev/cesshared01lv -F list.disks_CESSHARE.txt -r
The following disks of cesshared01lv will be formatted on node gpfs02:
GPFS_NSD_CESSHARE_A_01: size 20480 MB
Extending Allocation Map
Checking Allocation Map for storage pool system
Completed adding disks to file system cesshared01lv.
mmadddisk: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
Restriping /dev/cesshared01lv ...
Scanning file system metadata, phase 1 ...
100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning file system metadata, phase 2 ...
100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning file system metadata, phase 3 ...
100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning file system metadata, phase 4 ...
100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Mon Jan 27 11:28:13 2020 ( 65792 inodes with total 404 MB data processed)
Scan completed successfully.
Done
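Optionally, check the free space per disk to confirm that the new NSD is now part of the filesystem (output not shown):
[root@gpfs01 ~]# mmdf cesshared01lv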
Check the number of copies of a file --> only 1 copy of data and metadata! We will add a second copy using the **mmchfs** command, then restripe to copy data and metadata onto the second failure group's disks, and finally restripe a second time to optimize data placement:
[root@gpfs01 connections]# mmlsattr /CESshared/ha/nfs/ganesha/gpfs-epoch
replication factors
metadata(max) data(max) file [flags]
------------- --------- ---------------
1 ( 2) 1 ( 2) /CESshared/ha/nfs/ganesha/gpfs-epoch
[root@gpfs01 connections]# mmchfs cesshared01lv -m 2 -r 2
[root@gpfs01 connections]# mmrestripefs cesshared01lv -R
Scanning file system metadata, phase 1 ...
100 % complete on Mon Jan 27 12:50:45 2020
Scan completed successfully.
...
100.00 % complete on Mon Jan 27 12:50:46 2020 ( 65792 inodes with total 808 MB data processed)
Scan completed successfully.
[root@gpfs01 connections]# mmlsattr /CESshared/ha/nfs/ganesha/gpfs-epoch
replication factors
metadata(max) data(max) file [flags]
------------- --------- ---------------
2 ( 2) 2 ( 2) /CESshared/ha/nfs/ganesha/gpfs-epoch [unbalanced]
[root@gpfs01 connections]#
Optimize data placement
[root@gpfs01 connections]# mmrestripefs cesshared01lv -b
Scanning file system metadata, phase 1 ...
100 % complete on Mon Jan 27 12:51:56 2020
Scan completed successfully.
...
100.00 % complete on Mon Jan 27 12:51:57 2020 ( 65792 inodes with total 808 MB data processed)
Scan completed successfully.
[root@gpfs01 connections]# mmlsattr /CESshared/ha/nfs/ganesha/gpfs-epoch
replication factors
metadata(max) data(max) file [flags]
------------- --------- ---------------
2 ( 2) 2 ( 2) /CESshared/ha/nfs/ganesha/gpfs-epoch
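The default and maximum replication factors can also be checked at the filesystem level (output not shown):
[root@gpfs01 ~]# mmlsfs cesshared01lv -m -M -r -R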
=== Add NSD to a new filesystem ===
This is an example of creating a filesystem with the previously defined NSD disk: block size 512K (-B), 2 copies of data and metadata (-m/-r), quotas enabled (-Q), mount point /CESshared (-T), NFSv4 locking and ACL semantics (-D/-k), and automatic mount at startup (-A yes):
[root@gpfs01 connections]# mmcrfs cesshared01lv -F list.disks_CESSHARE.txt -B 512K -m 2 -r 2 -Q yes -T /CESshared -v yes -D nfs4 -k nfs4 -A yes
[root@gpfs01 connections]# mmmount cesshared01lv
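To verify the new filesystem, list its attributes and check on which nodes it is mounted (output not shown):
[root@gpfs01 ~]# mmlsfs cesshared01lv
[root@gpfs01 ~]# mmlsmount cesshared01lv -L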
===== Remove a disk =====
To delete GPFS_NSD_DATA01 from file system gpfs01 (its data will be migrated to the remaining disks), first list the disks, then issue the mmdeldisk command:
[root@gpfs01 ~]# mmlsdisk gpfs01
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
GPFS_NSD_DATA01 nsd 512 2 Yes Yes ready up system
GPFS_NSD_DATA02 nsd 512 2 Yes Yes ready up system
GPFS_NSD_DATA03 nsd 512 2 Yes Yes ready up system
GPFS_NSD_DATA04 nsd 512 2 Yes Yes ready up system
GPFS_NSD_DATA05 nsd 512 2 Yes Yes ready up system
GPFS_NSD_DATA06 nsd 512 2 Yes Yes ready up system
GPFS_NSD_DATA07 nsd 512 2 Yes Yes ready up system
[root@gpfs01 ~]# mmdeldisk gpfs01 GPFS_NSD_DATA01
Now you can delete the NSD GPFS_NSD_DATA01 from the GPFS cluster: first check that the disk is free, then issue this command:
[root@gpfs01 scripts]# mmlsnsd -F
File system Disk name NSD servers
---------------------------------------------------------------------------
(free disk) GPFS_NSD_DATA01 gpfs01-hb,gpfs02-hb
[root@gpfs01 ~]# mmdelnsd GPFS_NSD_DATA01
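You can then confirm that the NSD definition is gone; the following should return nothing, after which the LUN can be unmapped on the storage side if it is no longer needed:
[root@gpfs01 ~]# mmlsnsd | grep GPFS_NSD_DATA01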
==== Remove a node from the GPFS cluster ====
* Remove the disks that belong to the server you want to remove
[root@labo_2_new]/root# mmlsnsd -m
Disk name NSD volume ID Device Node name Remarks
---------------------------------------------------------------------------------------
diskh1 AC131C344EE5D6E8 /dev/hdisk2 labo_1 server node
diskh1 AC131C344EE5D6E8 /dev/hdisk2 labo_2 server node
diskh2 AC131C344EE600F2 /dev/hdisk4 labo_1 server node
diskh2 AC131C344EE600F2 /dev/hdisk4 labo_2 server node
diskk1 AC131C364EE5D6EC /dev/descgpfs1lv labo_s server node
diskk2 AC131C364EE600F6 /dev/descgpfs2lv labo_s server node
diskr1 AC131C344EE5D6EA /dev/hdisk3 labo_1 server node
diskr1 AC131C344EE5D6EA /dev/hdisk3 labo_2 server node
diskr2 AC131C344EE600F4 /dev/hdisk5 labo_1 server node
diskr2 AC131C344EE600F4 /dev/hdisk5 labo_2 server node
[root@labo_2_new]/root# mmlspv
[root@labo_2_new]/root# mmlsdisk orafs1
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1 nsd 512 1 yes yes ready up system
diskr1 nsd 512 2 yes yes ready up system
diskk1 nsd 512 3 no no ready up system
[root@labo_2_new]/root# mmdeldisk orafs1 diskk1
Deleting disks ...
Scanning system storage pool
Scanning file system metadata, phase 1 ...
Scan completed successfully.
Scanning file system metadata, phase 2 ...
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Fri Jan 6 09:52:41 2012
Scan completed successfully.
Checking Allocation Map for storage pool 'system'
tsdeldisk completed.
mmdeldisk: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmlsdisk orafs1
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1 nsd 512 1 yes yes ready up system
diskr1 nsd 512 2 yes yes ready up system
[root@labo_2_new]/root# mmlsdisk orafs2
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2 nsd 512 4 yes yes ready up system
diskr2 nsd 512 5 yes yes ready up system
diskk2 nsd 512 6 no no ready up system
[root@labo_2_new]/root# mmdeldisk orafs2 diskk2
Deleting disks ...
Scanning system storage pool
Scanning file system metadata, phase 1 ...
Scan completed successfully.
Scanning file system metadata, phase 2 ...
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Fri Jan 6 09:55:30 2012
Scan completed successfully.
Checking Allocation Map for storage pool 'system'
tsdeldisk completed.
mmdeldisk: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmlsdisk orafs2
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2 nsd 512 4 yes yes ready up system
diskr2 nsd 512 5 yes yes ready up system
* Remove the NSDs that belong to the server you want to remove
[root@labo_2_new]/root# mmlsnsd -m
Disk name NSD volume ID Device Node name Remarks
---------------------------------------------------------------------------------------
diskh1 AC131C344EE5D6E8 /dev/hdisk2 labo_1 server node
diskh1 AC131C344EE5D6E8 /dev/hdisk2 labo_2 server node
diskh2 AC131C344EE600F2 /dev/hdisk4 labo_1 server node
diskh2 AC131C344EE600F2 /dev/hdisk4 labo_2 server node
diskk1 AC131C364EE5D6EC /dev/descgpfs1lv labo_s server node
diskk2 AC131C364EE600F6 /dev/descgpfs2lv labo_s server node
diskr1 AC131C344EE5D6EA /dev/hdisk3 labo_1 server node
diskr1 AC131C344EE5D6EA /dev/hdisk3 labo_2 server node
diskr2 AC131C344EE600F4 /dev/hdisk5 labo_1 server node
diskr2 AC131C344EE600F4 /dev/hdisk5 labo_2 server node
[root@labo_2_new]/root# mmdelnsd "diskk1;diskk2"
mmdelnsd: Processing disk diskk1
mmdelnsd: Processing disk diskk2
mmdelnsd: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmlsnsd -m
Disk name NSD volume ID Device Node name Remarks
---------------------------------------------------------------------------------------
diskh1 AC131C344EE5D6E8 /dev/hdisk2 labo_1 server node
diskh1 AC131C344EE5D6E8 /dev/hdisk2 labo_2 server node
diskh2 AC131C344EE600F2 /dev/hdisk4 labo_1 server node
diskh2 AC131C344EE600F2 /dev/hdisk4 labo_2 server node
diskr1 AC131C344EE5D6EA /dev/hdisk3 labo_1 server node
diskr1 AC131C344EE5D6EA /dev/hdisk3 labo_2 server node
diskr2 AC131C344EE600F4 /dev/hdisk5 labo_1 server node
diskr2 AC131C344EE600F4 /dev/hdisk5 labo_2 server node
* The server is now still a member of the GPFS cluster, but without resources:
* Stop GPFS on the member to be removed
[root@labo_s_new]/root# mmshutdown
Fri Jan 6 10:03:32 CET 2012: mmshutdown: Starting force unmount of GPFS file systems
Fri Jan 6 10:03:37 CET 2012: mmshutdown: Shutting down GPFS daemons
Shutting down!
'shutdown' command about to kill process 17816
Fri Jan 6 10:03:42 CET 2012: mmshutdown: Finished
* Remove the member from GPFS
[root@labo_2_new]/root# mmlscluster
GPFS cluster information
========================
GPFS cluster name: gpfsOracle.labo_2
GPFS cluster id: 12399285214363632796
GPFS UID domain: gpfsOracle.labo_2
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
GPFS cluster configuration servers:
-----------------------------------
Primary server: labo_2
Secondary server: labo_1
Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------
1 labo_1 10.10.10.52 labo_1 quorum
2 labo_2 10.10.10.53 labo_2 quorum
3 labo_s 10.10.10.54 labo_s quorum
[root@labo_2_new]/root# mmgetstate -aLs
Node number Node name Quorum Nodes up Total nodes GPFS state Remarks
------------------------------------------------------------------------------------
1 labo_1 2 2 3 active quorum node
2 labo_2 2 2 3 active quorum node
3 labo_s 0 0 3 down quorum node
Summary information
---------------------
Number of nodes defined in the cluster: 3
Number of local nodes active in the cluster: 2
Number of remote nodes joined in this cluster: 0
Number of quorum nodes defined in the cluster: 3
Number of quorum nodes active in the cluster: 2
Quorum = 2, Quorum achieved
[root@labo_2_new]/root# mmdelnode -N labo_s
Verifying GPFS is stopped on all affected nodes ...
mmdelnode: Command successfully completed
mmdelnode: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster
GPFS cluster information
========================
GPFS cluster name: gpfsOracle.labo_2
GPFS cluster id: 12399285214363632796
GPFS UID domain: gpfsOracle.labo_2
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
GPFS cluster configuration servers:
-----------------------------------
Primary server: labo_2
Secondary server: labo_1
Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------
1 labo_1 10.10.10.52 labo_1 quorum
2 labo_2 10.10.10.53 labo_2 quorum
[root@labo_2_new]/root# mmgetstate -aLs
Node number Node name Quorum Nodes up Total nodes GPFS state Remarks
------------------------------------------------------------------------------------
1 labo_1 2 2 2 active quorum node
2 labo_2 2 2 2 active quorum node
Summary information
---------------------
Number of nodes defined in the cluster: 2
Number of local nodes active in the cluster: 2
Number of remote nodes joined in this cluster: 0
Number of quorum nodes defined in the cluster: 2
Number of quorum nodes active in the cluster: 2
Quorum = 2, Quorum achieved
==== Add a node to the GPFS cluster ====
* Add a new node to the GPFS cluster: first add the node as nonquorum, then change it to quorum (otherwise you would need to stop the whole cluster).
[root@labo_2_new]/root# mmlscluster
GPFS cluster information
========================
GPFS cluster name: gpfsOracle.labo_2
GPFS cluster id: 12399285214363632796
GPFS UID domain: gpfsOracle.labo_2
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
GPFS cluster configuration servers:
-----------------------------------
Primary server: labo_2
Secondary server: labo_1
Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------
1 labo_1 10.10.10.52 labo_1 quorum
2 labo_2 10.10.10.53 labo_2 quorum
[root@labo_2_new]/root# mmaddnode -N labo_s:nonquorum
Fri Jan 6 12:37:14 CET 2012: mmaddnode: Processing node labo_s
mmaddnode: Command successfully completed
mmaddnode: Warning: Not all nodes have proper GPFS license designations.
Use the mmchlicense command to designate licenses as needed.
mmaddnode: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster
===============================================================================
| Warning: |
| This cluster contains nodes that do not have a proper GPFS license |
| designation. This violates the terms of the GPFS licensing agreement. |
| Use the mmchlicense command and assign the appropriate GPFS licenses |
| to each of the nodes in the cluster. For more information about GPFS |
| license designation, see the Concepts, Planning, and Installation Guide. |
===============================================================================
GPFS cluster information
========================
GPFS cluster name: gpfsOracle.labo_2
GPFS cluster id: 12399285214363632796
GPFS UID domain: gpfsOracle.labo_2
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
GPFS cluster configuration servers:
-----------------------------------
Primary server: labo_2
Secondary server: labo_1
Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------
1 labo_1 10.10.10.52 labo_1 quorum
2 labo_2 10.10.10.53 labo_2 quorum
3 labo_s 10.10.10.54 labo_s
[root@labo_2_new]/root# mmchlicense server --accept -N labo_s
The following nodes will be designated as possessing GPFS server licenses:
labo_s
mmchlicense: Command successfully completed
mmchlicense: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmchnode --quorum -N labo_s
Fri Jan 6 12:39:26 CET 2012: mmchnode: Processing node labo_s
mmchnode: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster
GPFS cluster information
========================
GPFS cluster name: gpfsOracle.labo_2
GPFS cluster id: 12399285214363632796
GPFS UID domain: gpfsOracle.labo_2
Remote shell command: /usr/bin/ssh
Remote file copy command: /usr/bin/scp
GPFS cluster configuration servers:
-----------------------------------
Primary server: labo_2
Secondary server: labo_1
Node Daemon node name IP address Admin node name Designation
-----------------------------------------------------------------------------------------------
1 labo_1 10.10.10.52 labo_1 quorum
2 labo_2 10.10.10.53 labo_2 quorum
3 labo_s 10.10.10.54 labo_s quorum
[root@labo_2_new]/root# mmgetstate -aLs
Node number Node name Quorum Nodes up Total nodes GPFS state Remarks
------------------------------------------------------------------------------------
1 labo_1 2 2 3 active quorum node
2 labo_2 2 2 3 active quorum node
3 labo_s 0 0 3 down quorum node
Summary information
---------------------
Number of nodes defined in the cluster: 3
Number of local nodes active in the cluster: 2
Number of remote nodes joined in this cluster: 0
Number of quorum nodes defined in the cluster: 3
Number of quorum nodes active in the cluster: 2
Quorum = 2, Quorum achieved
* Start GPFS on the new node:
[root@labo_2_new]/root# mmgetstate -aLs
Node number Node name Quorum Nodes up Total nodes GPFS state Remarks
------------------------------------------------------------------------------------
1 labo_1 2 2 3 active quorum node
2 labo_2 2 2 3 active quorum node
3 labo_s 0 0 3 down quorum node
Summary information
---------------------
Number of nodes defined in the cluster: 3
Number of local nodes active in the cluster: 2
Number of remote nodes joined in this cluster: 0
Number of quorum nodes defined in the cluster: 3
Number of quorum nodes active in the cluster: 2
Quorum = 2, Quorum achieved
[root@labo_s_new]/root# mmstartup
Fri Jan 6 12:40:45 CET 2012: mmstartup: Starting GPFS ...
[root@labo_2_new]/root# mmgetstate -aLs
Node number Node name Quorum Nodes up Total nodes GPFS state Remarks
------------------------------------------------------------------------------------
1 labo_1 2 3 3 active quorum node
2 labo_2 2 3 3 active quorum node
3 labo_s 2 3 3 active quorum node
Summary information
---------------------
Number of nodes defined in the cluster: 3
Number of local nodes active in the cluster: 3
Number of remote nodes joined in this cluster: 0
Number of quorum nodes defined in the cluster: 3
Number of quorum nodes active in the cluster: 3
Quorum = 2, Quorum achieved
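Once the node is active, you can also check that the GPFS filesystems are mounted on it (output not shown):
[root@labo_2_new]/root# mmlsmount all -L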
* Create the NSD description files and create the NSDs (the files use the legacy colon-delimited descriptor format; a stanza-format equivalent is sketched after the listing below)
[root@labo_2_new]/root# cat gpfsk_disk1
/dev/descgpfs1lv:labo_s::descOnly:3:diskk1
[root@labo_2_new]/root# mmcrnsd -F gpfsk_disk1
[root@labo_2_new]/root# cat gpfsk_disk2
/dev/descgpfs2lv:labo_s::descOnly:6:diskk2
[root@labo_2_new]/root# mmcrnsd -F gpfsk_disk2
[root@labo_2_new]/root# mmlsnsd
File system Disk name NSD servers
---------------------------------------------------------------------------
orafs1 diskh1 labo_1,labo_2
orafs1 diskr1 labo_1,labo_2
orafs2 diskh2 labo_1,labo_2
orafs2 diskr2 labo_1,labo_2
(free disk) diskk1 labo_s
(free disk) diskk2 labo_s
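Note: these description files use the legacy colon-delimited disk descriptor format (DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup:DesiredName). On recent GPFS / Spectrum Scale releases the same definition would normally be written as a %nsd stanza, for example (a sketch, not a transcript from this cluster):
%nsd:
device=/dev/descgpfs1lv
nsd=diskk1
servers=labo_s
usage=descOnly
failureGroup=3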
* Add the new disks to the filesystems and restripe (-r)
[root@labo_1_new]/kondor# mmlsdisk orafs1
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1 nsd 512 1 yes yes ready up system
diskr1 nsd 512 2 yes yes ready up system
[root@labo_2_new]/root# mmadddisk orafs1 -F gpfsk_disk1 -r
The following disks of orafs1 will be formatted on node labo_2:
diskk1: size 163840 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
Completed adding disks to file system orafs1.
mmadddisk: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
Restriping orafs1 ...
Scanning file system metadata, phase 1 ...
Scan completed successfully.
Scanning file system metadata, phase 2 ...
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Fri Jan 6 14:19:56 2012
Scan completed successfully.
Done
[root@labo_1_new]/kondor# mmlsdisk orafs1
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1 nsd 512 1 yes yes ready up system
diskr1 nsd 512 2 yes yes ready up system
diskk1 nsd 512 3 no no ready up system
[root@labo_1_new]/kondor# mmlsdisk orafs2
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2 nsd 512 4 yes yes ready up system
diskr2 nsd 512 5 yes yes ready up system
[root@labo_2_new]/root# mmadddisk orafs2 -F gpfsk_disk2 -r
The following disks of orafs2 will be formatted on node labo_1:
diskk2: size 163840 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
Completed adding disks to file system orafs2.
mmadddisk: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
Restriping orafs2 ...
Scanning file system metadata, phase 1 ...
Scan completed successfully.
Scanning file system metadata, phase 2 ...
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
Scan completed successfully.
Scanning user file metadata ...
100.00 % complete on Fri Jan 6 14:21:17 2012
Scan completed successfully.
Done
[root@labo_1_new]/kondor# mmlsdisk orafs2
disk driver sector failure holds holds storage
name type size group metadata data status availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2 nsd 512 4 yes yes ready up system
diskr2 nsd 512 5 yes yes ready up system
diskk2 nsd 512 6 no no ready up system
* Change the unmountOnDiskFail parameter for the new node so that, if its descOnly disk fails, the filesystem is force-unmounted locally on that node only, rather than the disk being marked down for the whole cluster:
[root@labo_2_new]/root# mmlsconfig
Configuration data for cluster gpfsOracle.labo_2:
---------------------------------------------------
clusterName gpfsOracle.labo_2
clusterId 12399285214363632796
autoload yes
minReleaseLevel 3.3.0.2
dmapiFileHandleSize 32
unmountOnDiskFail no
maxMBpS 300
pagepool 256M
adminMode central
File systems in cluster gpfsOracle.labo_2:
--------------------------------------------
/dev/orafs1
/dev/orafs2
[root@labo_2_new]/root# mmchconfig unmountOnDiskFail=yes labo_s
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
[root@labo_2_new]/root# mmlsconfig
Configuration data for cluster gpfsOracle.labo_2:
---------------------------------------------------
clusterName gpfsOracle.labo_2
clusterId 12399285214363632796
autoload yes
minReleaseLevel 3.3.0.2
dmapiFileHandleSize 32
unmountOnDiskFail no
[labo_s]
unmountOnDiskFail yes
[common]
maxMBpS 300
pagepool 256M
adminMode central
File systems in cluster gpfsOracle.labo_2:
--------------------------------------------
/dev/orafs1
/dev/orafs2
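Note: the transcript above passes the node name directly to mmchconfig. On current Spectrum Scale releases the node list is given with -N; a sketch of the equivalent command:
# mmchconfig unmountOnDiskFail=yes -N labo_s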
===== Remove a protocol (CES) node from a cluster =====
Example sequence: reassign its NSD server role to another node, remove it from performance monitoring and CES, remove its quorum and manager designations, then delete it from the cluster:
# mmchnsd "GPFS_NSD_M_B_0002:prscale-b-01"
# mmchnode --noperfmon -N prscale-b-02
# mmchnode --ces-disable -N prscale-b-02
# mmperfmon config update --collectors prscale-b-02
# mmchnode --nonquorum -N prscale-b-02
# mmchnode --nomanager -N prscale-b-02
# mmdelnode -N prscale-b-02
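After these steps, check that the node has disappeared from the CES configuration and from the cluster (assuming a Spectrum Scale release with CES enabled):
# mmlscluster --ces
# mmlscluster
# mmgetstate -a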