GPFS operations

Add a new disk to a filesystem or create a new filesystem

Create NSD (Network Shared Disk) disks (format them in GPFS format)

Before adding a disk to a GPFS filesystem, you have to specify some parameters (an example stanza follows this list):

  • name : logical NSD name, typically referring to the filesystem it will belong to. The name can contain only the characters 'A' through 'Z', 'a' through 'z', '0' through '9', and '_' (underscore); all other characters are invalid.
  • failureGroup : important setting. Use at most 3 different IDs per filesystem; each failure group corresponds to one copy of the data. If no copy is required, use a single failure group. For example, use 2 for the first copy and 3 for the second copy. mmlsdisk shows the failure group IDs currently in use.
  • usage : always use dataAndMetadata, except for tuning, where you could place metadata on high-speed disks (metadataOnly) and data on the others (dataOnly).
  • pool : always use system (except for advanced users).
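
These parameters map directly into the NSD stanza file used later by mmcrnsd; a minimal sketch with a placeholder device and a hypothetical NSD name:

%nsd:
device=/dev/dm-XX
nsd=GPFS_NSD_EXAMPLE_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system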

Identify the SAN disks

Without multipathing (for example VMware RDM), you can use:

[root@gpfs01 ~]# lsscsi -s
[0:2:0:0]    disk    IBM      ServeRAID M1115  2.13  /dev/sda    298GB
[0:2:1:0]    disk    IBM      ServeRAID M1115  2.13  /dev/sdb    198GB
[1:0:0:0]    cd/dvd  IBM SATA DEVICE 81Y3676   IBD1  /dev/sr0        -
[7:0:0:0]    disk    IBM      2145             0000  /dev/sdii   274GB
[7:0:0:1]    disk    IBM      2145             0000  /dev/sdc   21.4GB
[7:0:0:2]    disk    IBM      2145             0000  /dev/sdd   1.09TB
...

List the UUID/WWN serials and the devices they correspond to:

[root@gpfs01 ~]# ll /dev/disk/by-id/
...
0 lrwxrwxrwx 1 root root   10 Sep 30 01:30 dm-uuid-mpath-36005076300810163a00000000000006a -> ../../dm-4
0 lrwxrwxrwx 1 root root   10 Sep 30 01:30 dm-uuid-mpath-36005076300810163a00000000000006b -> ../../dm-5
...
0 lrwxrwxrwx 1 root root   10 Sep 30 15:05 wwn-0x60050764008181c46800000000000058 -> ../../sdml
0 lrwxrwxrwx 1 root root   10 Sep 30 15:05 wwn-0x60050764008181c46800000000000059 -> ../../sdmm
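
To map a known WWN/UID back to its device, you can simply grep for it (the serial below is taken from the listing above):

[root@gpfs01 ~]# ls -l /dev/disk/by-id/ | grep -i 60050764008181c46800000000000058
[root@gpfs01 ~]# multipath -ll | grep -i 60050764008181c46800000000000058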

For multipathing devices, use:

[root@gpfs01 ~]# multipath -ll | egrep "mpath|size" | paste -d " "  - -
mpathcu (360050764008181c46800000000000042) dm-126 IBM     ,2145             size=256G features='1 queue_if_no_path' hwhandler='0' wp=rw
mpathbp (360050764008181c46800000000000030) dm-23 IBM     ,2145             size=1.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
...

After a rescan, identify your disk and the corresponding device name; use the dm-XX name:

[root@gpfs01 scripts]# rescan-scsi-bus.sh -a
[root@gpfs01 ~]# multipath -ll | egrep "mpath|size" | paste -d " "  - -
...
mpathbd (360050764008181c46800000000000023) dm-47 IBM     ,2145                   size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw

POOL: system (default)

List failure group

Check the failure groups: here, A is 2 and B is 3.

[root@gpfs01 ~]# mmlsdisk gpfs01
disk         driver   sector     failure holds    holds                            storage
name         type       size       group metadata data  status        availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
GPFS_NSD_DATA_B_08 nsd         512           3 Yes      Yes   ready         up           system

...
GPFS_NSD_DATA_A_13 nsd         512           2 Yes      Yes   ready         up           system

Identify NSDs in use and free disks (new disks)

NSDs in use

[root@gpfs01 ~]# mmlsnsd -X | grep gpfs01-hb |  awk '{print $3}' |  sort
/dev/dm-10
/dev/dm-11
/dev/dm-12
/dev/dm-13
...

List all disks

[root@gpfs01 ~]# multipath -ll | egrep "mpath|size" | paste -d " "  - - | tr ' ' '\n' | grep 'dm-' | sed 's/^/\/dev\//' | sort
/dev/dm-50
/dev/dm-51
...

Difference (disks not yet used as NSDs)

multipath -ll | egrep "mpath|size" | paste -d " "  - - | tr ' ' '\n' | grep 'dm-' | sed 's/^/\/dev\//' | sort > /tmp/disk_all.txt
mmlsnsd -X | grep gpfs01-hb |  awk '{print $3}' |  sort > /tmp/disk_nsd.txt
sdiff -sw100 /tmp/disk_all.txt /tmp/disk_nsd.txt
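
An alternative to sdiff, using the same two sorted files, is comm, which prints only the candidate devices that are not yet NSDs:

[root@gpfs01 ~]# comm -23 /tmp/disk_all.txt /tmp/disk_nsd.txt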

Build NSD file

Create a text (stanza) file containing the list of NSD disks to add and their characteristics.

[root@gpfs01 scripts]# cat list.disks_CESSHARE.txt
%nsd:
device=/dev/dm-47
nsd=GPFS_NSD_CESSHARE_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system
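
If several disks have to be created in one pass, the same file can simply contain several %nsd stanzas; a sketch with a hypothetical second disk for the other failure group:

%nsd:
device=/dev/dm-47
nsd=GPFS_NSD_CESSHARE_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system

%nsd:
device=/dev/dm-48
nsd=GPFS_NSD_CESSHARE_B_01
servers=gpfs02-hb,gpfs01-hb
usage=dataAndMetadata
failureGroup=3
pool=system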

Create the NSDs (network shared disks); the -v yes option also verifies that the disks are not already in use by GPFS:

[root@gpfs01 ~]# mmcrnsd -F list.disks_CESSHARE.txt -v yes
mmcrnsd: Processing disk dm-8
mmcrnsd: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.

Now the disk is formatted, but still free and not attached to any filesystem:

[root@gpfs01 ~]# fdisk -l /dev/dm-47
...
Disk /dev/dm-47: 21.5 GB, 21474836480 bytes, 41943040 sectors
...
Disk label type: gpt
Disk identifier: 7A94FA63-6A6C-4001-89E8-E36D00B3F66E

#         Start          End    Size  Type            Name
 1           48     41942991     20G  IBM General Par GPFS:
[root@gpfs01 ~]# mmlsnsd -L

 File system   Disk name    NSD volume ID      NSD servers
---------------------------------------------------------------------------------------------
 cesshared01lv GPFS_NSD_CESSHARE01 0A0113A15B0BFD87   gpfs01-hb,gpfs02-hb
...
 (free disk)   GPFS_NSD_CESSHARE_A_01 0A0113A15E2EB417   gpfs01-hb,gpfs02-hb

[root@gpfs01 ~]# mmlsnsd -X

 Disk name    NSD volume ID      Device         Devtype  Node name                Remarks
---------------------------------------------------------------------------------------------------
 GPFS_NSD_CESSHARE01 0A0113A15B0BFD87   /dev/dm-9      dmm      gpfs01-hb                server node
 GPFS_NSD_CESSHARE01 0A0113A15B0BFD87   /dev/dm-2      dmm      gpfs02-hb                server node
 GPFS_NSD_CESSHARE_A_01 0A0113A15E2EB417   /dev/dm-47     dmm      gpfs01-hb                server node
 GPFS_NSD_CESSHARE_A_01 0A0113A15E2EB417   /dev/dm-47     dmm      gpfs02-hb                server node
.....

Add the NSD disk to a filesystem

Add an NSD to an existing filesystem

First, list the unused (free) disks:

[root@gpfs01 ~]# mmlsnsd -F
 File system   Disk name    NSD servers
---------------------------------------------------------------------------
 (free disk)   GPFS_NSD_CESSHARE_A_01 gpfs01-hb,gpfs02-hb

Create a stanza file, as for the NSD creation:

[root@gpfs01 scripts]# cat list.disks_CESSHARE.txt
%nsd:
device=/dev/dm-47
nsd=GPFS_NSD_CESSHARE_A_01
servers=gpfs01-hb,gpfs02-hb
usage=dataAndMetadata
failureGroup=2
pool=system

Now add your disk to the filesystem and rebalance the blocks (if two copies of the data are required, a second copy will be created):

[root@gpfs01 ~]# mmadddisk /dev/cesshared01lv -F list.disks_CESSHARE.txt -r

The following disks of cesshared01lv will be formatted on node gpfs02:
    GPFS_NSD_CESSHARE_A_01: size 20480 MB
Extending Allocation Map
Checking Allocation Map for storage pool system
Completed adding disks to file system cesshared01lv.
mmadddisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
Restriping /dev/cesshared01lv ...
Scanning file system metadata, phase 1 ...
 100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning file system metadata, phase 2 ...
 100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning file system metadata, phase 3 ...
 100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning file system metadata, phase 4 ...
 100 % complete on Mon Jan 27 11:28:13 2020
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Mon Jan 27 11:28:13 2020  (     65792 inodes with total        404 MB data processed)
Scan completed successfully.
Done

Check the number of copies of a file -> only 1 copy of data and metadata! We will add a second copy using the mmchfs command, then restripe to copy the data and metadata onto the disks of the second failure group, and finally run a second restripe to optimize data placement:

[root@gpfs01 connections]# mmlsattr /CESshared/ha/nfs/ganesha/gpfs-epoch
  replication factors
metadata(max) data(max) file    [flags]
------------- --------- ---------------
      1 (  2)   1 (  2) /CESshared/ha/nfs/ganesha/gpfs-epoch
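
Optionally, you can also display the filesystem-wide replication defaults and maxima before and after the change (mmlsfs -m/-M for metadata, -r/-R for data replicas):

[root@gpfs01 connections]# mmlsfs cesshared01lv -m -M -r -R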

[root@gpfs01 connections]# mmchfs cesshared01lv -m 2 -r 2

[root@gpfs01 connections]# mmrestripefs  cesshared01lv -R
Scanning file system metadata, phase 1 ...
 100 % complete on Mon Jan 27 12:50:45 2020
Scan completed successfully.
...
 100.00 % complete on Mon Jan 27 12:50:46 2020  (     65792 inodes with total        808 MB data processed)
Scan completed successfully.
[root@gpfs01 connections]# mmlsattr /CESshared/ha/nfs/ganesha/gpfs-epoch
  replication factors
metadata(max) data(max) file    [flags]
------------- --------- ---------------
      2 (  2)   2 (  2) /CESshared/ha/nfs/ganesha/gpfs-epoch    [unbalanced]
[root@gpfs01 connections]#

Optimize data placement

[root@gpfs01 connections]# mmrestripefs  cesshared01lv -b
Scanning file system metadata, phase 1 ...
 100 % complete on Mon Jan 27 12:51:56 2020
Scan completed successfully.
...
 100.00 % complete on Mon Jan 27 12:51:57 2020  (     65792 inodes with total        808 MB data processed)
Scan completed successfully.
[root@gpfs01 connections]# mmlsattr /CESshared/ha/nfs/ganesha/gpfs-epoch
  replication factors
metadata(max) data(max) file    [flags]
------------- --------- ---------------
      2 (  2)   2 (  2) /CESshared/ha/nfs/ganesha/gpfs-epoch
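
On large filesystems a rebalance can be I/O intensive; mmrestripefs accepts a -N option to limit the work to selected nodes. A sketch, assuming you want only the two NSD servers to participate:

[root@gpfs01 connections]# mmrestripefs cesshared01lv -b -N gpfs01-hb,gpfs02-hb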

Add NSD to a new filesystem

This is an example of creating a filesystem on the previously defined NSD disk: block size 512K, 2 copies of data and metadata, quotas enabled:

[root@gpfs01 connections]# mmcrfs cesshared01lv -F list.disks_CESSHARE.txt -B 512K -m 2 -r 2 -Q yes -T /CESshared -v yes -D nfs4 -k nfs4 -A yes
[root@gpfs01 connections]# mmmount cesshared01lv
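
To verify the new filesystem and its capacity after mounting, a quick check (same filesystem name as above):

[root@gpfs01 connections]# mmlsfs cesshared01lv
[root@gpfs01 connections]# mmdf cesshared01lv
[root@gpfs01 connections]# df -h /CESshared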

Remove a disk

To delete GPFS_NSD_DATA01 from file system gpfs01 and rebalance the files across the remaining disks, issue this command:

[root@gpfs01 ~]# mmlsdisk gpfs01
disk         driver   sector     failure holds    holds                            storage
name         type       size       group metadata data  status        availability pool
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------------
GPFS_NSD_DATA01 nsd         512           2 Yes      Yes   ready         up           system
GPFS_NSD_DATA02 nsd         512           2 Yes      Yes   ready         up           system
GPFS_NSD_DATA03 nsd         512           2 Yes      Yes   ready         up           system
GPFS_NSD_DATA04 nsd         512           2 Yes      Yes   ready         up           system
GPFS_NSD_DATA05 nsd         512           2 Yes      Yes   ready         up           system
GPFS_NSD_DATA06 nsd         512           2 Yes      Yes   ready         up           system
GPFS_NSD_DATA07 nsd         512           2 Yes      Yes   ready         up           system

[root@gpfs01 ~]# mmdeldisk gpfs01 GPFS_NSD_DATA01
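
On a large filesystem, one commonly described alternative is to suspend the disk first, migrate the data off it under your control, and only then delete it; a sketch using the same disk and filesystem names as above:

[root@gpfs01 ~]# mmchdisk gpfs01 suspend -d GPFS_NSD_DATA01    # no new data is placed on the disk
[root@gpfs01 ~]# mmrestripefs gpfs01 -m                        # migrate data off the suspended disk
[root@gpfs01 ~]# mmdeldisk gpfs01 GPFS_NSD_DATA01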

Now you can delete NSD GPFS_NSD_DATA01 from the GPFS cluster. First check that the disk is free, then issue this command:

[root@gpfs01 scripts]# mmlsnsd -F

 File system   Disk name    NSD servers
---------------------------------------------------------------------------
 (free disk)   GPFS_NSD_DATA01 gpfs01-hb,gpfs02-hb

[root@gpfs01 ~]# mmdelnsd GPFS_NSD_DATA01

Remove a node from the GPFS cluster:

  • Remove the disks that belong to the server you want to remove:
[root@labo_2_new]/root# mmlsnsd -m

 Disk name    NSD volume ID      Device         Node name                Remarks       
---------------------------------------------------------------------------------------
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_1                 server node
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_2                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_1                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_2                 server node
 diskk1       AC131C364EE5D6EC   /dev/descgpfs1lv labo_s                 server node
 diskk2       AC131C364EE600F6   /dev/descgpfs2lv labo_s                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_1                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_2                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_1                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_2                 server node

[root@labo_2_new]/root# mmlspv


[root@labo_2_new]/root# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       
diskk1       nsd         512       3 no       no    ready         up           system       

[root@labo_2_new]/root# mmdeldisk orafs1 diskk1
Deleting disks ...
Scanning system storage pool
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 09:52:41 2012
Scan completed successfully.
Checking Allocation Map for storage pool 'system'
tsdeldisk completed.
mmdeldisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.

[root@labo_2_new]/root# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       

[root@labo_2_new]/root# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       
diskk2       nsd         512       6 no       no    ready         up           system       
[root@labo_2_new]/root# mmdeldisk orafs2 diskk2
Deleting disks ...
Scanning system storage pool
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 09:55:30 2012
Scan completed successfully.
Checking Allocation Map for storage pool 'system'
tsdeldisk completed.
mmdeldisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       
  • Remove the NSDs that belong to the server you want to remove:
[root@labo_2_new]/root# mmlsnsd -m

 Disk name    NSD volume ID      Device         Node name                Remarks       
---------------------------------------------------------------------------------------
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_1                 server node
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_2                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_1                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_2                 server node
 diskk1       AC131C364EE5D6EC   /dev/descgpfs1lv labo_s                 server node
 diskk2       AC131C364EE600F6   /dev/descgpfs2lv labo_s                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_1                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_2                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_1                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_2                 server node


[root@labo_2_new]/root# mmdelnsd "diskk1;diskk2"
mmdelnsd: Processing disk diskk1
mmdelnsd: Processing disk diskk2
mmdelnsd: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlsnsd -m

 Disk name    NSD volume ID      Device         Node name                Remarks       
---------------------------------------------------------------------------------------
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_1                 server node
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_2                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_1                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_2                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_1                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_2                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_1                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_2                 server node
  • Now the server is still a member of the GPFS cluster, but without any resources.
  • Stop GPFS on the member to be removed:
[root@labo_s_new]/root# mmshutdown
Fri Jan  6 10:03:32 CET 2012: mmshutdown: Starting force unmount of GPFS file systems
Fri Jan  6 10:03:37 CET 2012: mmshutdown: Shutting down GPFS daemons
Shutting down!
'shutdown' command about to kill process 17816
Fri Jan  6 10:03:42 CET 2012: mmshutdown: Finished
  • Remove the member from GPFS
[root@labo_2_new]/root# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum
   3   labo_s                    10.10.10.54     labo_s                    quorum

[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          3       active      quorum node
       2      labo_2           2        2          3       active      quorum node
       3      labo_s           0        0          3       down        quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved

[root@labo_2_new]/root# mmdelnode -N labo_s
Verifying GPFS is stopped on all affected nodes ...
mmdelnode: Command successfully completed
mmdelnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum

[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          2       active      quorum node
       2      labo_2           2        2          2       active      quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            2
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     2
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved

Add a node to GPFS cluster:

  • Add a new node to the GPFS cluster: first add the node as nonquorum, then change it to quorum (otherwise you would need to stop the whole cluster):
[root@labo_2_new]/root# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum

[root@labo_2_new]/root# mmaddnode -N labo_s:nonquorum
Fri Jan  6 12:37:14 CET 2012: mmaddnode: Processing node labo_s
mmaddnode: Command successfully completed
mmaddnode: Warning: Not all nodes have proper GPFS license designations.
    Use the mmchlicense command to designate licenses as needed.
mmaddnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster

===============================================================================
| Warning:                                                                    |
|   This cluster contains nodes that do not have a proper GPFS license        |
|   designation.  This violates the terms of the GPFS licensing agreement.    |
|   Use the mmchlicense command and assign the appropriate GPFS licenses      |
|   to each of the nodes in the cluster.  For more information about GPFS     |
|   license designation, see the Concepts, Planning, and Installation Guide.  |
===============================================================================


GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum
   3   labo_s                    10.10.10.54     labo_s                    

[root@labo_2_new]/root# mmchlicense server --accept -N labo_s

The following nodes will be designated as possessing GPFS server licenses:
        labo_s
mmchlicense: Command successfully completed
mmchlicense: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.

[root@labo_2_new]/root# mmchnode --quorum -N labo_s
Fri Jan  6 12:39:26 CET 2012: mmchnode: Processing node labo_s
mmchnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster                      

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum
   3   labo_s                    10.10.10.54     labo_s                    quorum
[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          3       active      quorum node
       2      labo_2           2        2          3       active      quorum node
       3      labo_s           0        0          3       down        quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved
  • Start GPFS on the new node:
[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          3       active      quorum node
       2      labo_2           2        2          3       active      quorum node
       3      labo_s           0        0          3       down        quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved

[root@labo_s_new]/root# mmstartup
Fri Jan  6 12:40:45 CET 2012: mmstartup: Starting GPFS ...

[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        3          3       active      quorum node
       2      labo_2           2        3          3       active      quorum node
       3      labo_s           2        3          3       active      quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       3
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      3
Quorum = 2, Quorum achieved
  • Create the NSD description files and create the NSDs (a stanza-format equivalent is shown after this listing):
[root@labo_2_new]/root# cat gpfsk_disk1
/dev/descgpfs1lv:labo_s::descOnly:3:diskk1
[root@labo_2_new]/root# mmcrnsd -F gpfsk_disk1    
[root@labo_2_new]/root# cat gpfsk_disk2
/dev/descgpfs2lv:labo_s::descOnly:6:diskk2
[root@labo_2_new]/root# mmcrnsd -F gpfsk_disk2

[root@labo_2_new]/root# mmlsnsd

 File system   Disk name    NSD servers                                    
---------------------------------------------------------------------------
 orafs1       diskh1       labo_1,labo_2        
 orafs1       diskr1       labo_1,labo_2        
 orafs2       diskh2       labo_1,labo_2        
 orafs2       diskr2       labo_1,labo_2        
 (free disk)   diskk1       labo_s                 
 (free disk)   diskk2       labo_s       
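
The descriptor files above use the older colon-separated format (device:server::usage:failureGroup:nsdName). On recent GPFS/Spectrum Scale releases the same disk would be described with a stanza file; a sketch for diskk1:

%nsd:
device=/dev/descgpfs1lv
nsd=diskk1
servers=labo_s
usage=descOnly
failureGroup=3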
  • Add the new disks into the filesystem, and restripe (-r)
[root@labo_1_new]/kondor# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       

[root@labo_2_new]/root# mmadddisk orafs1 -F gpfsk_disk1 -r

The following disks of orafs1 will be formatted on node labo_2:
    diskk1: size 163840 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
Completed adding disks to file system orafs1.
mmadddisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
Restriping orafs1 ...
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 14:19:56 2012
Scan completed successfully.
Done

[root@labo_1_new]/kondor# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       
diskk1       nsd         512       3 no       no    ready         up           system     

[root@labo_1_new]/kondor# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       

[root@labo_2_new]/root# mmadddisk orafs2 -F gpfsk_disk2 -r

The following disks of orafs2 will be formatted on node labo_1:
    diskk2: size 163840 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
Completed adding disks to file system orafs2.
mmadddisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
Restriping orafs2 ...
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 14:21:17 2012
Scan completed successfully.
Done

[root@labo_1_new]/kondor# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       
diskk2       nsd         512       6 no       no    ready         up           system       
  • Change the unmountOnDiskFail parameter for the new node so that a failure of its local (descOnly) disk only force-unmounts the filesystem on that node, instead of the disk being marked down for the whole cluster:
[root@labo_2_new]/root# mmlsconfig
Configuration data for cluster gpfsOracle.labo_2:
---------------------------------------------------
clusterName gpfsOracle.labo_2
clusterId 12399285214363632796
autoload yes
minReleaseLevel 3.3.0.2
dmapiFileHandleSize 32
unmountOnDiskFail no
maxMBpS 300
pagepool 256M
adminMode central

File systems in cluster gpfsOracle.labo_2:
--------------------------------------------
/dev/orafs1
/dev/orafs2
[root@labo_2_new]/root# mmchconfig unmountOnDiskFail=yes labo_s     
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlsconfig
Configuration data for cluster gpfsOracle.labo_2:
---------------------------------------------------
clusterName gpfsOracle.labo_2
clusterId 12399285214363632796
autoload yes
minReleaseLevel 3.3.0.2
dmapiFileHandleSize 32
unmountOnDiskFail no
[labo_s]
unmountOnDiskFail yes
[common]
maxMBpS 300
pagepool 256M
adminMode central

File systems in cluster gpfsOracle.labo_2:
--------------------------------------------
/dev/orafs1
/dev/orafs2

Remove a node from a cluster

Example with a protocol (CES) node: move its NSD server role to another node, disable performance monitoring and CES on it, update the performance monitoring collectors, drop its quorum and manager designations, and finally delete the node:
# mmchnsd "GPFS_NSD_M_B_0002:prscale-b-01"
# mmchnode --noperfmon -N prscale-b-02
# mmchnode --ces-disable -N prscale-b-02
# mmperfmon config update --collectors prscale-b-02
# mmchnode --nonquorum -N prscale-b-02
# mmchnode --nomanager -N prscale-b-02
# mmdelnode -N prscale-b-02
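
Afterwards, the remaining membership and cluster state can be checked as before:

# mmlscluster
# mmgetstate -aLs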