GPFS operations

For an Oracle cluster on AIX 5.3 (32-bit), the latest supported GPFS version is 3.3.
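
To check which GPFS level is actually installed on an AIX node, you can list the GPFS filesets (a quick sketch; the fileset names are the standard ones shipped with GPFS on AIX):

# list the installed GPFS filesets and their levels on AIX
lslpp -l "gpfs.*"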

A snapshot does not need as much space as the initial filesystem, but if the snapshot space fills up, the snapshot is invalidated (it dies).

Here we only describe the commands for a snapshot on a separate LV. Both LVs (the source and the snapshot) must be in the same volume group (a minimal sketch follows).
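
As an illustration of this constraint, here is a minimal sketch of an AIX JFS2 external snapshot on a separate LV. The names (/data filesystem in volume group datavg, snapshot LV snaplv, mount point /mnt/snap) are assumptions for the example, not objects from the cluster below:

# create a small LV for the snapshot in the SAME volume group as the source filesystem
mklv -y snaplv datavg 10
# take an external snapshot of /data onto that LV
snapshot -o snapfrom=/data /dev/snaplv
# the snapshot can then be mounted read-only, e.g. for backups
mount -v jfs2 -o snapshot /dev/snaplv /mnt/snap

If the snapshot LV fills up, the snapshot is marked invalid and has to be deleted and recreated.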

Remove a node from the GPFS cluster:

  • Remove the disks that belong to the server you want to remove (an optional free-space check is sketched after the listings below)
[root@labo_2_new]/root# mmlsnsd -m

 Disk name    NSD volume ID      Device         Node name                Remarks       
---------------------------------------------------------------------------------------
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_1                 server node
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_2                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_1                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_2                 server node
 diskk1       AC131C364EE5D6EC   /dev/descgpfs1lv labo_s                 server node
 diskk2       AC131C364EE600F6   /dev/descgpfs2lv labo_s                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_1                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_2                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_1                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_2                 server node

[root@labo_2_new]/root# mmlspv


[root@labo_2_new]/root# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       
diskk1       nsd         512       3 no       no    ready         up           system       

[root@labo_2_new]/root# mmdeldisk orafs1 diskk1
Deleting disks ...
Scanning system storage pool
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 09:52:41 2012
Scan completed successfully.
Checking Allocation Map for storage pool 'system'
tsdeldisk completed.
mmdeldisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.

[root@labo_2_new]/root# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       

[root@labo_2_new]/root# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       
diskk2       nsd         512       6 no       no    ready         up           system       
[root@labo_2_new]/root# mmdeldisk orafs2 diskk2
Deleting disks ...
Scanning system storage pool
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 09:55:30 2012
Scan completed successfully.
Checking Allocation Map for storage pool 'system'
tsdeldisk completed.
mmdeldisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       
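
Before deleting a disk you can optionally check how much data it still holds: mmdeldisk migrates any remaining data off the disk, so it helps to know that the other disks have enough free space. A quick check (using the filesystems from this example):

# show free/used space per disk and per failure group
mmdf orafs1
mmdf orafs2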
  • Remove the NSDs that belong to the server you want to remove
[root@labo_2_new]/root# mmlsnsd -m

 Disk name    NSD volume ID      Device         Node name                Remarks       
---------------------------------------------------------------------------------------
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_1                 server node
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_2                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_1                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_2                 server node
 diskk1       AC131C364EE5D6EC   /dev/descgpfs1lv labo_s                 server node
 diskk2       AC131C364EE600F6   /dev/descgpfs2lv labo_s                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_1                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_2                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_1                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_2                 server node


[root@labo_2_new]/root# mmdelnsd "diskk1;diskk2"
mmdelnsd: Processing disk diskk1
mmdelnsd: Processing disk diskk2
mmdelnsd: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlsnsd -m

 Disk name    NSD volume ID      Device         Node name                Remarks       
---------------------------------------------------------------------------------------
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_1                 server node
 diskh1       AC131C344EE5D6E8   /dev/hdisk2    labo_2                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_1                 server node
 diskh2       AC131C344EE600F2   /dev/hdisk4    labo_2                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_1                 server node
 diskr1       AC131C344EE5D6EA   /dev/hdisk3    labo_2                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_1                 server node
 diskr2       AC131C344EE600F4   /dev/hdisk5    labo_2                 server node
  • Now the server is still a member of the GPFS cluster, but it no longer owns any resources:
  • Stop GPFS on the member to be removed
[root@labo_s_new]/root# mmshutdown
Fri Jan  6 10:03:32 CET 2012: mmshutdown: Starting force unmount of GPFS file systems
Fri Jan  6 10:03:37 CET 2012: mmshutdown: Shutting down GPFS daemons
Shutting down!
'shutdown' command about to kill process 17816
Fri Jan  6 10:03:42 CET 2012: mmshutdown: Finished
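
Note that mmshutdown without options only stops GPFS on the node where it is run (here labo_s), which is what we want; stopping GPFS on every node of the cluster would require the -a flag instead:

# stop GPFS on ALL nodes (NOT wanted in this procedure)
mmshutdown -a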
  • Remove the member from the GPFS cluster
[root@labo_2_new]/root# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum
   3   labo_s                    10.10.10.54     labo_s                    quorum

[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          3       active      quorum node
       2      labo_2           2        2          3       active      quorum node
       3      labo_s           0        0          3       down        quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved

[root@labo_2_new]/root# mmdelnode -N labo_s
Verifying GPFS is stopped on all affected nodes ...
mmdelnode: Command successfully completed
mmdelnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum

[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          2       active      quorum node
       2      labo_2           2        2          2       active      quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            2
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     2
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved
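
One restriction to keep in mind: a node that is the primary or secondary cluster configuration server (here labo_2 and labo_1, see mmlscluster above) cannot be deleted directly. The role has to be moved to another node first with mmchcluster, for example (a sketch, not part of the transcript above):

# example: move the primary configuration server role to labo_1 before deleting labo_2
mmchcluster -p labo_1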

Add a node to the GPFS cluster:

  • Add the new node to the GPFS cluster: first add it as a nonquorum node, then change it to quorum (otherwise you would have to stop the whole cluster)
[root@labo_2_new]/root# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum

[root@labo_2_new]/root# mmaddnode -N labo_s:nonquorum
Fri Jan  6 12:37:14 CET 2012: mmaddnode: Processing node labo_s
mmaddnode: Command successfully completed
mmaddnode: Warning: Not all nodes have proper GPFS license designations.
    Use the mmchlicense command to designate licenses as needed.
mmaddnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster

===============================================================================
| Warning:                                                                    |
|   This cluster contains nodes that do not have a proper GPFS license        |
|   designation.  This violates the terms of the GPFS licensing agreement.    |
|   Use the mmchlicense command and assign the appropriate GPFS licenses      |
|   to each of the nodes in the cluster.  For more information about GPFS     |
|   license designation, see the Concepts, Planning, and Installation Guide.  |
===============================================================================


GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum
   3   labo_s                    10.10.10.54     labo_s                    

[root@labo_2_new]/root# mmchlicense server --accept -N labo_s

The following nodes will be designated as possessing GPFS server licenses:
        labo_s
mmchlicense: Command successfully completed
mmchlicense: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.

[root@labo_2_new]/root# mmchnode --quorum -N labo_s
Fri Jan  6 12:39:26 CET 2012: mmchnode: Processing node labo_s
mmchnode: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlscluster                      

GPFS cluster information
========================
  GPFS cluster name:         gpfsOracle.labo_2
  GPFS cluster id:           12399285214363632796
  GPFS UID domain:           gpfsOracle.labo_2
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    labo_2
  Secondary server:  labo_1

 Node  Daemon node name            IP address       Admin node name             Designation    
-----------------------------------------------------------------------------------------------
   1   labo_1                    10.10.10.52     labo_1                    quorum
   2   labo_2                    10.10.10.53     labo_2                    quorum
   3   labo_s                    10.10.10.54     labo_s                    quorum
[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          3       active      quorum node
       2      labo_2           2        2          3       active      quorum node
       3      labo_s           0        0          3       down        quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved
  • Start GPFS on the new node:
[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        2          3       active      quorum node
       2      labo_2           2        2          3       active      quorum node
       3      labo_s           0        0          3       down        quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       2
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      2
Quorum = 2, Quorum achieved

[root@labo_s_new]/root# mmstartup
Fri Jan  6 12:40:45 CET 2012: mmstartup: Starting GPFS ...

[root@labo_2_new]/root# mmgetstate -aLs

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks    
------------------------------------------------------------------------------------
       1      labo_1           2        3          3       active      quorum node
       2      labo_2           2        3          3       active      quorum node
       3      labo_s           2        3          3       active      quorum node

 Summary information 
---------------------
Number of nodes defined in the cluster:            3
Number of local nodes active in the cluster:       3
Number of remote nodes joined in this cluster:     0
Number of quorum nodes defined in the cluster:     3
Number of quorum nodes active in the cluster:      3
Quorum = 2, Quorum achieved
  • Create the NSD description files, then create the NSDs
[root@labo_2_new]/root# cat gpfsk_disk1
/dev/descgpfs1lv:labo_s::descOnly:3:diskk1
[root@labo_2_new]/root# mmcrnsd -F gpfsk_disk1    
[root@labo_2_new]/root# cat gpfsk_disk2
/dev/descgpfs2lv:labo_s::descOnly:6:diskk2
[root@labo_2_new]/root# mmcrnsd -F gpfsk_disk2
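
For reference, these descriptor lines follow the GPFS 3.3 disk descriptor format DiskName:ServerList::DiskUsage:FailureGroup:DesiredName:StoragePool (the storage pool defaults to system when omitted). As a hedged example, a descriptor for a normal data-and-metadata disk served by both labo_1 and labo_2 could look like this (hdisk6, diskh3 and the gpfs_disk3 file are made-up names, not part of this cluster):

# hypothetical descriptor for a dataAndMetadata NSD served by labo_1 and labo_2
/dev/hdisk6:labo_1,labo_2::dataAndMetadata:1:diskh3
# mmcrnsd rewrites the descriptor file in place so it can be reused by mmcrfs/mmadddisk
mmcrnsd -F gpfs_disk3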

[root@labo_2_new]/root# mmlsnsd

 File system   Disk name    NSD servers                                    
---------------------------------------------------------------------------
 orafs1       diskh1       labo_1,labo_2        
 orafs1       diskr1       labo_1,labo_2        
 orafs2       diskh2       labo_1,labo_2        
 orafs2       diskr2       labo_1,labo_2        
 (free disk)   diskk1       labo_s                 
 (free disk)   diskk2       labo_s       
  • Add the new disks to the filesystems and restripe them (-r); an alternative to the -r option is sketched after the listings below
[root@labo_1_new]/kondor# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       

[root@labo_2_new]/root# mmadddisk orafs1 -F gpfsk_disk1 -r

The following disks of orafs1 will be formatted on node labo_2:
    diskk1: size 163840 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
Completed adding disks to file system orafs1.
mmadddisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
Restriping orafs1 ...
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 14:19:56 2012
Scan completed successfully.
Done

[root@labo_1_new]/kondor# mmlsdisk orafs1
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh1       nsd         512       1 yes      yes   ready         up           system       
diskr1       nsd         512       2 yes      yes   ready         up           system       
diskk1       nsd         512       3 no       no    ready         up           system     

[root@labo_1_new]/kondor# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       

[root@labo_2_new]/root# mmadddisk orafs2 -F gpfsk_disk2 -r

The following disks of orafs2 will be formatted on node labo_1:
    diskk2: size 163840 KB
Extending Allocation Map
Checking Allocation Map for storage pool 'system'
Completed adding disks to file system orafs2.
mmadddisk: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
Restriping orafs2 ...
Scanning file system metadata, phase 1 ... 
Scan completed successfully.
Scanning file system metadata, phase 2 ... 
Scan completed successfully.
Scanning file system metadata, phase 3 ... 
Scan completed successfully.
Scanning file system metadata, phase 4 ... 
Scan completed successfully.
Scanning user file metadata ...
 100.00 % complete on Fri Jan  6 14:21:17 2012
Scan completed successfully.
Done

[root@labo_1_new]/kondor# mmlsdisk orafs2
disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ ------------
diskh2       nsd         512       4 yes      yes   ready         up           system       
diskr2       nsd         512       5 yes      yes   ready         up           system       
diskk2       nsd         512       6 no       no    ready         up           system       
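
The -r option restripes the filesystem immediately as part of mmadddisk. On a busy system you can also skip -r and rebalance later with mmrestripefs (a sketch):

# rebalance the filesystems across all their disks at a quieter moment
mmrestripefs orafs1 -b
mmrestripefs orafs2 -b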
  • Change the unmountOnDiskFail parameter for the new node, so that a disk failure on this node only forces it to unmount the filesystem locally instead of marking the disk down for the whole cluster:
[root@labo_2_new]/root# mmlsconfig
Configuration data for cluster gpfsOracle.labo_2:
---------------------------------------------------
clusterName gpfsOracle.labo_2
clusterId 12399285214363632796
autoload yes
minReleaseLevel 3.3.0.2
dmapiFileHandleSize 32
unmountOnDiskFail no
maxMBpS 300
pagepool 256M
adminMode central

File systems in cluster gpfsOracle.labo_2:
--------------------------------------------
/dev/orafs1
/dev/orafs2
[root@labo_2_new]/root# mmchconfig unmountOnDiskFail=yes labo_s     
mmchconfig: Command successfully completed
mmchconfig: Propagating the cluster configuration data to all
  affected nodes.  This is an asynchronous process.
[root@labo_2_new]/root# mmlsconfig
Configuration data for cluster gpfsOracle.labo_2:
---------------------------------------------------
clusterName gpfsOracle.labo_2
clusterId 12399285214363632796
autoload yes
minReleaseLevel 3.3.0.2
dmapiFileHandleSize 32
unmountOnDiskFail no
[labo_s]
unmountOnDiskFail yes
[common]
maxMBpS 300
pagepool 256M
adminMode central

File systems in cluster gpfsOracle.labo_2:
--------------------------------------------
/dev/orafs1
/dev/orafs2
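
Finally, you can verify that the GPFS filesystems are mounted on every node, including the re-added one (a quick check, not shown in the transcript above):

# show which nodes have each GPFS filesystem mounted
mmlsmount all -L
# mount the filesystems on all nodes if needed
mmmount all -a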