Creation of a file system, fileset, or path for the CES shared root, and creation of an object fileset

The installation toolkit uses a shared root storage area to install the protocols on each node. This storage is also used by the NFS and object protocols to maintain system data associated with the cluster integration. It can be a subdirectory in an existing GPFS file system or a file system of its own. Once this option is set, changing it requires a restart of GPFS.
1. Create a file system or fileset for the shared root. **Size must be at least 4 GB.**
2. Use the following command (the cluster must use the CCR configuration repository; the mmchcluster --ccr-enable step in the transcript below converts a server-based repository):
mmchconfig cesSharedRoot=path_to_the_filesystem/fileset_created_in_step_1
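As noted above, changing cesSharedRoot requires a restart of GPFS, so plan a maintenance window. A minimal sketch of restarting the daemon cluster-wide (assumes all nodes can be stopped at once):

mmshutdown -a   # stop GPFS on every node
mmstartup -a    # start GPFS on every node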
For Object, the installation toolkit creates an independent fileset in the GPFS file system that you name.
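If you prefer to prepare such a fileset by hand rather than let the toolkit create it, a minimal sketch (the device gpfs01lv matches the file system used below; the fileset name object_fileset and junction path are assumptions):

mmcrfileset gpfs01lv object_fileset --inode-space new            # independent fileset with its own inode space
mmlinkfileset gpfs01lv object_fileset -J /gpfs01/object_fileset  # link it into the file system namespace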
[root@gpfs01 ~]# mkdir /gpfs01/.cesSharedRoot
[root@gpfs01 ~]# ls -lsa /gpfs01
4 drwxr-xr-x 2 root root 4096 12 juin  14:56 .cesSharedRoot
1 dr-xr-xr-x 2 root root 8192  1 janv.  1970 .snapshots
[root@gpfs01 ~]# mmchconfig cesSharedRoot=/gpfs01/.cesSharedRoot
[root@gpfs01 ~]# mmlsconfig
Configuration data for cluster gpfs01.cluster:
----------------------------------------------
clusterName gpfs01.cluster
clusterId 17066707964194168573
autoload no
uidDomain GPFS
dmapiFileHandleSize 32
minReleaseLevel 5.0.0.0
tiebreakerDisks GPFS_NSD_DATA01
cesSharedRoot /gpfs01/.cesSharedRoot
adminMode central

File systems in cluster gpfs01.cluster:
---------------------------------------
/dev/gpfs01lv
[root@gpfs01 ~]# mmlscluster
GPFS cluster information
========================
  GPFS cluster name:         gpfs01.cluster
  GPFS cluster id:           17066707964194168573
  GPFS UID domain:           GPFS
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           server-based

GPFS cluster configuration servers:
-----------------------------------
  Primary server:    gpfs01
  Secondary server:  (none)

 Node  Daemon node name  IP address    Admin node name  Designation
-------------------------------------------------------------------
   1   gpfs01            10.10.105.10  gpfs01           quorum-manager

[root@gpfs01 ~]# mmchcluster --ccr-enable
[root@gpfs01 ~]# mmlscluster | grep Repo
  Repository type:           CCR
[root@gpfs01 ~]# yum -y install gpfs.smb nfs-utils nfs-ganesha-gpfs nfs-ganesha
[root@gpfs01 ~]# systemctl mask nfs-server.service
Created symlink from /etc/systemd/system/nfs-server.service to /dev/null.
[root@gpfs01 ~]# systemctl stop nfs
Enable CES on the nodes
[root@gpfs01 ~]# mmchnode --ces-enable -N gpfs01,gpfs02
Fri Sep 30 17:12:30 CEST 2016: mmchnode: Processing node gpfs01
Fri Sep 30 17:12:50 CEST 2016: mmchnode: Processing node gpfs02
mmchnode: Propagating the cluster configuration data to all affected nodes.
  This is an asynchronous process.
[root@gpfs01 ~]# mmlscluster
GPFS cluster information
========================
  GPFS cluster name:         gpfs_test.rhlabh1
  GPFS cluster id:           9668046452208786064
  GPFS UID domain:           gpfs_test.rhlabh1
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR

 Node  Daemon node name  IP address    Admin node name  Designation
---------------------------------------------------------------------
   1   gpfs01            10.10.10.103  gpfs01           quorum-manager-perfmon
   2   gpfs02            10.10.10.104  gpfs02           quorum-manager-perfmon
[root@gpfs01 ~]# mmces service enable NFS
[root@gpfs01 ~]# mmces service enable SMB
[root@gpfs01 ~]# mmlscluster --ces
GPFS cluster information
========================
  GPFS cluster name:         gpfs_test.rhlabh1
  GPFS cluster id:           9668046452208786064

Cluster Export Services global parameters
-----------------------------------------
  Shared root directory:          /gpfs1
  Enabled Services:               NFS SMB
  Log level:                      0
  Address distribution policy:    even-coverage

 Node  Daemon node name  IP address    CES IP address list
-----------------------------------------------------------------------
   1   gpfs01            10.10.10.103  None
   2   gpfs02            10.10.10.104  None

[root@gpfs01 ~]# mmces service list --all
Enabled services: NFS SMB
gpfs01:  NFS is running, SMB is running
gpfs02:  NFS is running, SMB is running
mmces service start SMB -a
mmces service start NFS -a
After you start the protocol services, verify that they are running by issuing the mmces state show command:
[root@gpfs01 ~]# mmces state show -a
NODE     AUTH      BLOCK     NETWORK   AUTH_OBJ  NFS       OBJ       SMB       CES
gpfs01   DISABLED  DISABLED  HEALTHY   DISABLED  HEALTHY   DISABLED  HEALTHY   HEALTHY
gpfs02   DISABLED  DISABLED  HEALTHY   DISABLED  HEALTHY   DISABLED  HEALTHY   HEALTHY
Add IP addresses for cluster NFS and CIFS
[root@gpfs01 ~]# mmces address add --ces-ip gpfs01-nfs
[root@gpfs01 ~]# mmces address add --ces-ip gpfs02-cifs
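To bind an address to a particular node at add time, the --ces-node option can be used; a sketch (the address 10.10.10.201 is a placeholder):

mmces address add --ces-node gpfs01 --ces-ip 10.10.10.201   # address starts out hosted by gpfs01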
Choose the address distribution policy
# mmces address policy [even-coverage | balanced-load | node-affinity | none]
even-coverage
  Distributes the addresses among the available nodes. The even-coverage policy is the default address distribution policy.

balanced-load
  Distributes the addresses to approach an optimized load distribution. The load (network and CPU) on all nodes is monitored, and addresses are moved according to the policy to balance the load throughout the cluster.

node-affinity
  Attempts to keep an address on the node to which it was manually assigned. If the mmces address add command is used with the --ces-node option, the address is marked as being associated with that node. Similarly, if an address is moved with the mmces address move command, the address is marked as being associated with the destination node. Any automatic movement, such as reassigning a down node's addresses, does not change this association. Addresses that are enabled with no node specification do not have a node association. Addresses that are associated with a node but assigned to a different node are moved back to the associated node if possible.
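To set a policy, run the command shown above with the chosen keyword; for example:

mmces address policy node-affinity   # addresses now stay on their associated nodes where possible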
Force a rebalance
# mmces address move --rebalance
Or
# mmces address move --ces-ip {IP[,IP...]} --ces-node Node
# mmces address list --full-list
cesAddress     cesNode           attributes                                   cesGroup    cesPrefix  preferredNode     unhostableNodes
-------------  ----------------  -------------------------------------------  ----------  ---------  ----------------  ---------------
172.128.1.171  gpfsa01.mydom.lu  object_database_node,object_singleton_node   nfsgroup01  none       gpfsa01.mydom.lu  none
172.128.1.172  gpfsb01.mydom.lu  none                                         nfsgroup01  none       gpfsb01.mydom.lu  none
https://www.ibm.com/docs/en/storage-scale/5.1.8?topic=reference-mmces-command
Here we use only local (user-defined) authentication, so user creation must be done on all cluster nodes. LDAP, AD, and other methods are also supported.
[root@gpfs01 ~]# mmuserauth service list
FILE access not configured
PARAMETERS               VALUES
-------------------------------------------------

OBJECT access not configured
PARAMETERS               VALUES
-------------------------------------------------

[root@gpfs01 ~]# mmuserauth service create --data-access-method file --type userdefined
File authentication configuration completed successfully.
[root@gpfs01 ~]# mmuserauth service list
FILE access configuration : USERDEFINED
PARAMETERS               VALUES
-------------------------------------------------

OBJECT access not configured
PARAMETERS               VALUES
-------------------------------------------------
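If you later switch to Active Directory, the configuration would look roughly like the sketch below; the server, NetBIOS name, bind user, and ID-map role are placeholders, so check the mmuserauth documentation for your release before running it:

mmuserauth service create --data-access-method file --type ad \
    --servers ad01.mydom.lu --netbios-name gpfscluster \
    --user-name administrator --idmap-role master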
[root@gpfs01 ~]# mmnfs export add '/gpfs01/backupdb' -c '10.1.0.0/16(Access_Type=RW,squash=root_squash,protocols=3:4)'
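An existing export can be adjusted without recreating it by using mmnfs export change; a sketch (the client specification and option shown are examples only):

mmnfs export change /gpfs01/backupdb --nfschange "10.1.0.0/16(Access_Type=RO)"   # switch that client range to read-only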
[root@gpfs01 ~]# yum -y install gpfs.gss.pmcollector gpfs.gss.pmsensors gpfs.pm-ganesha
[root@gpfs01 ~]# systemctl enable pmsensors.service
[root@gpfs01 ~]# systemctl start pmsensors.service
[root@gpfs01 ~]# systemctl enable pmcollector.service
[root@gpfs01 ~]# systemctl start pmcollector.service
Now configure the PM_SENSORS for performance monitoring
[root@gpfs01 ~]# mmperfmon config generate --collectors gpfs01-hb,gpfs02-hb
mmperfmon: Node gpfs01-hb is not a perfmon node.
mmperfmon: Node gpfs02-hb is not a perfmon node.
mmperfmon: Propagating the cluster configuration data to all affected nodes.
  This is an asynchronous process.
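The "is not a perfmon node" messages indicate that the nodes do not yet carry the perfmon designation; a sketch of adding it (same node names as above):

mmchnode --perfmon -N gpfs01-hb,gpfs02-hb   # mark the nodes as perfmon nodes so the sensors report to the collectors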
Test it
[root@gpfs01 ~]# /usr/lpp/mmfs/gui/cli/runtask PM_SENSORS --debug
debug: locale=en_US
debug: Running 'mmperfmon config show ' on node localhost
debug: Reading output of 'mmperfmon config show'
debug: Parsed data for 48 sensors
debug: syncDb(): new/changed/unchanged/deleted 0/48/0/0
debug: Running 'mmsysmonc event 'gui' 'gui_refresh_task_successful' ' on node localhost
EFSSG1000I The command completed successfully.
Show the config
[root@gpfs01 ~]# mmperfmon config show
# This file has been generated automatically and SHOULD NOT
# be edited manually. It may be overwritten at any point
# in time.
cephMon = "/opt/IBM/zimon/CephMonProxy"
cephRados = "/opt/IBM/zimon/CephRadosProxy"
colCandidates = "gpfs01-hb", "gpfs02-hb"
colRedundancy = 1
collectors = {
        host = ""
        port = "4739"
}
config = "/opt/IBM/zimon/ZIMonSensors.cfg"
ctdbstat = ""
daemonize = T
hostname = ""
ipfixinterface = "0.0.0.0"
logfile = "/var/log/zimon/ZIMonSensors.log"
loglevel = "info"
mmcmd = "/opt/IBM/zimon/MMCmdProxy"
mmdfcmd = "/opt/IBM/zimon/MMDFProxy"
mmpmon = "/opt/IBM/zimon/MmpmonSockProxy"
piddir = "/var/run"
release = "5.0.1-1"
sensors = {
        name = "CPU"
        period = 1
},
{
        name = "Load"
        period = 1
},
...
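Once the sensors report data, metrics can be queried from the command line with mmperfmon query; a sketch (metric name, bucket size, and bucket count are examples, so verify the option names against your release):

mmperfmon query cpu_user -b 60 -n 10   # last ten one-minute buckets of the cpu_user metric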
[root@gpfs01 ~]# yum -y install postgresql postgresql-libs postgresql-server
[root@gpfs01 ~]# yum -y install gpfs.gui gpfs.java
[root@gpfs01 ~]# systemctl enable gpfsgui
[root@gpfs01 ~]# systemctl start gpfsgui
You are now ready to use the GUI at https://gpfs01/
user: admin / admin001
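If no GUI administrator exists yet, one can be created from the GUI CLI; a sketch (the SecurityAdmin group is the usual administrator role, verify the group name on your release):

/usr/lpp/mmfs/gui/cli/mkuser admin -g SecurityAdmin   # create the initial GUI admin user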
The NFS (Ganesha) exports file is located here:
[root@gpfs01 ~]# cat /var/mmfs/ces/nfs-config/gpfs.ganesha.exports.conf
Show export options
[root@gpfs01 ~]# mmnfs export list -Y
mmcesnfslsexport:nfsexports:HEADER:version:reserved:reserved:Path:Delegations:Clients:
mmcesnfslsexport:nfsexports:0:1:::/gpfs01:NONE:*:
mmcesnfslsexport:nfsexports:0:1:::/gpfs01/backupdb:NONE:10.0.105.0/24:
Remove an NFS export
[root@gpfs01 ~]# mmnfs export remove '/gpfs01'
List NFS config
[root@gpfs01 ~]# mmnfs config list
NFS Ganesha Configuration:
==========================
NFS_PROTOCOLS: 3,4
NFS_PORT: 2049
MNT_PORT: 0
NLM_PORT: 0
RQUOTA_PORT: 0
NB_WORKER: 256
LEASE_LIFETIME: 60
GRACE_PERIOD: 60
DOMAINNAME: VIRTUAL1.COM
DELEGATIONS: Disabled
==========================

STATD Configuration
==========================
STATD_PORT: 0
==========================

CacheInode Configuration
==========================
ENTRIES_HWMARK: 1500000
==========================

Export Defaults
==========================
ACCESS_TYPE: NONE
PROTOCOLS: 3,4
TRANSPORTS: TCP
ANONYMOUS_UID: -2
ANONYMOUS_GID: -2
SECTYPE: SYS
PRIVILEGEDPORT: FALSE
MANAGE_GIDS: FALSE
SQUASH: ROOT_SQUASH
NFS_COMMIT: FALSE
==========================

Log Configuration
==========================
LOG_LEVEL: EVENT
==========================

Idmapd Configuration
==========================
LOCAL-REALMS: localdomain
DOMAIN: localdomain
==========================
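Individual settings can be changed with mmnfs config change; a sketch (the attribute and value are examples only, and some changes cause the NFS service to restart):

mmnfs config change "NB_WORKER=512"   # example: raise the Ganesha worker thread count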
SMB config:
[root@gpfs01 ~]# mmsmb config list
SMB option                           value
add share command                    /usr/lpp/mmfs/bin/mmcesmmccrexport
aio read size                        1
aio write size                       1
aio_pthread:aio open                 yes
auth methods                         guest sam winbind
change notify                        yes
change share command                 /usr/lpp/mmfs/bin/mmcesmmcchexport
client NTLMv2 auth                   yes
ctdb locktime warn threshold         5000
ctdb:smbxsrv_open_global.tdb         false
debug hires timestamp                yes
delete share command                 /usr/lpp/mmfs/bin/mmcesmmcdelexport
dfree cache time                     100
disable netbios                      yes
disable spoolss                      yes
dmapi support                        no
durable handles                      no
ea support                           yes
fileid:algorithm                     fsname
fileid:fstype allow                  gpfs
force unknown acl user               yes
fruit:metadata                       stream
fruit:nfs_aces                       no
fruit:veto_appledouble               no
gencache:stabilize_count             10000
gpfs:dfreequota                      yes
gpfs:hsm                             yes
gpfs:leases                          yes
gpfs:merge_writeappend               no
gpfs:prealloc                        yes
gpfs:sharemodes                      yes
gpfs:winattr                         yes
groupdb:backend                      tdb
host msdfs                           yes
idmap config * : backend             autorid
idmap config * : range               10000000-299999999
idmap config * : rangesize           1000000
idmap config * : read only           no
idmap:cache                          no
include system krb5 conf             no
kernel oplocks                       no
large readwrite                      yes
level2 oplocks                       yes
log level                            1
log writeable files on exit          yes
logging                              syslog@0 file
mangled names                        illegal
map archive                          yes
map hidden                           yes
map readonly                         yes
map system                           yes
max log size                         100000
max open files                       20000
nfs4:acedup                          merge
nfs4:chown                           yes
nfs4:mode                            simple
notify:inotify                       yes
passdb backend                       tdbsam
password server                      *
posix locking                        no
preferred master                     no
printcap cache time                  0
read only                            no
readdir_attr:aapl_max_access         false
security                             user
server max protocol                  SMB3_02
server min protocol                  SMB2_02
server string                        IBM NAS
shadow:fixinodes                     yes
shadow:snapdir                       .snapshots
shadow:snapdirseverywhere            yes
shadow:sort                          desc
smbd exit on ip drop                 yes
smbd profiling level                 on
smbd:async search ask sharemode      yes
smbd:backgroundqueue                 False
socket options                       TCP_NODELAY SO_KEEPALIVE TCP_KEEPCNT=4 TCP_KEEPIDLE=240 TCP_KEEPINTVL=15
store dos attributes                 yes
strict allocate                      yes
strict locking                       auto
syncops:onmeta                       no
tdbsam:map builtin                   no
time_audit:timeout                   5000
unix extensions                      no
use sendfile                         no
vfs objects                          shadow_copy2 syncops gpfs fileid time_audit
wide links                           no
winbind max clients                  10000
winbind max domain connections       5
winbind:online check timeout         30
SMB export list:
[root@gpfs01 ~]# mmsmb export list
export  path           browseable  guest ok  smb encrypt
samba   /gpfs01/samba  yes         no        auto
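The export shown above would have been created along these lines; a sketch (export name and path as in the listing, options left at their defaults):

mmsmb export add samba /gpfs01/samba   # create the SMB export; pass --option "key=value" pairs to override defaults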
Create a local user on all GPFS nodes and the corresponding SMB user
[root@prscale-a-01 ces]# /usr/lpp/mmfs/bin/smbpasswd -a gpfsveeam01
New SMB password:
Retype new SMB password:
Added user gpfsveeam01.
[root@prscale-b-01 ~]# groupadd -g 10000001 gpfsveeam01
[root@prscale-b-01 ~]# useradd -c "user connect veeam" -M -u 10000001 -g 10000001 -s /sbin/nologin gpfsveeam01
You can then assign this UID/GID to the shared SMB folder.
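A minimal sketch of assigning ownership to the export path (the path /gpfs01/samba is taken from the export listing above; in practice you may prefer to manage access through ACLs instead):

chown gpfsveeam01:gpfsveeam01 /gpfs01/samba   # give the SMB user ownership of the shared folder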
[root@prscale-a-01 log]# ctdb --debug=3 -v status
Number of nodes:2
pnn:0 10.255.7.11      OK
pnn:1 10.255.7.10      OK (THIS NODE)
Generation:1948871161
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:NORMAL (0)
Recovery master:0
[root@prscale-a-01 log]# ctdb --debug=3 -v getdbmap
Number of databases:16
dbid:0x3ef19640 name:passdb.tdb path:/var/lib/ctdb/persistent/passdb.tdb.1 PERSISTENT
dbid:0x2ca251cf name:account_policy.tdb path:/var/lib/ctdb/persistent/account_policy.tdb.1 PERSISTENT
dbid:0xa1413774 name:group_mapping.tdb path:/var/lib/ctdb/persistent/group_mapping.tdb.1 PERSISTENT
dbid:0xc3078fba name:share_info.tdb path:/var/lib/ctdb/persistent/share_info.tdb.1 PERSISTENT
dbid:0x06916e77 name:leases.tdb path:/var/lib/ctdb/volatile/leases.tdb.1
dbid:0x83b22c33 name:share_entries.tdb path:/var/lib/ctdb/volatile/share_entries.tdb.1
dbid:0x7a19d84d name:locking.tdb path:/var/lib/ctdb/volatile/locking.tdb.1
dbid:0x4e66c2b2 name:brlock.tdb path:/var/lib/ctdb/volatile/brlock.tdb.1
dbid:0x68c12c2c name:smbXsrv_tcon_global.tdb path:/var/lib/ctdb/volatile/smbXsrv_tcon_global.tdb.1
dbid:0x6b06a26d name:smbXsrv_session_global.tdb path:/var/lib/ctdb/volatile/smbXsrv_session_global.tdb.1
dbid:0x477d2e20 name:smbXsrv_client_global.tdb path:/var/lib/ctdb/volatile/smbXsrv_client_global.tdb.1
dbid:0x521b7544 name:smbXsrv_version_global.tdb path:/var/lib/ctdb/volatile/smbXsrv_version_global.tdb.1
dbid:0x7132c184 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.1 PERSISTENT
dbid:0x4d2a432b name:g_lock.tdb path:/var/lib/ctdb/volatile/g_lock.tdb.1
dbid:0x6cf2837d name:registry.tdb path:/var/lib/ctdb/persistent/registry.tdb.1 PERSISTENT
dbid:0x6645c6c4 name:ctdb.tdb path:/var/lib/ctdb/persistent/ctdb.tdb.1 PERSISTENT
In case of failure, check the node IP addresses used by CTDB:
[root@prscale-b-01 ~]# ctdb --debug=3 -v status
connect() failed, errno=111
Failed to connect to CTDB daemon (/var/run/ctdb/ctdbd.socket)
Failed to detect PNN of the current node.
Is this node part of CTDB cluster?
[root@prscale-b-01 ~]# cat "/usr/lpp/mmfs/lib/ctdb/nodes"
10.10.10.11
10.10.10.12
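If CTDB is down on a node, restarting the SMB service on that node usually brings ctdbd back with it; a sketch (node name as an example):

mmces service stop SMB -N prscale-b-01
mmces service start SMB -N prscale-b-01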
[root@prscale-a-01 ~]# gpfs.snap --protocol authentication
gpfs.snap: started at Wed Dec  1 15:18:06 CET 2021.
Gathering common data...
Gathering Linux specific data...
Gathering extended network data...
Gathering local callhome data...
Gathering local perfmon data...
Gathering local msgqueue data...
Gathering local auth data...
Gathering local sysmon data...
Gathering local cnss data...
Gathering local gui data...
Gathering trace reports and internal dumps...
Gathering Transparent Cloud Tiering data at level BASIC...
The Transparent Cloud Tiering snap data collection completed for node prscale-a-01
Gathering QoS data at level FULL...
gpfs.snap: No QoS configuration was found for this cluster.
gpfs.snap: QoS configuration collection complete.
Gathering cluster wide gui data...
Gathering cluster wide sysmon data...
Gathering cluster wide cnss data...
Gathering cluster wide callhome data...
Gathering cluster wide perfmon data...
Gathering cluster wide msgqueue data...
gpfs.snap: Spawning remote gpfs.snap calls. Master is prscale-a-01.
...
gpfs.snap completed at Wed Dec  1 15:21:33 CET 2021
###############################################################################
Send file /tmp/gpfs.snapOut/3243658/all.20211201151806.3243658.tar to IBM Service
Examine previous messages to determine additional required data.
###############################################################################