Sizing Summary for a Small Cluster
A common starting point for a small, moderately performing virtual Storage Scale cluster might be:
- vCPU: 8; vRAM: 16 GB; Network: 2 x 25 Gbps vNICs (bonded for I/O)
- vCPU: 1; vRAM: 64 GB, or 128 GB if NFS and SMB are used; Network: 10 Gbps
| Feature | Data Access | Data Management | Erasure Code Edition |
|---|---|---|---|
| Multi-protocol scalable file service with simultaneous access to a common set of data | ✓ | ✓ | ✓ |
| Facilitate data access with a global namespace, massively scalable file system, quotas and snapshots, data integrity and availability, and filesets | ✓ | ✓ | ✓ |
| Simplify management with GUI | ✓ | ✓ | ✓ |
| Improved efficiency with QoS and compression | ✓ | ✓ | ✓ |
| Create optimized tiered storage pools based on performance, locality, or cost | ✓ | ✓ | ✓ |
| Simplify data management with Information Lifecycle Management (ILM) tools that include policy based data placement and migration | ✓ | ✓ | ✓ |
| Enable worldwide data access using AFM asynchronous replication | ✓ | ✓ | ✓ |
| Asynchronous multi-site Disaster Recovery | ✓ | ✓ | |
| Multi-site replication with AFM to cloud object storage | ✓ | ✓ | |
| Protect data with native software encryption and secure erase, NIST compliant and FIPS certified | ✓ | ✓ | |
| File audit logging | ✓ | ✓ | |
| Watch folder | ✓ | ✓ | |
| Erasure coding | ESS only | ESS only | ✓ |
Typically, metadata is between 1 and 5% of the filesystem space, but this can vary.
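To check how much space metadata actually consumes in an existing file system, the mmdf command can be used; the file system name fs1 below is just a placeholder:

# mmdf fs1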
Depending on the workload, different file system block sizes are recommended (an mmcrfs sketch follows the tables below):
| IO Type | Application Examples | Block size |
|---|---|---|
| Large Sequential I/O | Scientific computing, digital media, file-based analytics | 1MiB to 16MiB |
| Relational Database | DB2, Oracle, small files on ESS | 512KiB |
| Small Sequential I/O | General file service, email, web applications | 256KiB |
| Special* | Special | 16KiB-64KiB |

*Since GPFS 3.3 there are very few workloads that benefit from a file system block size of 16KiB or 64KiB.
| Workload | Configuration Type | Block size |
|---|---|---|
| SAP HANA | ESS GL | 16MiB for data |
| SAP HANA | FPO | 1MiB for a single pool, or 256KiB for metadataOnly and 2MiB for dataOnly |
| Hadoop | ESS GL | 1MiB for metadataOnly, 8MiB for dataOnly |
| Hadoop | FPO | 256KiB for metadataOnly, 2MiB for dataOnly |
| Spark | FPO | 256KiB for metadataOnly, 2MiB for dataOnly |
| SAP Sybase IQ | ESS GL | 256KiB-1MiB for metadataOnly, 16MiB for dataOnly |
| Healthcare (Medical Imaging) | ESS | 256KiB for metadataOnly, 1MiB for dataOnly |
| Healthcare (Medical Imaging) | Other Storage | 256KiB for metadata and data |
| Archive | Other Storage | Depends on storage and performance requirements |
| ECM | Other Storage | 256KiB, unless the content is very large files (videos, for example) |
| Oracle | Other Storage | 256KiB |
| Technical Computing | ESS GL | 1MiB for metadata, 4MiB-16MiB for data depending on the importance of peak sequential performance |
| SAS | ESS | 1MiB for metadataOnly, 8MiB or 16MiB depending on the SASBUF size (128KiB or 256KiB) |
| Enterprise File (misc projects, data sharing) | Other Storage | 256KiB for metadata and data |
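These block sizes are set when the file system is created. A minimal mmcrfs sketch, assuming separate metadataOnly and dataOnly NSDs and using placeholder names (device fs1, stanza file nsd.stanza, mount point /gpfs/fs1); -B sets the data block size, while --metadata-block-size sets the block size of the metadata-only system pool:

# mmcrfs fs1 -F nsd.stanza -B 8M --metadata-block-size 1M -T /gpfs/fs1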
When creating NSDs, alternate the node position in the server list of each NSD. If the same node is always listed first in the NSD definitions, it alone will serve the NSDs and you will run into performance problems.
If only one node is listed first in every NSD definition, every NSD operation (read, write, ...) has to be handled by that NSD server (e.g. 'gpfs1') as long as it is reachable. Such a configuration can overload the affected server.
The NSD server sequence can be adjusted online with the mmchnsd command (see below): https://www.ibm.com/docs/en/spectrum-scale/5.0.5?topic=disks-changing-your-nsd-configuration
Example:
# mmchnsd "data_nsd043:gpfs03.gpfsint.labo,gpfs04.gpfsint.labo,gpfs01.gpfsint.labo,gpfs02.gpfsint.labo"
This may be easier with a description (stanza) file; a sketch follows below.
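A minimal sketch of such a stanza file, assuming the usual %nsd stanza syntax; the file name nsd_change.stanza is arbitrary, and only the NSDs whose server order should change need to be listed:

%nsd: nsd=data_nsd043 servers=gpfs03.gpfsint.labo,gpfs04.gpfsint.labo,gpfs01.gpfsint.labo,gpfs02.gpfsint.labo
%nsd: nsd=data_nsd044 servers=gpfs04.gpfsint.labo,gpfs01.gpfsint.labo,gpfs02.gpfsint.labo,gpfs03.gpfsint.labo

# mmchnsd -F nsd_change.stanza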
# mmlsnsd -X
File system   Disk name     NSD volume ID      NSD servers
------------------------------------------------------------------------------------------------
cases         data_nsd043   C0A80017543D01BC   gpfs03.gpfsint.labo,gpfs04.gpfsint.labo,gpfs01.gpfsint.labo,gpfs02.gpfsint.labo
cases         data_nsd044   C0A80018543CE5A2   gpfs04.gpfsint.labo,gpfs01.gpfsint.labo,gpfs02.gpfsint.labo,gpfs03.gpfsint.labo
cases         data_nsd045   C0A80017543D01C3   gpfs01.gpfsint.labo,gpfs02.gpfsint.labo,gpfs03.gpfsint.labo,gpfs04.gpfsint.labo
cases         data_nsd046   C0A80018543CE5A8   gpfs02.gpfsint.labo,gpfs03.gpfsint.labo,gpfs04.gpfsint.labo,gpfs01.gpfsint.labo
cases         data_nsd047   C0A80017543D01C9   gpfs03.gpfsint.labo,gpfs04.gpfsint.labo,gpfs01.gpfsint.labo,gpfs02.gpfsint.labo