SCIAMA
High Performance Compute Cluster
SLURM MPI Batch Script Example
Since the OS and SLURM update, MPI jobs have not been working on sciama2.q due to changes in the way SLURM handles MPI requests. Below is an example of a SLURM batch script to be used for MPI jobs on SCIAMA.
#!/bin/bash
#SBATCH --job-name=mpi_job
#SBATCH --ntasks=128
#SBATCH --time=100:00:00
#SBATCH --output=log.%j
#SBATCH --partition=sciama2.q
module purge
module load system/intel64
module load intel_comp/2019.2
module load openmpi/4.1.4
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export OMPI_MCA_mtl=^psm
srun --mpi=pmi2 mpi.exe
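The script is submitted with sbatch in the usual way; the filename mpi_job.sh below is only illustrative, so use whatever name the script was saved under.
sbatch mpi_job.sh
squeue -u $USER    # check the job's status in the queue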
Note that the number of nodes is not set; allow Slurm to manage the number of nodes needed, since you may need more nodes for memory than for cores.
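If memory rather than core count is the limiting factor, one way to let Slurm spread the tasks over enough nodes is to add a per-core memory request; a minimal sketch, where the 4G figure is purely illustrative and not a SCIAMA recommendation:
#SBATCH --mem-per-cpu=4G    # illustrative value; adjust to the job's actual per-task memory need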
We now need to tell Open MPI to use the InfiniBand interconnect: setting OMPI_MCA_mtl=^psm excludes the PSM transport (the leading ^ means "everything except"), so Open MPI selects the InfiniBand path instead.
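For reference, the same MCA setting can also be passed on the command line if a job is launched with mpirun rather than srun; a sketch only, with mpi.exe standing in for the actual executable:
mpirun --mca mtl ^psm -np $SLURM_NTASKS mpi.exe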
Do not use "srun -N or -n" unless specifically required, you have already told it to use 128 cores.