SCIAMA
High Performance Compute Cluster
Exercise: Compile an MPI executable and submit an MPI job
Introduction
In this exercise we will compile an MPI executable in an interactive shell and then submit it as a job to the batch queue.
Exercise
Create an interactive shell with 6 cores allocated by executing the following command:
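As a rough sketch only, assuming a generic Slurm setup: an allocation of 6 tasks with a shell attached can usually be requested with salloc. The cluster may provide its own wrapper command for interactive sessions, in which case that should be preferred. The variable name NTASKS is just for illustration.

```shell
# Assumption: generic Slurm; a site-specific wrapper may exist and should be preferred.
NTASKS=6   # number of cores (tasks) requested in this exercise

if command -v salloc >/dev/null 2>&1; then
    # salloc waits for the allocation and then starts a shell inside it.
    # Depending on the site configuration, that shell runs either on the
    # login node or on the first allocated compute node.
    salloc --ntasks="$NTASKS"
else
    echo "salloc not found: run this on a cluster login node" >&2
fi
```

Leaving the shell with "exit" releases the allocation again.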
You should see something similar to:
Waiting for JOBID 1691162 to start
......
[train1@node247(sciama) ~]$
The allocated cores are not necessarily all on the same node as the shell you have just been given. Try to find out whether other nodes are used as well, and if so which ones (hint: squeue).
Now select the modules you will need for this exercise:
From the command line confirm that all the requested modules have been successfully loaded (hint: module list).
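A small sketch of such a check, assuming the usual Environment Modules or Lmod setup in which "module" is a shell function. The extra test for mpicc assumes that the MPI module provides a compiler wrapper of that name, which is common but not guaranteed. MODULES_OK is an illustrative variable.

```shell
# "module" is normally a shell function, so test for it with "type"
if type module >/dev/null 2>&1; then
    module list            # modules currently loaded in this shell
    MODULES_OK=yes
else
    echo "module command not available in this shell" >&2
    MODULES_OK=no
fi

# Assumption: a loaded MPI module puts the mpicc wrapper on the PATH
command -v mpicc || echo "mpicc not on PATH yet"
```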
Now change into the "$HOME/training/src" directory and look at the file "mpi.c". Try to understand the structure of the code.
We will now compile this file and create an executable to be stored in the "bin" directory:
Ignore any warnings. Check that you now have an mpi.exe in the bin directory.
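To illustrate the compile step without touching the course files, the sketch below writes a hypothetical stand-in program into a scratch directory and compiles it. The file hello_mpi.c is not the course's mpi.c; it is a minimal program of the same general shape (initialise, query rank and size, print, finalise). It also assumes that the compiler wrapper is called mpicc.

```shell
# Work in a scratch directory so the course files are left untouched
WORKDIR="$(mktemp -d)"
mkdir -p "$WORKDIR/src" "$WORKDIR/bin"

# Hypothetical stand-in for mpi.c: each process reports its rank and the total
cat > "$WORKDIR/src/hello_mpi.c" <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* index of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    printf("Hello world from process %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut the runtime down */
    return 0;
}
EOF

# Assumption: mpicc is the MPI compiler wrapper provided by the loaded module
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o "$WORKDIR/bin/hello_mpi.exe" "$WORKDIR/src/hello_mpi.c"
    ls -l "$WORKDIR/bin"
else
    echo "mpicc not found: load the MPI module first" >&2
fi
```

The same pattern, "mpicc -o <output> <source>", applies to the real source file and the "bin" directory used in this exercise.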
We will now run the program using all 6 cores we have allocated:
Under Slurm, srun should be used in place of mpirun, the command that usually bootstraps the processes across the allocated nodes. Also notice that we have not explicitly told srun to use our 6 cores: if no number of tasks is specified, it uses all allocated cores automatically.
The output should be similar to:
Hello world from process 0 of 6
Hello world from process 1 of 6
Hello world from process 2 of 6
Hello world from process 3 of 6
Hello world from process 4 of 6
Hello world from process 5 of 6
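The difference between the default and an explicit task count can be sketched as follows. The path in EXE is an assumption (the "bin" directory is taken to be "$HOME/training/bin"); adjust it if your executable lives elsewhere.

```shell
# Assumed location of the executable built earlier in this exercise
EXE="$HOME/training/bin/mpi.exe"

if command -v srun >/dev/null 2>&1 && [ -x "$EXE" ]; then
    srun "$EXE"               # no task count given: uses all allocated cores
    srun --ntasks=2 "$EXE"    # explicit task count: starts only 2 processes
else
    echo "srun or $EXE not available here" >&2
fi
```

The processes print independently of each other, so the lines may appear in a different order from run to run.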
Now exit from the interactive shell back onto the login node. Make sure that you are in your home directory (hint: cd).
We will now submit the same program as a batch job. Again, using the knowledge from the previous exercises, try to write your own batch script to do so and store it in "training/scripts/" (hint: if you get stuck, you can find a commented solution in "training/scripts/mpi-job.sh.solution"). Then submit the job:
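For orientation only, a batch script for this kind of job might look like the sketch below. It is not the provided solution file: the file name, job name, time limit and executable path are all assumptions, and the module names are deliberately left out because they are site-specific. The sketch writes the script to a scratch location, checks its syntax and submits it only if sbatch is available.

```shell
# Hypothetical file name in a scratch directory (not the course solution)
SCRIPT="$(mktemp -d)/mpi-job-sketch.sh"

cat > "$SCRIPT" <<'EOF'
#!/bin/bash
# Name shown by squeue (assumed)
#SBATCH --job-name=mpi-hello
# Same number of MPI processes as in the interactive run
#SBATCH --ntasks=6
# Wall-time limit (assumed value)
#SBATCH --time=00:05:00
# No --output option is set, so Slurm writes slurm-<jobid>.out
# in the directory the job was submitted from.

# Load the same modules as in the interactive session here
# (module names are site-specific and therefore omitted).

# Assumed path to the executable built earlier
srun "$HOME/training/bin/mpi.exe"
EOF

# Check the script for shell syntax errors before submitting
bash -n "$SCRIPT" && echo "syntax OK: $SCRIPT"

if command -v sbatch >/dev/null 2>&1; then
    sbatch "$SCRIPT"
else
    echo "sbatch not found: submit from a cluster login node" >&2
fi
```

On success, sbatch prints the ID of the submitted job, which is useful for finding the output file later.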
Use squeue to check the status of the job. When the job has completed, locate and examine the output file.
The contents of the output file should be similar to the interactive output.
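A sketch of how the status check and the inspection of the output could be done from the directory the job was submitted from. It assumes that the batch script does not set its own output file name, in which case sbatch writes to slurm-<jobid>.out. PATTERN and LATEST are illustrative names.

```shell
# Assumption: sbatch's default output name, slurm-<jobid>.out
PATTERN="slurm-*.out"

if command -v squeue >/dev/null 2>&1; then
    # A finished job no longer appears in this list
    squeue -u "${USER:-$(id -un)}"
fi

# Show the most recently modified matching output file, if there is one
LATEST="$(ls -t $PATTERN 2>/dev/null | head -n 1)"
if [ -n "$LATEST" ]; then
    cat "$LATEST"
else
    echo "no file matching $PATTERN in $(pwd)"
fi
```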