...
https://docs.aws.amazon.com/parallelcluster/latest/ug/intelmpi.html
Go to the Cluster_Distribution folder.
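If the distribution was staged on the shared FSx file system, as the LD_LIBRARY_PATH step later in this guide assumes, that would be:
```
cd /fsx/Cluster_Distribution
```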
Extract the installer and change into the extracted directory:
```
tar -xvf l_mpi_2019.6.166.tgz
cd l_mpi_2019.6.166
```
Run the install script:
```
./install.sh
```
Press Enter, read the license agreement, and type 'accept'.
Install with the default scope (single node).
The default install location will be /home/ubuntu/intel. Let's change that to /fsx/intel.
At the installation options prompt, enter the following to customize the installation:
```
2
```
Then, under "Customize Installation", enter 2 again to change the install directory:
```
2
```
When prompted for the new directory, type:
```
/fsx/intel
```
You should see the install location updated in the prompt above. Now press Enter to continue.
When the installation is complete, you will see the install-related files under /fsx/intel.
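You can sanity-check the install location with a quick listing (the exact directory names may vary slightly between releases):
```
ls /fsx/intel
```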
Now, to ensure the MPI-related executables are available in your path, you need to source the relevant Intel setup scripts. An example of this is given in the sample_bashrc file under the Cluster_Distribution folder:
```
source /fsx/intel/bin/compilervars.sh -arch intel64
source /fsx/intel/impi/2019.6.166/intel64/bin/mpivars.sh
```
To ensure these commands are executed every time you log into the cluster, add the two lines to your .bashrc file.
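One way to append them is shown below; this is just a sketch, and the paths assume the /fsx/intel install location chosen above:
```
echo 'source /fsx/intel/bin/compilervars.sh -arch intel64' >> ~/.bashrc
echo 'source /fsx/intel/impi/2019.6.166/intel64/bin/mpivars.sh' >> ~/.bashrc
```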
Once you have made the changes to your .bashrc file, go ahead and source it:
```
source ~/.bashrc
```
Now, let's see if that put the new executables in your path. Enter:
```
which mpiexec
```
You should see:
```
/fsx/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/bin/mpiexec
```
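As an extra sanity check, you can also ask mpiexec for its version (Intel MPI's mpiexec accepts -V; the output wording varies by release but should mention the 2019 Update 6 library):
```
mpiexec -V
```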
Setting the Path for the OpenMP Runtime Libraries
OpenMP requires shared runtime libraries that are accessed by each thread. These .so libraries are not shipped with the MPI distribution; they are available in the Cluster_Distribution folder under intel64_lin.
To make these .so libraries available at run time, you need to modify the LD_LIBRARY_PATH. This is best accomplished by appending to the .bashrc file. Simply add:
```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/fsx/Cluster_Distribution/intel64_lin
```
Remember to source the .bashrc file again.
Verify that LD_LIBRARY_PATH was modified:
```
echo $LD_LIBRARY_PATH
```
You should see /fsx/Cluster_Distribution/intel64_lin at the end of your path.
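You can also confirm the shared objects are actually present. The Intel OpenMP runtime is typically named libiomp5.so, though the exact set of files shipped in intel64_lin may vary:
```
ls /fsx/Cluster_Distribution/intel64_lin/*.so*
```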
Running EFDC+ from the Command Line on Cluster Systems
Some notes on executing a run from the command line. Generally, the command takes the form:
```
mpiexec -n (# processes) -ppn (# processes per node) -hosts host1,host2 -genv I_MPI_DEBUG=5 -genv I_MPI_PIN_DOMAIN=omp -genv OMP_NUM_THREADS=(# threads) efdc+.exe -NT(# threads)
```
For example, to run with a domain decomposed into 32 subdomains across 2 nodes with 2 threads per process, you could execute the following command:
```
mpiexec -n 32 -ppn 16 -hosts node1,node2 -genv I_MPI_DEBUG=5 -genv I_MPI_PIN_DOMAIN=omp -genv OMP_NUM_THREADS=2 efdc+.exe -NT2
```
This would run 16 processes on node1 and 16 processes on node2 with 2 threads per process. So each node would be utilizing 32 cores during a run.
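If you vary the run size often, a small wrapper script can derive -ppn for you. Here is a minimal sketch, assuming the node names node1 and node2 and the efdc+.exe binary from the example above:
```
#!/bin/bash
# Derive processes-per-node from the run totals (hypothetical wrapper)
NPROCS=32                  # total MPI processes (number of subdomains)
NNODES=2                   # number of compute nodes
NTHREADS=2                 # OpenMP threads per MPI process
PPN=$((NPROCS / NNODES))   # 32 / 2 = 16 processes per node

mpiexec -n $NPROCS -ppn $PPN -hosts node1,node2 \
  -genv I_MPI_DEBUG=5 -genv I_MPI_PIN_DOMAIN=omp \
  -genv OMP_NUM_THREADS=$NTHREADS efdc+.exe -NT$NTHREADS
```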
I think it's easiest to think of a process as the "master" thread; any additional threads work with that master thread to provide additional calculation capability.
NOTE - To get the node names on AWS, enter the command:
```
qnodes
```
It should list the node names along with other information about each node.
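If you just want the names themselves, something like the following should work, assuming qnodes uses the usual Torque layout of an unindented node name followed by indented attribute lines:
```
qnodes | grep -v '^ ' | grep -v '^$'
```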