
How to use mpirun to run CLM in parallel?

jack
Member
Hi,
I'm using a supercomputer to run clm5.0 and have set ./xmlchange NTASKS=64 and
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=16 (max per node)

in a Slurm sbatch file.
However, I found that the run speed with 16 tasks (i.e., nodes=1) is the same as, or even faster than, with 64 tasks (i.e., nodes=4). The mpirun I'm using is from mpich-3.3.1, which was installed locally rather than loaded through the supercomputer's module system.
Does anybody know how to use more cores to run CLM5.0? Thanks
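(For reference, a minimal way to double-check what the case itself is requesting, assuming a standard CIME case directory like the one used below, is to query the layout and preview the launch command before submitting:)

cd /home/jack/dat01/clm5.0/cime/scripts/mpi_test
# show the number of MPI tasks the case will ask for
./xmlquery NTASKS
# show the task/thread layout per component
./pelayout
# show the exact mpirun command case.submit would use, without submitting
./preview_run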
 

erik

Erik Kluzek
CSEG and Liaisons
Staff member
Hmmm. That is how you set the number of tasks in the system. It doesn't make sense to me that you would see a lower number of tasks run faster at such a low processor count. Because of communication costs we do expect that eventually, but I'd only expect it at many thousands of processors.

There is always some variability between runs, though, so you might do several simulations at the same number of tasks to make sure you have a representative test.

Since this is a question about the overall system of running CESM (for the specific case of I compsets), I'm moving this to the general forum.
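For example, one quick way to compare repeated runs (a sketch, assuming the standard timing summaries CESM writes under $CASEROOT/timing) is to pull the throughput line out of each summary file:

cd $CASEROOT
# each completed run leaves a cesm_timing.<case>.<jobid> summary here
grep "Model Throughput" timing/cesm_timing.*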
 

jack
Member
Thanks, Erik. The maximum number of processors I can use is 128 (I paid for it), but it seems like only 16 processors play a role. I suspect that my CESM (or mpirun) configuration is wrong; do you have any suggestions?
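(One sanity check, assuming an interactive allocation of the same four nodes, is whether the locally installed mpirun actually places ranks on all of them:)

# ask Slurm for the same resources interactively
salloc --partition=hpib --nodes=4 --ntasks-per-node=16
# then launch a trivial program and count ranks per node
mpirun -np 64 hostname | sort | uniq -c
# expected: 16 ranks on each of 4 node names; if all 64 lines show one node,
# this mpirun is not picking up the Slurm allocation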
 

jack
Member
I have set
<MAX_TASKS_PER_NODE>64</MAX_TASKS_PER_NODE>
<MAX_MPITASKS_PER_NODE>64</MAX_MPITASKS_PER_NODE>
in config_machines.xml
and

<batch_system type="slurm" MACH="jack">
<batch_submit>sbatch</batch_submit>
<submit_args>
<arg flag="--time" name="$JOB_WALLCLOCK_TIME"/>
<arg flag="-p" name="$JOB_QUEUE"/>
<arg flag="--account" name="$PROJECT"/>
</submit_args>
</batch_system>
in config_batch.xml
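(To confirm those machine settings actually made it into the case, one can query them from the case directory, assuming the case was created after the edits:)

./xmlquery MAX_TASKS_PER_NODE,MAX_MPITASKS_PER_NODE
./xmlquery NTASKS_LND,NTASKS_ATM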

The final sbatch file is as follows:
#!/bin/bash

#SBATCH --partition=hpib
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --account=jack

module load scl/gcc8.3
module load mpich/3.4.1
echo myjob.sbatch start on $(date)
cd /home/jack/dat01/clm5.0/cime/scripts/mpi_test
./case.submit
echo myjob.sbatch end on $(date)
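(It may also help to record, inside the job, which MPI launcher is actually on PATH and what Slurm granted, since the locally installed mpich-3.3.1 and the mpich/3.4.1 module could shadow each other; these lines would go just before ./case.submit:)

which mpirun
mpichversion            # reports the MPICH build actually on PATH
echo "nodes: $SLURM_JOB_NODELIST  tasks: $SLURM_NTASKS"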
 