Running CESM in parallel

TCNasa

Tom Caldwell
Member
Any time I try to specify NTHRDS greater than 1, my CESM code crashes with errors like this.
Any suggestions?

--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
orterun noticed that process rank 2 with PID 0 on node bn12 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
[bn12:10058] PMIX ERROR: NO-PERMISSIONS in file dstore_base.c at line 237
[bn12:10058] PMIX ERROR: NO-PERMISSIONS in file dstore_base.c at line 246
 

jedwards

CSEG and Liaisons
Staff member
A couple of answers:
1. CESM does not perform as well with OpenMP threading as with MPI. Whenever possible, you should use MPI for parallelism instead of OpenMP.
2. How threading is set up in the config_batch.xml file is very machine dependent. If possible, seek help from a local system administrator.
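
For the first point, here is a minimal sketch of what moving a threaded layout to an MPI-only one could look like, for example going from 6 tasks with 2 threads each to 12 tasks with 1 thread. The helper name mpi_only_layout is hypothetical and not part of CIME; it only prints the commands so they can be reviewed before being run from the case directory. xmlchange, case.setup --reset and case.build are the standard case scripts, but keeping the total core count unchanged and doing a clean rebuild are assumptions that may not suit every machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of CIME): print the case commands for an
# MPI-only layout that uses the same total number of cores as a threaded one.
# Usage: mpi_only_layout <NTASKS> <NTHRDS>
mpi_only_layout() {
    ntasks=$1
    nthrds=$2
    # Assumption: every thread of the old layout becomes one MPI task.
    total=$((ntasks * nthrds))
    echo "./xmlchange NTHRDS=1,NTASKS=${total}"
    # Regenerate the batch and run scripts for the new layout.
    echo "./case.setup --reset"
    # Assumption: rebuild from scratch so no threaded objects are reused.
    echo "./case.build --clean-all"
    echo "./case.build"
}

# Example: 6 tasks x 2 threads -> 12 MPI tasks, 1 thread each.
mpi_only_layout 6 2
```

The first line printed for this example is ./xmlchange NTHRDS=1,NTASKS=12. Running ./xmlquery NTASKS,NTHRDS afterwards is a quick way to confirm what the case actually holds.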

Is this happening before any cesm.log file is written? Can you run a simple test with threading, for example a threaded hello world?
 

TCNasa

Tom Caldwell
Member
Is there a setting to switch from OpenMP to MPI?
I don't know precisely when it happens. The error messages are from the cesm log file.
 

TCNasa

Tom Caldwell
Member
bn12:/CERES/sarb/caldwell/CESM2.2 {187} ./describe_version
------------------------------------------------------------------------
git describe:
cesm2.2.0-0-g332937b
------------------------------------------------------------------------

These commands are used to set the task and thread values:
./xmlchange --id MAX_TASKS_PER_NODE --val 2
./xmlchange --id MAX_MPITASKS_PER_NODE --val 2
./xmlchange NTHRDS=2,NTASKS=6
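
As a rough check on what these values request, the arithmetic can be sketched as below. This assumes the node count is derived the way I understand CIME to do it: MPI tasks per node is the smaller of MAX_MPITASKS_PER_NODE and MAX_TASKS_PER_NODE divided by NTHRDS, and the node count is NTASKS divided by that, rounded up. The function name node_footprint is hypothetical, and the sketch assumes all components share the same NTASKS and NTHRDS.

```shell
#!/bin/sh
# Hypothetical helper (not part of CIME): estimate the footprint of a layout.
# Usage: node_footprint <NTASKS> <NTHRDS> <MAX_TASKS_PER_NODE> <MAX_MPITASKS_PER_NODE>
node_footprint() {
    ntasks=$1
    nthrds=$2
    max_tasks=$3
    max_mpitasks=$4
    # Threads share the per-node budget, so fewer MPI tasks fit on each node.
    per_node=$((max_tasks / nthrds))
    if [ "$per_node" -gt "$max_mpitasks" ]; then per_node=$max_mpitasks; fi
    if [ "$per_node" -gt "$ntasks" ]; then per_node=$ntasks; fi
    if [ "$per_node" -lt 1 ]; then per_node=1; fi
    # Integer ceiling of ntasks / per_node.
    nodes=$(( (ntasks + per_node - 1) / per_node ))
    echo "mpi_tasks_per_node=$per_node nodes=$nodes cores=$((ntasks * nthrds))"
}

# The settings above: NTASKS=6, NTHRDS=2, both per-node limits set to 2.
node_footprint 6 2 2 2
# prints: mpi_tasks_per_node=1 nodes=6 cores=12
```

Under these assumptions the layout asks for 12 cores spread over 6 nodes with one MPI task per node, which is worth comparing against what the machine and its batch settings actually provide.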

The browser won't let me choose the log files for upload.
 

TCNasa

Tom Caldwell
Member
Yes, a run with NTHRDS=1 works, but I also can't increase NTASKS above 12 without a failure.
 