
Trying to run F_2000_WACCM...

Greetings,

I configured CESM1_1 for the F_2000_WACCM compset on an Intel Linux cluster (similar to, but smaller than, pleiades_wes). It compiles OK, but the run aborts soon after it is initiated, producing atm, ccsm, and cpl log files. An excerpt of the *.log files for an 8-PE case is reproduced below:

Does this offer any clues as to what to look for next? Thank you!

rotor[123]> tail -n 30 *log.140624-092340
==> atm.log.140624-092340 ccsm.log.140624-092340 cpl.log.140624-092340
 

jedwards

CSEG and Liaisons
Staff member
It's telling you what to try next:

MPI has run out of internal datatype entries.
Please set the environment variable MPI_TYPE_MAX for additional space.
The current value of MPI_TYPE_MAX is 100000
MPI has run out of internal datatype entries.
Please set the environment variable MPI_TYPE_MAX for additional space.
The current value of MPI_TYPE_MAX is 100000
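On SGI MPT (which the mpiexec_mpt path in the log dump points to), the variable just needs to be present in the job environment before the launcher runs. A minimal sketch, assuming a batch script; the executable name is a placeholder, not taken from this thread:

```shell
# The log reports the current limit as 100000, so start by doubling it.
# MPI_TYPE_MAX is an SGI MPT environment variable; mpiexec_mpt propagates
# the environment of the launching shell to all ranks.
export MPI_TYPE_MAX=200000

# Then launch as usual (placeholder binary name; substitute the real one):
# mpiexec_mpt -np 8 ./cesm.exe
echo "MPI_TYPE_MAX=$MPI_TYPE_MAX"
```

If the variable is set but the run still reports the old value, check that the batch system actually forwards the login-shell environment to the compute nodes (e.g. PBS's qsub -V), since settings made interactively are not always propagated.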
 
I didn't think it would be that simple. For example, doubling MPI_TYPE_MAX yielded the same result (see below). Could the MPI-related message be incidental to some other issue? If I knew what output to expect next, for example, I might then be in a better position to troubleshoot. Thanks.

--------------------------------------
 TASK#  NAME
     0  n001
     1  n001
     2  n001
     3  n001
     4  n001
     5  n001
     6  n001
     7  n001
Opened existing file /rotor/data/gmodica/inputdata/atm/waccm/ic/f2000.e10r02.2deg.waccm.005.cam2.i.0017-01-01-00000.nc  65536
Opened existing file /rotor/data/gmodica/inputdata/atm/cam/topo/USGS-gtopo30_1.9x2.5_remap_c050602.nc  131072
Opened existing file /rotor/data/gmodica/inputdata/atm/waccm/lb/LBC_1765-2005_1.9x2.5_CMIP5_za_c111110.nc  196608
MPI has run out of internal datatype entries.
Please set the environment variable MPI_TYPE_MAX for additional space.
The current value of MPI_TYPE_MAX is 200000
MPI has run out of internal datatype entries.
Please set the environment variable MPI_TYPE_MAX for additional space.
The current value of MPI_TYPE_MAX is 200000
MPI: MPI_COMM_WORLD rank 1 has terminated without calling MPI_Finalize()
MPI: aborting job
/opt/sgi/mpt/mpt-2.02/bin/mpiexec_mpt: line 53: 28540 Killed                  $mpicmdline_prefix -f $paramfile
 

eaton

CSEG and Liaisons
I'd try doubling MPI_TYPE_MAX again. In my experience, a message that specific from the MPI library is reliable. I wouldn't be surprised, though, if a single node doesn't have enough memory to run a 2-deg WACCM configuration.
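As a quick sanity check on the memory concern, one can compare a compute node's physical memory against the run's expected footprint. A minimal sketch for a Linux node (run on the compute node itself, not the login node):

```shell
# Print the node's total physical memory; the per-node footprint of a
# 2-deg WACCM run must fit within this, leaving headroom for the OS.
grep MemTotal /proc/meminfo
```

If the total looks marginal, spreading the same task count across more nodes reduces per-node memory pressure without changing the decomposition.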
 
Dear jedwards and eaton,

Indeed, increasing MPI_TYPE_MAX to 500000 led to the run progressing considerably further before reaching a similar conclusion in the computation for CLM. I will try increasing again. Thanks.
 