Hello, I am porting CESM2 (2.1.3, cime5.6) to a new platform; the main difference is that the job scheduler is SGE. After some success with config_batch.xml, I have found that the CESM run fails. The build should be using Intel 19.0.6 and Intel MPI, and each node has 40 cores. I was running the smoke test SMS.f19_g17.X.arc4_intel.20210809_170336_jrngry, and I also tried scripts_regression_tests.py. I wonder if I should try the OpenMPI or MVAPICH2 installations that are also available.
In the "run" directory there is a log:
[earmgr@login2.arc4 run]$ more cesm.log.210809-172625
Invalid PIO rearranger comm max pend req (comp2io), 0
Resetting PIO rearranger comm max pend req (comp2io) to 64
PIO rearranger options:
comm type =
p2p
comm fcd =
2denable
max pend req (comp2io) = 0
enable_hs (comp2io) = T
enable_isend (comp2io) = F
max pend req (io2comp) = 64
enable_hs (io2comp) = F
enable_isend (io2comp) = T
(seq_comm_setcomm) init ID ( 1 GLOBAL ) pelist = 0 0 1 ( npes = 1) ( nthreads = 1)( suffix =)
Abort(537497356) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Group_range_incl: Invalid argument, error stack:
PMPI_Group_range_incl(200)........: MPI_Group_range_incl(group=0x88000006, n=1, ranges=0x117c900, new_group=0x7fffe35d1214) failed
MPIR_Group_check_valid_ranges(331): The 0th element of a range array ends at 39 but must be nonnegative and less than 1
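One way to read the last line of that error stack, sketched in Python. This only illustrates the rule the message itself states (every rank in a range must be nonnegative and less than the group size); it is not Intel MPI's actual code, and `check_range` is a hypothetical helper. The log reports npes = 1 while the range ends at 39, which suggests the job was started with a single MPI process even though the model expected 40 tasks.

```python
def check_range(first, last, stride, group_size):
    """Illustrative check: both ends of the range must be valid ranks,
    i.e. 0 <= rank < group_size. Hypothetical helper, not an MPI call."""
    if stride == 0:
        return "stride must be nonzero"
    for name, value in (("starts", first), ("ends", last)):
        if not 0 <= value < group_size:
            return (f"range {name} at {value} but must be nonnegative "
                    f"and less than {group_size}")
    return "ok"


# Case from the log: ranks 0..39 requested, but the group holds only 1 process.
print(check_range(0, 39, 1, 1))

# Same request when 40 processes are actually running.
print(check_range(0, 39, 1, 40))
```

If this reading is right, the launch command generated for the SGE job (the number of tasks passed to the MPI launcher) would be the first thing to check, before switching MPI libraries.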