When running a BHIST_f09g17 case with CESM2.1.4, the run gets stuck for many minutes in the early atmosphere initialization stage. The log output halts around species setup and block redistribution, and the model does not proceed past creating gsmap_cx for atm.
No apparent error messages are printed — the program simply appears to stall.
The configuration is:
NTASKS: ['CPL:448', 'ATM:448', 'LND:224', 'ICE:224', 'OCN:56', 'ROF:224', 'GLC:448', 'WAV:448', 'ESP:1']
TOTALPES: 504
NTHRDS: ['CPL:1', 'ATM:1', 'LND:1', 'ICE:1', 'OCN:1', 'ROF:1', 'GLC:1', 'WAV:1', 'ESP:1']

The last species printed in the ATM log (or on screen) during initialization are:
39 so4_a3&IC kg/kg 32 I so4_a3
40 soa_a1&IC kg/kg 32 I soa_a1
41 soa_a2&IC kg/kg 32 I soa_a2
CESM output shows:
1 IMOD, NAPROC, NBLKRS, NSPEC, RSBLKS = 1 448 0 600 0
2 IMOD, NAPROC, NBLKRS, NSPEC, RSBLKS = 1 448 2 600 5
The cluster has 22 nodes with 56 cores each, and my case uses 9 nodes. The program stalls at the MCT coupling grid creation (creating gsmap_cx for atm), which happens specifically during ATM chemical initialization. No MPI or system errors are printed, and Slurm shows the job in the normal RUNNING state. I haven't modified the run configuration or the namelists; I only set the runtime to one month.
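As a sanity check on the PE layout, here is a minimal Python sketch of the arithmetic behind TOTALPES and the node count. The ROOTPE values are not shown in the output I pasted, so the sketch assumes OCN starts at PE 448 and every other component starts at PE 0; that is only a guess that happens to reproduce TOTALPES = 504.

```python
import math

# Task counts copied from the NTASKS line in this post.
ntasks = {
    "CPL": 448, "ATM": 448, "LND": 224, "ICE": 224, "OCN": 56,
    "ROF": 224, "GLC": 448, "WAV": 448, "ESP": 1,
}

# ASSUMPTION (not shown in the post): OCN runs on its own PEs after the
# other components, i.e. ROOTPE_OCN = 448 and ROOTPE = 0 for the rest.
rootpe = {name: 0 for name in ntasks}
rootpe["OCN"] = 448

cores_per_node = 56  # from the cluster description

# With one thread per task, the total PE count is the highest PE index
# used by any component, plus one.
total_pes = max(rootpe[name] + ntasks[name] for name in ntasks)
nodes = math.ceil(total_pes / cores_per_node)

print(f"TOTALPES = {total_pes}, nodes needed = {nodes}")
```

Under that assumption the layout fills exactly 9 nodes, which matches what the job requests, so the node count itself does not look inconsistent with the configuration.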
What do I need to set up to make it run successfully?