Hello.
I am using CESM 2.1.5 and am having difficulty getting cesm.exe to run on the machine I have ported it to. I have not made any code or namelist changes.
I created the case with:
./create_newcase --case f2000climo_testcase --compset F2000climo --res f09_f09_mg17 --mach lake
I have set NTASKS=32 and NTASKS_ESP=1. This is based on other threads I have read, which say the number of tasks needs to be less than the total number of cores; I tried it to see if it resolved the issue, but it did not. I am running on 8 nodes with 32 cores each, and I made sure to set max tasks per core to 32 in my machine config file (attached).
I ran case setup, previewed the namelists, checked the input data with the download option, and then built the case. The code builds successfully without errors.
I can successfully submit the job to our PBS queuing system (file attached). The job runs on a cluster of Intel Xeon Gold processors, and the PBS job script has the following execution line:
mpirun --report-bindings --bind-to core --map-by socket:PE=1 -n 256 -N 32 bld/cesm.exe
The job runs and uses all of the requested 256 cores, but no log files are generated in the run directory and cesm.exe just runs endlessly (I have let it run for up to 10 hours and it does not finish). No errors appear in the PBS log file that is generated when the job is accepted and run by the queuing system.
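To make the layout concrete, here is a small sketch of the arithmetic using only the numbers quoted in this post. The variable names are illustrative and are not CESM or PBS settings. It shows that the mpirun line launches 256 ranks while the case was configured with NTASKS=32; I do not know whether that difference is related to the hang.

```shell
#!/bin/sh
# Values copied from this post; the variable names are illustrative only.
nodes=8              # nodes requested
cores_per_node=32    # cores on each node
ntasks=32            # NTASKS as set in the case
mpirun_n=256         # value passed to mpirun -n
mpirun_ppn=32        # value passed to mpirun -N

total_cores=$((nodes * cores_per_node))
echo "total cores: $total_cores"

# Does the rank count on the mpirun line fill the allocation exactly?
if [ "$mpirun_n" -eq "$total_cores" ]; then
    echo "mpirun -n matches nodes x cores per node"
else
    echo "mpirun -n does not match nodes x cores per node"
fi

# Does the rank count on the mpirun line agree with NTASKS in the case?
if [ "$ntasks" -eq "$mpirun_n" ]; then
    layout_status="consistent"
else
    layout_status="mismatch"
fi
echo "NTASKS vs mpirun -n: $layout_status"
```

With the values above, the allocation is fully used by the mpirun line, but the rank count differs from the NTASKS value set in the case.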
If there are build steps I am missing or porting steps I have omitted, please suggest checks and steps to follow. Any help or support is greatly appreciated.
Thank you